EvolLLM: A Merged Qwen3-4B Instruct Model
beyoru/EvolLLM is a 4 billion parameter language model developed by Beyoru, created through a strategic merge of two Qwen3-4B base models: Qwen/Qwen3-4B-Instruct-2507 and Qwen/Qwen3-4B-Thinking-2507. This unique combination aims to leverage the strengths of both instruction-tuned and 'thinking' variants of the Qwen3 architecture.
Key Characteristics & Performance
- Merged Architecture: Combines instruction-following capabilities with elements from a 'thinking' model, offering a balanced foundation.
- Instruction-Oriented: Primarily designed as an instruct model, suitable for tasks requiring direct instruction adherence rather than complex reasoning.
- Evaluation: EvolLLM does not significantly surpass comparable instruct models on agent benchmarks (roughly a 3% improvement), but it outperforms openfree/Darwin-Qwen3-4B and its own base models in ACEBench evaluations.
- Context Length: Supports a substantial context window of 40,960 tokens.
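As a merged Qwen3-4B variant, the model should work with the standard Hugging Face `transformers` chat workflow. A minimal sketch, assuming the usual `apply_chat_template` pattern (the prompt and generation settings are illustrative, not published recommendations):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_ID = "beyoru/EvolLLM"

def build_messages(user_prompt: str) -> list[dict]:
    # Chat-message structure consumed by tokenizer.apply_chat_template
    return [{"role": "user", "content": user_prompt}]

if __name__ == "__main__":
    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(
        MODEL_ID, torch_dtype="auto", device_map="auto"
    )

    messages = build_messages("Explain model merging in two sentences.")
    inputs = tokenizer.apply_chat_template(
        messages, add_generation_prompt=True, return_tensors="pt"
    ).to(model.device)

    # max_new_tokens is an illustrative choice; prompt plus output must
    # fit within the model's 40,960-token context window.
    output = model.generate(inputs, max_new_tokens=256)
    print(tokenizer.decode(output[0][inputs.shape[-1]:], skip_special_tokens=True))
```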
Ideal Use Cases
- Foundation for Fine-tuning: This model is explicitly noted as an excellent starting point for further Supervised Fine-Tuning (SFT) or Group Relative Policy Optimization (GRPO) training.
- Instruction-Following Applications: Suitable for applications where clear instructions need to be followed, benefiting from its instruct model design.
- Experimental Merging: Offers a case study in merging differently specialized base models (an instruct variant and a reasoning variant) into a single checkpoint.
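The exact merge recipe is not published on the card. For illustration only, a simple linear merge of the two base checkpoints could be expressed as a mergekit config along these lines (the method and weights are assumptions, not the actual EvolLLM settings):

```yaml
# Hypothetical mergekit recipe -- the real EvolLLM merge settings
# are not published; merge_method and weights here are illustrative.
models:
  - model: Qwen/Qwen3-4B-Instruct-2507
    parameters:
      weight: 0.5
  - model: Qwen/Qwen3-4B-Thinking-2507
    parameters:
      weight: 0.5
merge_method: linear
dtype: bfloat16
```

Running `mergekit-yaml config.yml ./merged` with a file like this produces a merged checkpoint that can then be evaluated or used as a base for SFT/GRPO.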