beyoru/EvolLLM
beyoru/EvolLLM is a 4 billion parameter merged language model based on Qwen3-4B-Instruct-2507 and Qwen3-4B-Thinking-2507. Developed by Beyoru, this model is designed as an instruct model, not a reasoning model, and serves as a strong foundation for Supervised Fine-Tuning (SFT) or Generative Reinforcement Learning from Human Feedback (GRPO) training. It shows a 3% improvement over its base instruct model in agent benchmarks and surpasses openfree/Darwin-Qwen3-4B and the base model in ACEBench.
Loading preview...
EvolLLM: A Merged Qwen3-4B Instruct Model
beyoru/EvolLLM is a 4 billion parameter language model created by merging two specialized Qwen3-4B base models: Qwen/Qwen3-4B-Instruct-2507 and Qwen/Qwen3-4B-Thinking-2507. This unique combination aims to leverage the strengths of both instruction-following and thinking-oriented architectures, primarily functioning as an instruct model rather than a dedicated reasoning model.
Key Capabilities
- Instruction Following: Designed specifically for instruction-based tasks, making it suitable for applications requiring direct command execution.
- Strong Foundation for Fine-tuning: Serves as an excellent starting point for further Supervised Fine-Tuning (SFT) or Generative Reinforcement Learning from Human Feedback (GRPO) training.
- Improved Performance: Demonstrates a 3% improvement over its base instruct model in agent benchmarks and outperforms
openfree/Darwin-Qwen3-4B(another evolution model) and the base model in ACEBench evaluations.
Good for
- Developers looking for a robust 4B parameter instruct model.
- Projects requiring a solid base for custom fine-tuning with SFT or GRPO.
- Applications where instruction adherence and general task execution are primary requirements.