beyoru/EvolLLM

Hugging Face
TEXT GENERATIONConcurrency Cost:1Model Size:4BQuant:BF16Ctx Length:32kPublished:Oct 3, 2025Architecture:Transformer0.0K Warm

beyoru/EvolLLM is a 4 billion parameter merged language model based on Qwen3-4B-Instruct-2507 and Qwen3-4B-Thinking-2507. Developed by Beyoru, this model is designed as an instruct model, not a reasoning model, and serves as a strong foundation for Supervised Fine-Tuning (SFT) or Generative Reinforcement Learning from Human Feedback (GRPO) training. It shows a 3% improvement over its base instruct model in agent benchmarks and surpasses openfree/Darwin-Qwen3-4B and the base model in ACEBench.

Loading preview...

EvolLLM: A Merged Qwen3-4B Instruct Model

beyoru/EvolLLM is a 4 billion parameter language model created by merging two specialized Qwen3-4B base models: Qwen/Qwen3-4B-Instruct-2507 and Qwen/Qwen3-4B-Thinking-2507. This unique combination aims to leverage the strengths of both instruction-following and thinking-oriented architectures, primarily functioning as an instruct model rather than a dedicated reasoning model.

Key Capabilities

  • Instruction Following: Designed specifically for instruction-based tasks, making it suitable for applications requiring direct command execution.
  • Strong Foundation for Fine-tuning: Serves as an excellent starting point for further Supervised Fine-Tuning (SFT) or Generative Reinforcement Learning from Human Feedback (GRPO) training.
  • Improved Performance: Demonstrates a 3% improvement over its base instruct model in agent benchmarks and outperforms openfree/Darwin-Qwen3-4B (another evolution model) and the base model in ACEBench evaluations.

Good for

  • Developers looking for a robust 4B parameter instruct model.
  • Projects requiring a solid base for custom fine-tuning with SFT or GRPO.
  • Applications where instruction adherence and general task execution are primary requirements.