Overview
Qwen3-4B-Diversity is a 4-billion-parameter language model developed by hadadxyz, fine-tuned from the Qwen/Qwen3-4B base model. It was trained with parameter-efficient supervised fine-tuning for 2 epochs, taking approximately 17 hours on a single A100-80GB GPU. The training data comprises a diverse collection of over 24,000 high-quality reasoning examples distilled from various advanced AI systems, including Kimi K2.5, Qwen3.5, Claude Opus 4.6, Gemini 3 Pro, GPT-5.2, GLM-4.7, GLM-5, DeepSeek V3.2-Speciale, and GPT-5 Codex.
Key Capabilities
- Advanced Reasoning: Excels at breaking complex problems into steps and showing its reasoning process in detail.
- Mathematical Problem Solving: Enhanced mathematical reasoning, thanks to dedicated math-focused training data.
- Code Generation and Understanding: Improved coding abilities, benefiting from multiple code-reasoning datasets.
- Multi-Turn Conversations: Handles extended dialogues more reliably and maintains context across turns.
- Domain Versatility: Flexible across domains and task types, as it integrates reasoning patterns distilled from several different AI systems.
Good For
- Applications requiring strong logical deduction and problem-solving.
- Tasks involving mathematical calculations and proofs.
- Code generation, debugging, and understanding.
- Building conversational agents that can maintain context over long interactions.
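As a starting point, the model can be run with the standard Hugging Face transformers chat workflow. This is a minimal sketch, not the author's official usage example: the repository id "hadadxyz/Qwen3-4B-Diversity" is assumed from the author and model names above, and the prompt and generation settings are illustrative only.

```python
def build_messages(user_prompt: str) -> list[dict]:
    """Build a single-turn chat in the message format that
    tokenizer.apply_chat_template() expects."""
    return [{"role": "user", "content": user_prompt}]


if __name__ == "__main__":
    # transformers is imported lazily so the helper above stays dependency-free.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    model_id = "hadadxyz/Qwen3-4B-Diversity"  # assumed repo id, adjust as needed
    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(
        model_id, torch_dtype="auto", device_map="auto"
    )

    # Illustrative prompt playing to the model's math-reasoning strengths.
    messages = build_messages("Prove that the sum of two even integers is even.")
    inputs = tokenizer.apply_chat_template(
        messages, add_generation_prompt=True, return_tensors="pt"
    ).to(model.device)
    output = model.generate(inputs, max_new_tokens=512)
    # Decode only the newly generated tokens, skipping the prompt.
    print(tokenizer.decode(output[0][inputs.shape[-1]:], skip_special_tokens=True))
```

For multi-turn use, append each assistant reply and the next user message to the list returned by `build_messages` before re-applying the chat template, so the model sees the full conversation history.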