Model Overview
rikunarita/Qwen3-4B-Thinking-2507-Genius-Coder is a 4.0-billion-parameter causal language model, fine-tuned from the Qwen3-4B-Thinking-2507 base model. It was trained with Unsloth and Hugging Face's TRL library on the TeichAI/gpt-5.1-codex-max-1000x dataset, yielding a model optimized for complex problem-solving.
Key Enhancements & Capabilities
This model features several significant improvements over its predecessor:
- Enhanced Reasoning: Demonstrates significantly improved performance on tasks requiring logical reasoning, mathematics, science, and academic benchmarks.
- Coding Prowess: Shows marked improvements in coding benchmarks like LiveCodeBench and CFEval.
- Extended Context: Supports a native context length of 262,144 tokens (256K), crucial for handling highly complex reasoning tasks.
- Agentic Abilities: Excels in tool calling capabilities, with recommendations to use Qwen-Agent for optimal performance.
- General Capabilities: Better instruction following, tool usage, and text generation.
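Thinking-series Qwen3 models emit their reasoning before the final answer, delimited by a closing `</think>` tag in the decoded output. A minimal sketch of separating the two, assuming the raw completion contains that marker (the helper name `split_thinking` is illustrative, not part of any library):

```python
def split_thinking(raw: str) -> tuple[str, str]:
    """Split a decoded completion into (reasoning, answer).

    Assumes the model wraps its chain of thought in <think>...</think>;
    if no closing tag is present, the whole string is treated as the answer.
    """
    marker = "</think>"
    idx = raw.find(marker)
    if idx == -1:
        return "", raw.strip()
    reasoning = raw[:idx].replace("<think>", "", 1).strip()
    answer = raw[idx + len(marker):].strip()
    return reasoning, answer


raw = "<think>Check n=3: it has no divisors besides 1 and 3.</think>Yes, 3 is prime."
reasoning, answer = split_thinking(raw)
print(answer)  # → Yes, 3 is prime.
```

Discarding the reasoning segment before logging or displaying results keeps downstream consumers focused on the final answer while the extended "thinking length" still benefits accuracy.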
Performance Highlights
The model shows strong performance across various categories, with notable scores in:
- Reasoning: AIME25 (81.3), HMMT25 (55.5)
- Coding: LiveCodeBench v6 (55.2), CFEval (1852)
- Alignment: IFEval (87.4), Creative Writing v3 (75.6)
Recommended Use Cases
This model is particularly well-suited for:
- Highly Complex Reasoning Tasks: Its increased "thinking length" and improved reasoning capabilities make it ideal for problems requiring deep analytical thought.
- Code Generation and Analysis: Strong performance in coding benchmarks suggests its utility for developer-centric applications.
- Agentic Applications: Designed to excel with tool-calling frameworks like Qwen-Agent, enabling sophisticated automated workflows.
- Long-Context Understanding: Its 256K context window is beneficial for processing and generating extensive documents or complex multi-turn conversations.
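For long-context workloads, a practical pre-flight step is checking whether an input plausibly fits in the 262,144-token window before sending it. The sketch below uses a crude ~4-characters-per-token heuristic; that ratio is an assumption for illustration, and an exact count requires the model's own tokenizer:

```python
CONTEXT_LIMIT = 262_144   # native context length of this model
CHARS_PER_TOKEN = 4       # rough heuristic; the real ratio varies by text


def fits_in_context(text: str, reserve_for_output: int = 8_192) -> bool:
    """Crude pre-flight check: estimate tokens from character length and
    compare against the context window minus a reserve for the response."""
    est_tokens = len(text) // CHARS_PER_TOKEN + 1
    return est_tokens <= CONTEXT_LIMIT - reserve_for_output


print(fits_in_context("a short prompt"))  # → True
```

Reserving headroom for the generated output matters here because thinking-mode models consume part of the window with their reasoning tokens before producing the final answer.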