rikunarita/Qwen3-4B-Thinking-2507-Genius-Coder
rikunarita/Qwen3-4B-Thinking-2507-Genius-Coder is a 4-billion-parameter causal language model, fine-tuned by rikunarita from Qwen3-4B-Thinking-2507 on the TeichAI/gpt-5.1-codex-max-1000x dataset. It is tuned for complex reasoning, coding, and agentic use, with a native context length of 262,144 tokens, and shows improved performance across logical reasoning, mathematics, science, and coding benchmarks, making it suitable for applications requiring deep analytical capabilities.
Model Overview
rikunarita/Qwen3-4B-Thinking-2507-Genius-Coder is a 4-billion-parameter causal language model fine-tuned from the Qwen3-4B-Thinking-2507 base model. It was trained with Unsloth and Hugging Face's TRL library on the TeichAI/gpt-5.1-codex-max-1000x dataset, resulting in a model optimized for complex problem-solving.
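A minimal loading-and-generation sketch with Hugging Face Transformers might look like the following. The model id comes from this card; the generation settings are illustrative assumptions, not values recommended by the author:

```python
def generate_reply(prompt: str, max_new_tokens: int = 2048) -> str:
    """Generate a reply from the model for a single-turn chat prompt (sketch)."""
    # Heavy imports kept inside the function so the sketch can be read and the
    # function defined without torch/transformers installed.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    model_id = "rikunarita/Qwen3-4B-Thinking-2507-Genius-Coder"
    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(
        model_id,
        torch_dtype="auto",  # let transformers pick bf16/fp16 where available
        device_map="auto",   # place the model automatically on available devices
    )

    # Apply the Qwen3 chat template; the thinking variant reasons before answering.
    messages = [{"role": "user", "content": prompt}]
    inputs = tokenizer.apply_chat_template(
        messages,
        add_generation_prompt=True,
        return_tensors="pt",
    ).to(model.device)

    output_ids = model.generate(inputs, max_new_tokens=max_new_tokens)
    # Strip the prompt tokens and decode only the newly generated text.
    return tokenizer.decode(output_ids[0][inputs.shape[-1]:], skip_special_tokens=True)
```

At 4B parameters, bf16 inference typically needs on the order of 10 GB of accelerator memory; quantized variants would reduce that.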
Key Enhancements & Capabilities
This model features several significant improvements over its predecessor:
- Enhanced Reasoning: Significantly improved performance on logical reasoning, mathematics, science, and academic benchmarks.
- Coding Prowess: Shows marked improvements in coding benchmarks like LiveCodeBench and CFEval.
- Extended Context: Supports a native context length of 262,144 tokens, crucial for long, highly complex reasoning traces.
- Agentic Abilities: Strong tool-calling capabilities; Qwen-Agent is recommended for optimal agentic performance.
- General Capabilities: Better instruction following, tool usage, and text generation.
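Thinking-mode models interleave a reasoning trace with the final answer; in Qwen3's 2507 thinking variants the trace is closed by a `</think>` marker, and the opening tag is typically added by the chat template, so decoded output may contain only the closing one. A small hypothetical helper for separating the two:

```python
def split_thinking(output: str, marker: str = "</think>") -> tuple[str, str]:
    """Split decoded output into (reasoning_trace, final_answer).

    If no marker is present, the whole output is treated as the answer.
    """
    head, sep, tail = output.rpartition(marker)
    if not sep:  # marker not found: rpartition puts everything in `tail`
        return "", output.strip()
    return head.strip(), tail.strip()
```

Using `rpartition` splits on the last marker occurrence, which is the safe choice if the reasoning trace itself quotes the marker.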
Performance Highlights
The model shows strong performance across various categories, with notable scores in:
- Reasoning: AIME25 (81.3), HMMT25 (55.5)
- Coding: LiveCodeBench v6 (55.2), CFEval (1852)
- Alignment: IFEval (87.4), Creative Writing v3 (75.6)
Recommended Use Cases
This model is particularly well-suited for:
- Highly Complex Reasoning Tasks: Its increased "thinking length" and improved reasoning capabilities make it ideal for problems requiring deep analytical thought.
- Code Generation and Analysis: Strong performance in coding benchmarks suggests its utility for developer-centric applications.
- Agentic Applications: Designed to excel with tool-calling frameworks like Qwen-Agent, enabling sophisticated automated workflows.
- Long-Context Understanding: Its 256K-token (262,144) context window is beneficial for processing and generating extensive documents or long multi-turn conversations.
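As a sketch of the recommended agentic setup, the snippet below builds a Qwen-Agent `Assistant` with the framework's built-in code interpreter tool. The server URL and API key are assumptions for illustration; you would point the config at wherever this model is actually served (e.g. an OpenAI-compatible endpoint):

```python
def build_coder_agent(server_url: str = "http://localhost:8000/v1"):
    """Construct a Qwen-Agent Assistant backed by this model (sketch)."""
    # Lazy import so the sketch reads standalone; requires `pip install qwen-agent`.
    from qwen_agent.agents import Assistant

    llm_cfg = {
        # Assumed deployment: the model served behind an OpenAI-compatible API.
        "model": "rikunarita/Qwen3-4B-Thinking-2507-Genius-Coder",
        "model_server": server_url,
        "api_key": "EMPTY",
    }
    # 'code_interpreter' is one of Qwen-Agent's built-in tools.
    return Assistant(llm=llm_cfg, function_list=["code_interpreter"])


def run_once(agent, user_message: str):
    """Collect the agent's final response list for a single user turn."""
    messages = [{"role": "user", "content": user_message}]
    last = []
    for responses in agent.run(messages=messages):
        last = responses  # each yield is the full response list so far
    return last
```

`agent.run` streams intermediate states, so the loop keeps only the last yield; a real application might render each intermediate response instead.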