Jackrong/Qwen3.5-4B-Neo
Jackrong/Qwen3.5-4B-Neo is a 4.5 billion parameter language model, fine-tuned from Qwen3.5-4B, specifically optimized for efficient and concise reasoning. It achieves 82.00% pass@1 on a 250-question MMLU-Pro subset, demonstrating improved accuracy and significantly shorter reasoning chain lengths compared to its base model. This model excels in analytical tasks, coding, competitive programming, and complex logic-dependent prompting.
Qwen3.5-4B-Neo: Efficient Reasoning Model
Jackrong/Qwen3.5-4B-Neo is a 4.5 billion parameter model, fine-tuned from Qwen3.5-4B, with a primary focus on making reasoning more efficient and concise. It was trained with Supervised Fine-Tuning (SFT) and LoRA using a response-only scheme that masks on <|im_start|>assistant\n<think>, so the training signal comes from the assistant's reasoning and answer rather than from the prompt.
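Response-only training can be illustrated with a small label-masking sketch. This is not the author's training code: the token IDs are toy values, the helper names find_subsequence and mask_prompt_labels are hypothetical, and it assumes the marker tokens themselves are excluded from the loss, which the card does not specify. The only convention taken as given is that -100 is the default label value ignored by PyTorch's cross-entropy loss.

```python
from typing import List

# Label value that PyTorch's cross-entropy loss skips by default.
IGNORE_INDEX = -100


def find_subsequence(seq: List[int], pattern: List[int]) -> int:
    """Return the index just past the first occurrence of pattern in seq, or -1."""
    n, m = len(seq), len(pattern)
    for i in range(n - m + 1):
        if seq[i:i + m] == pattern:
            return i + m
    return -1


def mask_prompt_labels(input_ids: List[int], response_marker: List[int]) -> List[int]:
    """Build labels that carry loss only on tokens after the response marker.

    Assumption: the marker itself (e.g. the tokens of the assistant header)
    is masked too. If the marker is absent, the whole example is masked.
    """
    start = find_subsequence(input_ids, response_marker)
    if start == -1:
        return [IGNORE_INDEX] * len(input_ids)
    return [IGNORE_INDEX] * start + list(input_ids[start:])


# Toy example: 11-13 stand in for the prompt, [90, 91] for the assistant
# marker, and 21-23 for the response tokens.
input_ids = [11, 12, 13, 90, 91, 21, 22, 23]
marker = [90, 91]
labels = mask_prompt_labels(input_ids, marker)
print(labels)
```

In a real pipeline the marker IDs would come from tokenizing the marker string with the model's tokenizer, and the resulting labels would be passed alongside input_ids to the trainer.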
Key Capabilities & Differentiators
- Optimized Reasoning: Achieves 82.00% pass@1 on a 250-question MMLU-Pro subset, outperforming the base Qwen3.5-4B (80.40%).
- Concise Thought Chains: Reduces average think-chain length from 6,962 to 3,955 characters and median length from 4,600 to 1,951 characters, leading to higher efficiency (2.31 correct solutions per 10k think characters vs. 1.03 for base).
- Structured Thinking: Conditioned to explicitly structure its reasoning within <think>...</think> tags, promoting methodical problem-solving without repetitive thoughts.
- Specialized Training Data: Trained on high-quality, filtered reasoning distillation data, including stepfun-ai/Step-3.5-Flash-SFT and a custom Jackrong/Competitive-Programming-python-blend for competitive programming and logic.
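The think-chain figures above depend on separating the reasoning inside the think tags from the final answer. The sketch below shows one way to do that and to compute length statistics over a batch of outputs. The function names and toy records are hypothetical, and the card does not say how the "correct solutions per 10k think characters" ratio is aggregated, so the formula used here (total correct answers divided by total think characters) is an assumption and is not expected to reproduce the reported numbers.

```python
import re
from statistics import mean, median
from typing import Dict, List, Tuple

# Non-greedy match of the first <think>...</think> block, across newlines.
THINK_RE = re.compile(r"<think>(.*?)</think>", re.DOTALL)


def split_reasoning(text: str) -> Tuple[str, str]:
    """Return (reasoning, answer); reasoning is empty if no closed block exists."""
    match = THINK_RE.search(text)
    if match is None:
        return "", text.strip()
    return match.group(1).strip(), text[match.end():].strip()


def think_stats(records: List[Tuple[str, bool]]) -> Dict[str, float]:
    """Summarize a list of (model_output, is_correct) pairs."""
    lengths = [len(split_reasoning(text)[0]) for text, _ in records]
    n_correct = sum(1 for _, ok in records if ok)
    total_chars = sum(lengths)
    return {
        "pass_at_1": n_correct / len(records),
        "mean_think_chars": mean(lengths),
        "median_think_chars": median(lengths),
        # Assumed definition: correct answers per 10,000 reasoning characters.
        "correct_per_10k_chars": (
            10_000 * n_correct / total_chars if total_chars else float("inf")
        ),
    }


# Toy records, not real model outputs.
records = [
    ("<think>2+2=4</think>4", True),
    ("<think>guess</think>7", False),
    ("<think>3*3=9, so 9</think>9", True),
]
stats = think_stats(records)
print(stats)
```

Measuring in characters rather than tokens mirrors the units used in the figures above and keeps the sketch independent of any tokenizer.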
Intended Use Cases
This model is best suited for:
- Offline analytical tasks requiring transparent AI logic.
- Coding and competitive programming.
- Mathematics and heavy logic-dependent prompting.
Limitations
- Hallucination Risk: Like other autoregressive LLMs, it may occasionally hallucinate external facts.
- Context Boundaries: On extremely complex logic, the model can sometimes fall into circular thinking that runs long enough for the output to be truncated.
This model is a test version intended for academic research and technical exploration.