Jackrong/Qwen3.5-4B-Neo

Vision | Concurrency Cost: 1 | Model Size: 4.5B | Quant: BF16 | Ctx Length: 32k | Tool Calling: Supported | Published: Mar 22, 2026 | License: apache-2.0 | Architecture: Transformer | Open Weights | Cold

Jackrong/Qwen3.5-4B-Neo is a 4.5 billion parameter language model, fine-tuned from Qwen3.5-4B and optimized for efficient, concise reasoning. It achieves 82.00% pass@1 on a 250-question MMLU-Pro subset, compared with 80.40% for the base model, while producing significantly shorter reasoning chains. It targets analytical tasks, coding, competitive programming, and prompts that depend on complex logic.

Qwen3.5-4B-Neo: Efficient Reasoning Model

Jackrong/Qwen3.5-4B-Neo is a 4.5 billion parameter model, fine-tuned from Qwen3.5-4B, with a primary focus on making reasoning more efficient and concise. It was developed using Supervised Fine-Tuning (SFT) with LoRA and a response-only training approach: the mask is keyed on the <|im_start|>assistant\n<think> marker, so the training signal comes from the assistant's reasoning and answer rather than from the prompt.
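The response-only masking described above can be illustrated with a minimal sketch. It is not the author's training code: the function names and the toy token ids are hypothetical, only the first marker occurrence is handled, and whether the marker tokens themselves are masked is an assumption. The value -100 is the default ignore index of PyTorch's cross-entropy loss.

```python
IGNORE_INDEX = -100  # label value skipped by PyTorch's cross-entropy loss by default


def find_subsequence(haystack, needle):
    """Return the start index of the first occurrence of needle, or -1."""
    n = len(needle)
    for i in range(len(haystack) - n + 1):
        if haystack[i:i + n] == needle:
            return i
    return -1


def mask_prompt_tokens(input_ids, marker_ids):
    """Build labels that ignore everything up to and including the marker.

    Assumption: the marker stands for the tokenized form of
    "<|im_start|>assistant\n<think>"; real ids depend on the tokenizer.
    """
    start = find_subsequence(input_ids, marker_ids)
    if start == -1:
        # No assistant turn found: nothing contributes to the loss.
        return [IGNORE_INDEX] * len(input_ids)
    cut = start + len(marker_ids)
    return [IGNORE_INDEX] * cut + list(input_ids[cut:])


# Toy example with made-up token ids.
prompt = [11, 12, 13]
marker = [90, 91, 92]
response = [21, 22, 23, 24]
labels = mask_prompt_tokens(prompt + marker + response, marker)
```

In this toy example only the four response tokens keep their ids as labels; the prompt and marker positions are set to the ignore index and do not contribute to the loss.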

Key Capabilities & Differentiators

  • Optimized Reasoning: Achieves 82.00% pass@1 on a 250-question MMLU-Pro subset, outperforming the base Qwen3.5-4B (80.40%).
  • Concise Thought Chains: Reduces average think-chain length from 6,962 to 3,955 characters and median length from 4,600 to 1,951 characters, leading to higher efficiency (2.31 correct solutions per 10k think characters vs. 1.03 for base).
  • Structured Thinking: Conditioned to explicitly structure its reasoning within <think>...</think> tags, promoting methodical problem-solving without repetitive thoughts.
  • Specialized Training Data: Trained on high-quality, filtered reasoning distillation data, including stepfun-ai/Step-3.5-Flash-SFT and a custom dataset, Jackrong/Competitive-Programming-python-blend, covering competitive programming and logic.
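Metrics of the kind listed above (pass@1 and think-chain length in characters) could be computed along the following lines. This is a sketch under assumptions: it expects each output string to contain both the opening and closing think tags, it strips surrounding whitespace before counting, and the helper names are hypothetical. The source does not state how its lengths or its efficiency figure were measured, so this is not guaranteed to reproduce the reported numbers.

```python
import re
from statistics import mean, median

# Matches the first closed <think>...</think> block, across newlines.
THINK_RE = re.compile(r"<think>(.*?)</think>", re.DOTALL)


def split_reasoning(text):
    """Return (reasoning, answer); reasoning is '' if no closed think block exists."""
    match = THINK_RE.search(text)
    if match is None:
        return "", text.strip()
    return match.group(1).strip(), text[match.end():].strip()


def summarize(outputs, is_correct):
    """Compute pass@1 (one sample per question) and think-chain length statistics."""
    lengths = [len(split_reasoning(o)[0]) for o in outputs]
    return {
        "pass_at_1": sum(is_correct) / len(is_correct),
        "mean_think_chars": mean(lengths),
        "median_think_chars": median(lengths),
    }


reasoning, answer = split_reasoning("<think>2+2 is 4</think>\nAnswer: 4")
stats = summarize(["<think>abcd</think>A", "<think>ab</think>B"], [True, False])
```

With one sample per question, pass@1 reduces to plain accuracy: 205 correct answers out of 250 gives the 82.00% quoted for this model.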

Intended Use Cases

This model is best suited for:

  • Offline analytical tasks requiring transparent AI logic.
  • Coding and competitive programming.
  • Mathematics and heavy logic-dependent prompting.

Limitations

  • Hallucination Risk: Like other autoregressive LLMs, it may occasionally hallucinate external facts.
  • Context Boundaries: On extremely complex problems, the model can fall into circular reasoning that runs past the available context or generation limit, so the output is truncated before a final answer appears.
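A caller may want to detect the truncation case before trusting an answer. The check below is a minimal sketch with a hypothetical function name; the think_prefilled flag covers the assumed case where the chat template already places the opening tag in the prompt, so only the closing tag appears in the generated text.

```python
def is_truncated_reasoning(text, think_prefilled=False):
    """Return True if the reasoning block was opened but never closed.

    think_prefilled=True assumes the opening <think> tag was part of the
    prompt, so the generated text is expected to contain only </think>.
    """
    if think_prefilled:
        return "</think>" not in text
    last_open = text.rfind("<think>")
    last_close = text.rfind("</think>")
    # Truncated when the most recent opening tag has no closing tag after it.
    return last_open != -1 and last_close < last_open


flag_open = is_truncated_reasoning("<think>step 1, step 1 again, step 1 again")
flag_closed = is_truncated_reasoning("<think>short check</think>42")
```

When the check returns True, reasonable responses include retrying with a larger generation budget or treating the question as unanswered rather than parsing a partial reasoning trace.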

This model is a test version intended for academic research and technical exploration.