taki555/Qwen3-4B-Thinking-2507-Art

TEXT GENERATIONConcurrency Cost:1Model Size:4BQuant:BF16Ctx Length:32kTool Calling:SupportedPublished:Mar 4, 2026License:apache-2.0Architecture:Transformer Open Weights Cold

taki555/Qwen3-4B-Thinking-2507-Art is a 4 billion parameter Qwen3-based causal language model, derived from Qwen3-4B-Thinking-2507. It is specifically optimized for efficient Chain-of-Thought (CoT) reasoning, aiming to provide accurate thinking trajectories with reduced computational overhead. This model excels at maintaining high performance across varying token budgets, making it suitable for reasoning tasks where efficiency is critical.

Loading preview...

Art-Qwen3-4B-Thinking-2507: Efficient Reasoning with Qwen3-4B

This model, Art-Qwen3-4B-Thinking-2507, is a specialized 4 billion parameter variant of the Qwen3-4B-Thinking-2507 model. Its core innovation lies in addressing the computational overhead typically associated with Chain-of-Thought (CoT) reasoning in Large Language Models (LLMs), as detailed in the paper "The Art of Efficient Reasoning: Data, Reward, and Optimization".

Key Capabilities

  • Efficient CoT Reasoning: Optimized to generate short yet accurate thinking trajectories, reducing computational costs while preserving reasoning quality.
  • Two-Stage Training Paradigm: Utilizes length adaptation and reasoning refinement to achieve its efficiency goals.
  • Reward-Shaped Optimization: Employs Reinforcement Learning (RL) with reward shaping to ensure high performance across diverse token budgets, specifically designed to avoid sacrificing accuracy for brevity.
  • Dataset: Trained on the DeepScaleR-Easy dataset, which incentivizes concise and precise reasoning.

Good For

  • Applications requiring efficient and accurate Chain-of-Thought reasoning.
  • Scenarios where computational resources or token budgets are constrained but high reasoning performance is still necessary.
  • Tasks benefiting from optimized thinking trajectories that balance brevity with correctness.