STEVENZHANG904/Qwen3-0.6B-planner-sft
STEVENZHANG904/Qwen3-0.6B-planner-sft is a 0.8 billion parameter model, SFT-finetuned from Qwen/Qwen3-0.6B. It is specifically designed to act as a 'planner' agent within a multi-agent system, distilled from traces of Qwen3-32B's planning behavior. This model excels at generating planning steps and reasoning blocks, making it suitable for task orchestration in AI pipelines.
Loading preview...
Model Overview
STEVENZHANG904/Qwen3-0.6B-planner-sft is a specialized 0.8 billion parameter language model, fine-tuned from the Qwen/Qwen3-0.6B base model. Its primary function is to serve as a planner agent within multi-agent systems, learning to emulate the planning capabilities of the larger Qwen3-32B model.
Key Capabilities
- Agentic Planning: Distilled to generate planning steps and reasoning blocks, specifically emitting
<think>...</think>structures, similar to its larger teacher model. - Task Orchestration: Designed to interpret task-specific prompts and formulate a sequence of actions or thoughts required for complex tasks.
- Efficient Deployment: As a smaller, distilled model, it offers a more efficient alternative for integrating planning capabilities into applications where the full
Qwen3-32Bmight be too resource-intensive.
Training Details
The model was trained on the planner configuration of the Divij/qwen3-32b-mas-traces dataset, which comprises traces of Qwen3-32B acting as a planner. Training utilized an assistant-only loss function, AdamW optimizer, and a constant learning rate of 1e-5 with 3% warmup, over 10 epochs. It supports a sequence length of 8192 tokens with sequence packing.
Usage Notes
For optimal performance, sampling should be used during inference rather than greedy decoding, as small distilled models can exhibit looping behavior in <think> blocks under greedy decoding. The model expects a task-specific prompt formatted for a planner role.