STEVENZHANG904/Qwen3-0.6B-planner-sft

Hugging Face
TEXT GENERATIONConcurrency Cost:1Model Size:0.8BQuant:BF16Ctx Length:32kPublished:May 8, 2026License:apache-2.0Architecture:Transformer Open Weights Warm

STEVENZHANG904/Qwen3-0.6B-planner-sft is a 0.8 billion parameter model, SFT-finetuned from Qwen/Qwen3-0.6B. It is specifically designed to act as a 'planner' agent within a multi-agent system, distilled from traces of Qwen3-32B's planning behavior. This model excels at generating planning steps and reasoning blocks, making it suitable for task orchestration in AI pipelines.

Loading preview...

Model Overview

STEVENZHANG904/Qwen3-0.6B-planner-sft is a specialized 0.8 billion parameter language model, fine-tuned from the Qwen/Qwen3-0.6B base model. Its primary function is to serve as a planner agent within multi-agent systems, learning to emulate the planning capabilities of the larger Qwen3-32B model.

Key Capabilities

  • Agentic Planning: Distilled to generate planning steps and reasoning blocks, specifically emitting <think>...</think> structures, similar to its larger teacher model.
  • Task Orchestration: Designed to interpret task-specific prompts and formulate a sequence of actions or thoughts required for complex tasks.
  • Efficient Deployment: As a smaller, distilled model, it offers a more efficient alternative for integrating planning capabilities into applications where the full Qwen3-32B might be too resource-intensive.

Training Details

The model was trained on the planner configuration of the Divij/qwen3-32b-mas-traces dataset, which comprises traces of Qwen3-32B acting as a planner. Training utilized an assistant-only loss function, AdamW optimizer, and a constant learning rate of 1e-5 with 3% warmup, over 10 epochs. It supports a sequence length of 8192 tokens with sequence packing.

Usage Notes

For optimal performance, sampling should be used during inference rather than greedy decoding, as small distilled models can exhibit looping behavior in <think> blocks under greedy decoding. The model expects a task-specific prompt formatted for a planner role.