tuandunghcmut/qwen3-4b-planner-v1
Text Generation | Concurrency Cost: 1 | Model Size: 4B | Quant: BF16 | Ctx Length: 32k | Published: May 20, 2026 | License: apache-2.0 | Architecture: Transformer | Open Weights | Warm
tuandunghcmut/qwen3-4b-planner-v1 is a 4 billion parameter LoRA fine-tune of Qwen/Qwen3-4B-Instruct-2507, specifically optimized for multi-agent planning and tool-calling. This model excels at generating structured JSON plans and MAS orchestrator YAML, supporting native Hermes-style tool call outputs. It is designed for applications requiring robust, structured planning capabilities from a compact language model.
Overview
tuandunghcmut/qwen3-4b-planner-v1 is a 4 billion parameter model, fine-tuned with LoRA from the Qwen/Qwen3-4B-Instruct-2507 base model. It specializes in multi-agent planning and tool-calling, and emits structured outputs (JSON plans, orchestrator YAML, and tool calls) for task orchestration.
Key Capabilities
- Structured Planning Output: Generates structured JSON plans (e.g., `{"tasks": [...]}`) for short-prompt planning scenarios.
- MAS Orchestration: Produces MAS orchestrator YAML (`action: plan|clarify|done`) for canonical long-prompt planning styles.
- Native Tool-Calling: Supports Hermes-style `<tool_call>` output, with arguments as direct JSON objects, avoiding double-encoding issues.
- Training Data: Fine-tuned on a diverse 500k-row dataset including `multiformat`, `toucan_qwen3`, `nemotron_chat_if`, `planner_v01_full`, `duy_vhb_style_json`, and `nemotron_structured_outputs`.
- Accessibility: Available as a merged Hugging Face model, a LoRA adapter for integration with the base model, and various GGUF quantizations (f16, q8_0, q4_k_m) for flexible deployment.
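To illustrate the native tool-calling point above, here is a minimal sketch of how a client might extract Hermes-style tool calls from generated text. It assumes the common Hermes convention of a JSON object with `name` and `arguments` keys wrapped in `<tool_call>...</tool_call>` tags; the key names, the closing tag, and the `get_weather` tool are assumptions for illustration, not details confirmed by this model card.

```python
import json
import re

# Assumed Hermes-style convention: one JSON object per <tool_call>...</tool_call> pair.
TOOL_CALL_RE = re.compile(r"<tool_call>\s*(.*?)\s*</tool_call>", re.DOTALL)


def parse_tool_calls(text):
    """Return a list of {"name": ..., "arguments": ...} dicts found in text."""
    calls = []
    for raw in TOOL_CALL_RE.findall(text):
        call = json.loads(raw)
        args = call.get("arguments", {})
        # Arguments are expected to be a direct JSON object. As a defensive
        # fallback, decode them once more if they arrive as a JSON string
        # (the double-encoding case).
        if isinstance(args, str):
            args = json.loads(args)
        calls.append({"name": call["name"], "arguments": args})
    return calls


# Hypothetical model output; "get_weather" is an illustrative tool name.
sample = (
    "Let me check.\n"
    '<tool_call>\n{"name": "get_weather", "arguments": {"city": "Hanoi"}}\n</tool_call>'
)
calls = parse_tool_calls(sample)
```

Because the arguments arrive as a JSON object rather than a JSON-encoded string, `calls[0]["arguments"]` can be passed straight to the tool function as keyword arguments without a second decode.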
Good For
- Agentic Workflows: Ideal for developing applications that require LLMs to break down user requests into actionable, structured plans for multiple agents.
- Tool Use Integration: Excellent for scenarios where precise, natively formatted tool calls are crucial for interacting with external systems or APIs.
- Resource-Constrained Environments: With its 4B parameters and GGUF quantizations, it's suitable for deployment on edge devices or systems with limited computational resources, while still providing specialized planning capabilities.
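For the agentic workflow case, a consumer would typically validate the plan before dispatching tasks to agents. The sketch below checks only the top-level `{"tasks": [...]}` shape mentioned in this card; the per-task fields (`id`, `agent`) are hypothetical, since the task schema is not specified here.

```python
import json


def load_plan(output):
    """Parse model output and return the task list, or raise ValueError."""
    # json.loads raises json.JSONDecodeError (a ValueError subclass) on bad JSON.
    plan = json.loads(output)
    if not isinstance(plan, dict) or not isinstance(plan.get("tasks"), list):
        raise ValueError("expected a JSON object with a 'tasks' list")
    return plan["tasks"]


# Hypothetical plan; the "id" and "agent" fields are illustrative only.
tasks = load_plan(
    '{"tasks": [{"id": 1, "agent": "search"}, {"id": 2, "agent": "writer"}]}'
)
for task in tasks:
    # A real orchestrator would route each task to the named agent here.
    print(task)
```

Treating each task as an opaque dictionary keeps the check valid whatever fields the model actually emits; stricter schema validation can be layered on once the task format is known.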