AryanNsc/qwen3-0.6b-tool-router

Text Generation | Concurrency Cost: 1 | Model Size: 0.8B | Quant: BF16 | Ctx Length: 32k | Published: Dec 15, 2025 | License: apache-2.0 | Architecture: Transformer | Open Weights | Warm

AryanNsc/qwen3-0.6b-tool-router is a 0.6 billion parameter Small Language Model (SLM) derived from Qwen3-0.6B, specifically optimized for low-latency, schema-strict tool and function routing. This model is designed to act as a deterministic router in agentic systems, reliably mapping natural language queries to structured tool calls. It excels in resource-constrained edge environments due to its small size, low memory footprint, and fast cold start, making it suitable for on-device inference.


Overview of AryanNsc/qwen3-0.6b-tool-router

This model is a verticalized (task-specific) Small Language Model (SLM) built on Qwen3-0.6B and specialized for tool and function routing. Unlike general-purpose language models, it is meant to serve as a deterministic router within agentic systems, mapping natural language inputs to structured tool calls.
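To make the router role concrete, the following is a minimal sketch of how an agent might dispatch a routed call to local functions. The `route` function is a hypothetical stand-in for the model, and the `{"name": ..., "arguments": ...}` output shape and the tool names are assumptions for illustration; the model card does not specify the exact schema.

```python
import json
from typing import Callable, Dict

# Hypothetical local tools; names and signatures are illustrative only.
def get_weather(city: str) -> str:
    return f"weather for {city}"

def set_timer(minutes: int) -> str:
    return f"timer set for {minutes} min"

TOOLS: Dict[str, Callable[..., str]] = {
    "get_weather": get_weather,
    "set_timer": set_timer,
}

def route(query: str) -> str:
    """Stand-in for the router model (assumption): returns a JSON tool call."""
    if "weather" in query.lower():
        return json.dumps({"name": "get_weather", "arguments": {"city": "Paris"}})
    return json.dumps({"name": "set_timer", "arguments": {"minutes": 5}})

def dispatch(query: str) -> str:
    """Parse the routed call and invoke the matching local tool."""
    call = json.loads(route(query))
    tool = TOOLS[call["name"]]
    return tool(**call["arguments"])
```

In a real deployment, `route` would wrap an inference call to the model, and the rest of the loop would stay unchanged.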

Key Capabilities & Properties

  • Model Size: 0.6 billion parameters, small enough for resource-constrained hardware.
  • Strict JSON Output: Engineered to produce machine-consumable JSON, crucial for reliable tool invocation.
  • Low Latency & Memory: Optimized for rapid processing and minimal memory usage, supporting edge-device inference.
  • No Chain-of-Thought: Emits the tool call directly, without chain-of-thought (CoT) reasoning, which reduces output tokens and parsing overhead.
  • Fast Cold Start: Enables quick deployment and responsiveness in on-device or near-device applications.
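Even with a model tuned for strict JSON, a caller will usually want to validate the output before invoking anything. The sketch below shows one way to do that with the standard library only. The `TOOL_SCHEMAS` registry, the tool names, and the `{"name": ..., "arguments": ...}` shape are hypothetical and not taken from the model card.

```python
import json
from typing import Any, Dict, Optional

# Hypothetical registry: tool name -> required argument names and types.
TOOL_SCHEMAS: Dict[str, Dict[str, type]] = {
    "get_weather": {"city": str},
    "set_timer": {"minutes": int},
}

def parse_tool_call(raw: str) -> Optional[Dict[str, Any]]:
    """Return the parsed call if it is well-formed for a known tool, else None."""
    try:
        call = json.loads(raw)
    except json.JSONDecodeError:
        return None  # any extra prose around the JSON makes parsing fail
    if not isinstance(call, dict):
        return None
    name, args = call.get("name"), call.get("arguments")
    if not isinstance(name, str) or name not in TOOL_SCHEMAS:
        return None
    if not isinstance(args, dict):
        return None
    schema = TOOL_SCHEMAS[name]
    if set(args) != set(schema):
        return None  # missing or unexpected argument names
    for key, expected in schema.items():
        # Exact type match, so JSON true/false is not accepted as an int.
        if type(args[key]) is not expected:
            return None
    return call
```

A caller can treat `None` as a signal to retry, fall back to a larger model, or ask the user to rephrase.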

Performance Highlights

Evaluated on Berkeley Function Calling Leaderboard (BFCL) metrics, the model reports the following scores:

  • Multi-Turn Base: Achieves 90.42%
  • Relevance Detection: Scores 90.89%
  • Non-Live Parallel AST: Reaches 83.50%
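The parallel category concerns queries that require several tool calls in a single turn. As a rough illustration of what order-insensitive matching of such calls can look like, here is a simplified sketch. It is not BFCL's actual checker, and the call shape and helper names are hypothetical.

```python
import json
from typing import Any, Dict, List

def canonical(call: Dict[str, Any]) -> str:
    """Serialize a call with sorted keys so that equal calls compare equal."""
    return json.dumps(call, sort_keys=True)

def same_parallel_calls(
    predicted: List[Dict[str, Any]], expected: List[Dict[str, Any]]
) -> bool:
    """True if both lists contain the same calls, ignoring their order."""
    return sorted(map(canonical, predicted)) == sorted(map(canonical, expected))
```

Exact matching like this is stricter than a real benchmark checker is likely to be, since it accepts only one value per argument.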

Ideal Use Cases

This model suits tool-calling scenarios that demand efficiency and reliability, especially:

  • On-device assistants and local agent routers.
  • Offline-capable systems where connectivity is limited.
  • Privacy-sensitive deployments requiring local processing.