Name: distil-labs/distil-qwen3-0.6b-voice-assistant-banking API
Brand: Featherless.ai
Price: 10.00 USD
Availability: InStock
Author: distil-labs

Model Overview

Distil-Qwen3-0.6B-Voice-Assistant-Banking is a compact 0.6-billion-parameter model built on the Qwen3 architecture and fine-tuned by Distil Labs for banking voice assistant applications. It targets multi-turn intent classification and slot extraction, both central to conversational AI in financial services.

Key Capabilities & Performance

High Accuracy: Achieves 90.9% tool call accuracy, outperforming both its 120B-parameter teacher model (87.5%) and the base Qwen3-0.6B model (48.7%).
Extreme Efficiency: Because of its small size, inference takes approximately 40 ms, which makes it suitable for real-time voice pipelines with a total latency under 400 ms.
Specialized Functionality: Designed to act as a function caller, parsing user utterances (including those with ASR errors) and conversation history to output structured tool calls for 14 specific banking operations.
Knowledge Distillation: Trained using knowledge distillation from a much larger 120B teacher model, allowing it to retain high performance in a significantly smaller footprint.
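
To illustrate the function-caller role described above, here is a minimal Python sketch of how an application might parse a structured tool call and route it to a handler. The JSON shape (`name` plus `arguments`), the two tool names, and their slots are assumptions made for illustration only; the actual 14 operations and the output format are defined in the model card, not here.

```python
import json
from typing import Callable

# Hypothetical handlers: stand-ins for two of the 14 banking operations.
def check_balance(account_type: str) -> str:
    return f"Looking up balance for {account_type} account"

def transfer_money(amount: float, to_account: str) -> str:
    return f"Transferring {amount:.2f} to {to_account}"

# Bounded intent taxonomy: only names listed here can be executed.
HANDLERS: dict[str, Callable[..., str]] = {
    "check_balance": check_balance,
    "transfer_money": transfer_money,
}

def dispatch_tool_call(raw_output: str) -> str:
    """Parse the model's text output (assumed to be JSON) and route it."""
    try:
        call = json.loads(raw_output)
    except json.JSONDecodeError:
        return "fallback: could not parse tool call"
    if not isinstance(call, dict):
        return "fallback: tool call is not an object"

    name = call.get("name")
    args = call.get("arguments", {})
    if not isinstance(name, str) or name not in HANDLERS:
        return "fallback: unknown tool"
    if not isinstance(args, dict):
        return "fallback: arguments are not an object"

    try:
        return HANDLERS[name](**args)
    except (TypeError, ValueError):
        # Slots were missing, unexpected, or of the wrong type.
        return "fallback: invalid slots"
```

Rejecting anything outside the handler table keeps a misparsed or ASR-damaged utterance from triggering an unintended operation; a real deployment would likely ask the user to rephrase instead of returning a fallback string.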

Ideal Use Cases

Real-time Banking Voice Assistants: Powers full ASR -> SLM -> TTS pipelines for immediate responses.
Text-based Banking Chatbots: Provides structured intent routing for automated customer service.
Edge Deployment: Suitable for on-device voice processing due to its small size and high efficiency.
Multi-turn Tool Calling: Effective for any bounded intent taxonomy requiring accurate function calling based on conversational context.
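
The ASR -> SLM -> TTS pipeline mentioned above can be sketched as three pluggable stages with per-stage timing checked against the 400 ms budget. This is a rough sketch under stated assumptions: `run_turn` and the three stub stages are hypothetical names, and a real system would replace the stubs with an actual speech recognizer, a call to the model, and a speech synthesizer.

```python
import time
from typing import Callable

LATENCY_BUDGET_MS = 400.0  # end-to-end target quoted in the listing

def run_turn(
    audio: bytes,
    history: list[str],
    asr: Callable[[bytes], str],
    slm: Callable[[str, list[str]], str],
    tts: Callable[[str], bytes],
) -> dict:
    """Run one voice turn and record how long each stage took."""
    timings: dict[str, float] = {}

    t0 = time.perf_counter()
    text = asr(audio)                  # speech -> (possibly noisy) text
    timings["asr_ms"] = (time.perf_counter() - t0) * 1000

    t0 = time.perf_counter()
    tool_call = slm(text, history)     # text + history -> structured tool call
    timings["slm_ms"] = (time.perf_counter() - t0) * 1000

    t0 = time.perf_counter()
    speech = tts(f"Handling request: {tool_call}")
    timings["tts_ms"] = (time.perf_counter() - t0) * 1000

    history.append(text)               # keep context for the next turn
    total_ms = sum(timings.values())
    return {
        "tool_call": tool_call,
        "speech": speech,
        "timings": timings,
        "within_budget": total_ms <= LATENCY_BUDGET_MS,
    }

# Stub stages (assumptions) so the sketch runs without any models installed.
def fake_asr(audio: bytes) -> str:
    return "whats my balance"

def fake_slm(text: str, history: list[str]) -> str:
    return '{"name": "check_balance", "arguments": {}}'

def fake_tts(text: str) -> bytes:
    return text.encode("utf-8")
```

With roughly 40 ms spent in the model stage, most of the 400 ms budget is left for speech recognition and synthesis, which is why timing each stage separately is useful when tuning such a pipeline.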

Full Model Card (README)