Overview
The Auroic Router 0.6B, developed by Kaushal Krishna, is a specialized 0.8 billion parameter model built on Qwen3-0.6B. It serves as the front end of a conversational AI pipeline: it analyzes recent chat context and emits a structured routing decision, so downstream models are activated only when necessary and resource usage stays low.
Key Capabilities
- Intent Classification: Processes a 5-message history window and up to 3 unprocessed candidate messages to determine the appropriate action.
- Structured Output: Generates a single routing decision in a defined format, specifying TYPE (text, react, media, ignore), TARGET message, and additional fields like EFFORT or TITLE.
- Low Latency: Designed for edge deployment, achieving routing decisions in under 4 seconds on CPU, with warm inference times of ~2-3 seconds.
- Contextual Reasoning: Utilizes Qwen3's native 'thinking' capability for ambiguous inputs, reasoning through context before deciding, while skipping it for obvious cases to maintain low latency.
- Hinglish Optimization: Specifically fine-tuned on a dataset of 9,300 samples of Indian group chat scenarios, with a language distribution of 58.8% Hinglish, 23% English, and 18.2% mixed.
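The input window and output fields listed above can be sketched in Python as follows. The exact prompt and output syntax of the router are not given here, so the `KEY: value` line format, the `RoutingDecision` class, and the function names are assumptions made for illustration only. The one detail taken from Qwen3 itself is that thinking output is wrapped in `<think>...</think>`, which the parser drops before reading fields.

```python
from dataclasses import dataclass, field
from typing import Optional

HISTORY_WINDOW = 5   # messages of prior context per call
MAX_CANDIDATES = 3   # unprocessed candidate messages per call
VALID_TYPES = {"text", "react", "media", "ignore"}


@dataclass
class RoutingDecision:
    type: str
    target: Optional[str] = None
    extras: dict = field(default_factory=dict)  # e.g. EFFORT, TITLE


def build_window(history, pending):
    """Keep the last 5 history messages and the first 3 candidates."""
    return history[-HISTORY_WINDOW:], pending[:MAX_CANDIDATES]


def parse_decision(raw):
    """Parse assumed 'KEY: value' lines into a RoutingDecision."""
    # Drop any Qwen3 thinking block that precedes the answer.
    answer = raw.split("</think>")[-1]
    fields = {}
    for line in answer.splitlines():
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip().upper()] = value.strip()
    decision_type = fields.pop("TYPE", "").lower()
    if decision_type not in VALID_TYPES:
        raise ValueError(f"unknown TYPE: {decision_type!r}")
    target = fields.pop("TARGET", None)
    # Whatever remains (EFFORT, TITLE, ...) is kept as extra fields.
    return RoutingDecision(decision_type, target, fields)
```

Rejecting an unrecognized TYPE with an exception, rather than silently defaulting to `ignore`, is a design choice of this sketch; a production pipeline might prefer the opposite.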
Use Cases
- Conversational AI Routing: Ideal for directing messages in group chats to specific AI modules (e.g., a text generation model for advice, a media generator for reactions).
- Resource Optimization: Reduces computational load by ensuring that more complex AI models are only invoked when a specific intent is detected.
- Real-time Chat Processing: Suitable for applications requiring rapid, structured responses to chat inputs, particularly Hinglish and Indian English group chats.
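As a rough illustration of the routing and resource-optimization use cases, the sketch below maps a decision TYPE to a downstream handler and makes no downstream call for `ignore`. The handler functions and their return strings are hypothetical stand-ins for real text, reaction, and media modules.

```python
def handle_text(target):
    # Placeholder for a call to a larger text generation model.
    return f"text-model -> {target}"


def handle_react(target):
    # Placeholder for a lightweight reaction picker.
    return f"reaction -> {target}"


def handle_media(target):
    # Placeholder for a media generator.
    return f"media-model -> {target}"


HANDLERS = {
    "text": handle_text,
    "react": handle_react,
    "media": handle_media,
}


def dispatch(decision_type, target):
    """Invoke a downstream module only when the router asks for one."""
    handler = HANDLERS.get(decision_type)
    if handler is None:
        # "ignore" (or any unrecognized type): skip the expensive models.
        return None
    return handler(target)
```

For example, `dispatch("ignore", "3")` returns `None` without touching any handler, while `dispatch("text", "2")` calls the text handler for message 2.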
Limitations
- Optimized for Hinglish and Indian English group chat patterns.
- Stateless by design; requires a fresh context for every call.
- Not intended for general-purpose tasks like factual QA, coding assistance, or long-form content generation.
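Because the router is stateless, the caller has to hold the chat history and resend the window on every call. One way to do that is sketched below, assuming a hypothetical `route(history, candidates)` callable that wraps the model; the `RouterSession` class and its method names are likewise illustrative, not part of the model's interface.

```python
from collections import deque


class RouterSession:
    """Caller-side buffer that rebuilds the router's context for each call."""

    def __init__(self, route, history_size=5, max_candidates=3):
        self.route = route                         # hypothetical model wrapper
        self.history = deque(maxlen=history_size)  # rolling context window
        self.pending = []                          # unprocessed messages
        self.max_candidates = max_candidates

    def add_message(self, message):
        self.pending.append(message)

    def step(self):
        """Route the next batch of candidates; return None if none are waiting."""
        batch = self.pending[: self.max_candidates]
        if not batch:
            return None
        # A fresh context is passed every time; nothing is kept in the model.
        decision = self.route(list(self.history), batch)
        # Processed candidates become history for the next call.
        self.history.extend(batch)
        del self.pending[: len(batch)]
        return decision
```

The `deque` with `maxlen` discards the oldest messages automatically, so the history passed to `route` never exceeds the configured window.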