kz919/QwQ-0.5B-Distilled-SFT
The kz919/QwQ-0.5B-Distilled-SFT model, developed by Kaizhao Liang, is a 0.5 billion parameter causal language model built on Qwen/Qwen2-0.5B-Instruct and distilled from Qwen/QwQ-32B-Preview as a teacher. It features a 131,072-token context length and is fine-tuned for conversational AI tasks that require step-by-step reasoning. The model is intended for efficient deployment on edge devices while retaining strong problem-solving capabilities.
Model Overview
The kz919/QwQ-0.5B-Distilled-SFT is a 0.5 billion parameter causal language model developed by Kaizhao Liang. It is a distilled version of the Qwen/Qwen2-0.5B-Instruct base model, leveraging Qwen/QwQ-32B-Preview as the teacher model through an instruction tuning framework. This distillation process aims to transfer the reasoning capabilities of a larger model into a smaller, more efficient footprint.
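A minimal loading sketch, assuming the standard Hugging Face transformers API; the model id comes from the card, while the dtype and device settings are illustrative assumptions rather than values stated by the author:

```python
# Minimal loading sketch using the Hugging Face transformers API.
# The model id comes from the model card; dtype/device settings are illustrative.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "kz919/QwQ-0.5B-Distilled-SFT"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype="auto",   # use the dtype stored in the checkpoint
    device_map="auto",    # place the 0.5B model on GPU if one is available
)
```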
Key Capabilities
- Step-by-step Reasoning: Designed to provide detailed, logical thought processes in its responses.
- Long Context Understanding: Trained on the QwQ-LongCoT-130K dataset, which includes long-context examples for reasoning and conversational AI tasks.
- Efficient Deployment: At 0.5 billion parameters, it is suitable for applications requiring robust conversational AI in resource-constrained environments.
- Conversational AI: Optimized for generating coherent and contextually aware responses in chat-based interactions.
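Continuing from the loading snippet above, a hedged usage sketch that builds a chat prompt with the tokenizer's chat template and asks for a step-by-step answer; the system prompt and generation parameters are illustrative assumptions, not settings from the model card:

```python
# Ask for a step-by-step answer via the tokenizer's chat template.
# System prompt and generation parameters are illustrative assumptions.
messages = [
    {"role": "system", "content": "You are a helpful assistant that reasons step by step."},
    {"role": "user", "content": "A train travels 120 km in 1.5 hours. What is its average speed?"},
]

inputs = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    return_tensors="pt",
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=512, do_sample=False)
# Strip the prompt tokens and decode only the newly generated answer.
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```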
Training Details
The model was trained with a GKD (Generalized Knowledge Distillation) framework to align its predictions with the high-quality outputs of the teacher model. Training used gradient checkpointing for memory efficiency and a 90/10 train/eval split of the amphora/QwQ-LongCoT-130K dataset.
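A minimal sketch of what such a setup could look like using TRL's GKDTrainer. This is an assumption about the framework (the card only says "GKD"), and the dataset column names, hyperparameters, and chat formatting below are illustrative, not the author's actual training script:

```python
# Illustrative GKD training sketch with TRL; not the author's actual script.
from datasets import load_dataset
from trl import GKDConfig, GKDTrainer

# 90/10 split of the distillation dataset, as described above.
dataset = load_dataset("amphora/QwQ-LongCoT-130K", split="train")
dataset = dataset.train_test_split(test_size=0.1, seed=42)

def to_messages(example):
    # Column names ("problem", "qwq") are assumptions about the dataset schema.
    return {
        "messages": [
            {"role": "user", "content": example["problem"]},
            {"role": "assistant", "content": example["qwq"]},
        ]
    }

dataset = dataset.map(to_messages)

args = GKDConfig(
    output_dir="qwq-0.5b-distilled-sft",
    gradient_checkpointing=True,   # trade compute for memory, as noted above
    per_device_train_batch_size=1,
    max_new_tokens=4096,           # illustrative; long CoT outputs need a generous budget
)

trainer = GKDTrainer(
    model="Qwen/Qwen2-0.5B-Instruct",       # student: the base model named on the card
    teacher_model="Qwen/QwQ-32B-Preview",   # teacher
    args=args,
    train_dataset=dataset["train"],
    eval_dataset=dataset["test"],
)
trainer.train()
```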
Good For
- Conversational Assistants: Ideal for chatbots that need reasoning and long-context understanding.
- Educational Tools: Can provide step-by-step explanations, making it useful for learning.
- Technical Support: Capable of handling complex queries with precision.
Limitations
While efficient, its performance on highly complex reasoning tasks may not fully match that of its larger teacher model. The developer notes it is still a proof of concept and may occasionally produce nonsensical output.