adurani/Qwen2.5-0.5B-Instruct
adurani/Qwen2.5-0.5B-Instruct is a 0.49 billion parameter instruction-tuned causal language model from the Qwen2.5 series, developed by the Qwen Team. It supports a 32,768 token context length and brings significant improvements in coding, mathematics, instruction following, and generating structured outputs such as JSON. The model offers multilingual support across more than 29 languages and improved resilience to diverse system prompts, making it well suited to chatbot applications.
Qwen2.5-0.5B-Instruct Overview
This model is the instruction-tuned 0.5 billion parameter variant of the Qwen2.5 series, developed by the Qwen Team. It builds on the Qwen2 architecture with substantial enhancements across several key areas. The model uses a transformer architecture with RoPE, SwiGLU, RMSNorm, attention QKV bias, and tied word embeddings, supports a full context length of 32,768 tokens, and can generate up to 8,192 tokens.
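A minimal quickstart sketch for chatting with the model, assuming it is available on the Hugging Face Hub under the id `adurani/Qwen2.5-0.5B-Instruct` and that the `transformers` library (with a PyTorch backend) is installed; the helper function names here are illustrative, not part of any official API:

```python
MODEL_ID = "adurani/Qwen2.5-0.5B-Instruct"  # assumed Hub id


def build_messages(system: str, user: str) -> list:
    """Assemble a chat history in the role/content format used by chat templates."""
    return [
        {"role": "system", "content": system},
        {"role": "user", "content": user},
    ]


def generate(prompt: str, system: str = "You are a helpful assistant.") -> str:
    """Load the model and return a single completion for one user turn."""
    # Heavy dependencies are imported lazily so the helpers above stay lightweight.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(
        MODEL_ID, torch_dtype="auto", device_map="auto"
    )
    # apply_chat_template renders the messages into the model's prompt format.
    text = tokenizer.apply_chat_template(
        build_messages(system, prompt), tokenize=False, add_generation_prompt=True
    )
    inputs = tokenizer([text], return_tensors="pt").to(model.device)
    # The model can generate up to 8,192 tokens; keep this demo short.
    output_ids = model.generate(**inputs, max_new_tokens=256)
    # Drop the echoed prompt tokens before decoding the reply.
    reply_ids = output_ids[0][inputs.input_ids.shape[-1]:]
    return tokenizer.decode(reply_ids, skip_special_tokens=True)
```

Calling `generate("Briefly explain what RoPE is.")` downloads the weights on first use and returns the model's reply as a plain string; the system prompt can be swapped out to exercise the role-play and condition-setting behavior described below.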
Key Capabilities
- Enhanced Knowledge & Reasoning: Significantly improved in coding and mathematics, thanks to specialized expert models used in these domains during training.
- Instruction Following: Demonstrates notable advancements in adhering to instructions and generating long texts (over 8K tokens).
- Structured Data Handling: Excels at understanding structured data, such as tables, and generating structured outputs, particularly JSON.
- Multilingual Support: Offers robust support for over 29 languages, including Chinese, English, French, Spanish, and more.
- Chatbot Resilience: More resilient to diverse system prompts, improving role-play implementation and condition-setting for chatbots.
Good For
- Applications requiring strong instruction following and structured output generation.
- Multilingual chatbots and assistants needing broad language support.
- Tasks involving coding, mathematics, and long-text generation in a compact model size.