painslane/Qwen2-0.5B-Instruct
painslane/Qwen2-0.5B-Instruct is a 0.5-billion-parameter instruction-tuned causal language model from the Qwen2 series, developed by Qwen. Built on the Transformer architecture, it features SwiGLU activation, attention QKV bias, and group query attention, along with an improved tokenizer for multilingual and code support. The model targets a broad range of tasks including language understanding, generation, multilingual use, coding, mathematics, and reasoning, and demonstrates competitive performance against other open-source models.
Qwen2-0.5B-Instruct Overview
This model is the instruction-tuned 0.5 billion parameter variant from the new Qwen2 series of large language models, developed by Qwen. Qwen2 models are built on the Transformer architecture, incorporating features like SwiGLU activation, attention QKV bias, and group query attention, alongside an enhanced tokenizer designed for multiple natural languages and code.
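The group query attention mentioned above lets several query heads share a single key/value head, shrinking the KV cache relative to standard multi-head attention. A toy NumPy sketch of the idea, with hypothetical head counts and dimensions (not the actual Qwen2-0.5B config):

```python
import numpy as np

# Toy grouped-query attention: 8 query heads share 2 KV heads
# (hypothetical sizes for illustration, not Qwen2-0.5B's real config).
rng = np.random.default_rng(0)

n_q_heads, n_kv_heads, seq, head_dim = 8, 2, 4, 16
group = n_q_heads // n_kv_heads  # 4 query heads per KV head

q = rng.standard_normal((n_q_heads, seq, head_dim))
k = rng.standard_normal((n_kv_heads, seq, head_dim))
v = rng.standard_normal((n_kv_heads, seq, head_dim))

# Broadcast the smaller KV set across the query-head groups.
k_rep = np.repeat(k, group, axis=0)  # (n_q_heads, seq, head_dim)
v_rep = np.repeat(v, group, axis=0)

# Scaled dot-product attention per head, followed by softmax.
scores = q @ k_rep.transpose(0, 2, 1) / np.sqrt(head_dim)
weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
weights /= weights.sum(axis=-1, keepdims=True)
out = weights @ v_rep

print(out.shape)  # one output vector per query head per position
```

The payoff is that only `n_kv_heads` key/value tensors need to be cached during generation, a 4x reduction here versus caching one pair per query head.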
Key Capabilities & Performance
Qwen2 models, including this 0.5B instruction-tuned version, have shown strong performance across various benchmarks, often surpassing the previous Qwen1.5 models and other open-source alternatives. This model is designed for a wide array of tasks, including:
- Language Understanding and Generation
- Multilingual Capabilities
- Coding and Mathematics
- Reasoning
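To exercise these capabilities, prompts for the instruction-tuned model are formatted as chat turns. A minimal sketch of the ChatML-style template used by Qwen chat models, written out by hand for clarity; in practice `tokenizer.apply_chat_template` from the transformers library builds this string for you:

```python
# Hand-rolled sketch of the ChatML-style prompt format used by Qwen chat
# models. Normally tokenizer.apply_chat_template does this; shown here
# only to make the wire format visible.
def build_chatml_prompt(messages):
    parts = []
    for m in messages:
        parts.append(f"<|im_start|>{m['role']}\n{m['content']}<|im_end|>\n")
    # Trailing assistant header cues the model to generate its reply.
    parts.append("<|im_start|>assistant\n")
    return "".join(parts)

prompt = build_chatml_prompt([
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Give me a short introduction to large language models."},
])
print(prompt)
```

Each turn is delimited by `<|im_start|>` / `<|im_end|>` special tokens, and generation stops when the model emits the next `<|im_end|>`.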
Comparative evaluation against Qwen1.5-0.5B-Chat highlights significant improvements:
- MMLU: 37.9 (vs 35.0)
- HumanEval: 17.1 (vs 9.1)
- GSM8K: 40.1 (vs 11.3)
- C-Eval: 45.2 (vs 37.2)
- IFEval (Prompt Strict-Acc.): 20.0 (vs 14.6)
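The table above can be summarized as absolute and relative gains; a short script over the listed scores:

```python
# Benchmark scores copied from the comparison above
# (Qwen2-0.5B-Instruct vs Qwen1.5-0.5B-Chat).
scores = {
    "MMLU": (37.9, 35.0),
    "HumanEval": (17.1, 9.1),
    "GSM8K": (40.1, 11.3),
    "C-Eval": (45.2, 37.2),
    "IFEval (Prompt Strict-Acc.)": (20.0, 14.6),
}

for name, (new, old) in scores.items():
    delta = new - old
    print(f"{name}: +{delta:.1f} points ({delta / old:+.0%} relative)")
```

The largest jump is on GSM8K, where the score more than triples over the Qwen1.5 baseline.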
Training Details
The model was pretrained on a large dataset and then post-trained with both supervised fine-tuning and direct preference optimization (DPO) to strengthen its instruction-following abilities.
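Direct preference optimization trains directly on preference pairs (a chosen and a rejected response) without a separate reward model. A minimal sketch of the standard DPO loss on a single pair, using made-up log-probability values; this illustrates the published DPO objective, not Qwen's specific training code:

```python
import math

# Standard DPO loss on one preference pair:
#   -log sigmoid( beta * [(log p(chosen) - log p(rejected))_policy
#                        - (log p(chosen) - log p(rejected))_reference] )
def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    policy_logratio = policy_chosen_logp - policy_rejected_logp
    ref_logratio = ref_chosen_logp - ref_rejected_logp
    margin = beta * (policy_logratio - ref_logratio)
    return -math.log(1.0 / (1.0 + math.exp(-margin)))  # -log sigmoid

# Made-up log-probs: the loss shrinks as the policy prefers the chosen
# response more strongly than the frozen reference model does.
loss_weak = dpo_loss(-10.0, -10.0, -10.0, -10.0)   # no learned preference
loss_strong = dpo_loss(-5.0, -15.0, -10.0, -10.0)  # clear preference
print(loss_weak, loss_strong)
```

With no learned preference the loss sits at -log(0.5) ≈ 0.693, and it decreases as the policy's margin over the reference grows, which is what pushes the model toward the preferred responses.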
Use Cases
Given its broad capabilities and improved performance at a compact 0.5B parameter count, this model suits applications that need efficient language processing, coding assistance, and reasoning, especially where compute or memory is constrained.
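To make the resource claim concrete, a back-of-envelope estimate of the weight memory for a 0.5B-parameter model at common precisions (weights only; activations and the KV cache add overhead on top):

```python
# Rough weight-only memory footprint for ~0.5e9 parameters.
# Real usage is higher: activations, KV cache, and framework overhead.
params = 0.5e9

for precision, bytes_per_param in [("fp32", 4), ("fp16/bf16", 2), ("int8", 1)]:
    gb = params * bytes_per_param / 1e9
    print(f"{precision}: ~{gb:.1f} GB")
```

At fp16/bf16 the weights fit in about 1 GB, which is why this size class runs comfortably on consumer GPUs and many CPUs.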