unsloth/Qwen2.5-0.5B-Instruct

0.5B parameters · BF16 · 32,768-token context · License: apache-2.0
Overview

unsloth/Qwen2.5-0.5B-Instruct is a 0.49-billion-parameter instruction-tuned causal language model from the Qwen2.5 series, developed by Qwen. It uses a transformer architecture with RoPE, SwiGLU, RMSNorm, attention QKV bias, and tied word embeddings, supports a full context length of 32,768 tokens, and can generate up to 8,192 tokens. Compared with Qwen2, it delivers improved performance across several key areas.

Key Capabilities

  • Expanded Knowledge & Skills: Significantly improved capabilities in coding and mathematics, leveraging specialized expert models.
  • Instruction Following: Enhanced ability to follow instructions and generate structured outputs, particularly JSON.
  • Long Text Generation: Improved performance in generating texts over 8,000 tokens.
  • Structured Data Understanding: Better comprehension of structured data, such as tables.
  • Multilingual Support: Supports over 29 languages, including Chinese, English, French, Spanish, Portuguese, German, Italian, Russian, Japanese, Korean, Vietnamese, Thai, and Arabic.
  • Robustness: More resilient to diverse system prompts, aiding in role-play and chatbot condition-setting.
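The instruction-following and system-prompt robustness above rest on the ChatML-style chat template that Qwen2.5-Instruct models are trained with. As a rough illustration of that layout (in real use you would call `tokenizer.apply_chat_template` from Hugging Face transformers, which renders this format for you), a minimal sketch:

```python
# Illustrative sketch of the ChatML-style prompt layout used by
# Qwen2.5-Instruct models. In practice, prefer
# tokenizer.apply_chat_template(), which produces this for you.

def build_chatml_prompt(messages):
    """Render a list of {"role", "content"} dicts into a ChatML prompt."""
    parts = []
    for msg in messages:
        parts.append(f"<|im_start|>{msg['role']}\n{msg['content']}<|im_end|>\n")
    # Trailing assistant header signals the model to begin its reply.
    parts.append("<|im_start|>assistant\n")
    return "".join(parts)

messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Summarize this table as JSON."},
]
prompt = build_chatml_prompt(messages)
```

The system turn at the start is what the "condition-setting" capability refers to: role-play personas and behavioral constraints are placed there and persist across the conversation.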

Good For

  • Applications requiring strong instruction following and structured output generation.
  • Tasks involving coding and mathematical reasoning, especially in a multilingual context.
  • Chatbot implementations needing robust role-play and prompt resilience.
  • Scenarios demanding long-context understanding (up to 32K tokens) and long-form generation (up to 8K tokens).
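Because structured (JSON) output is a headline capability, a common downstream pattern is to extract and validate the JSON from the model's reply, since replies may wrap the object in prose or a markdown fence. A minimal sketch (the `reply` string here is a stand-in for actual model output, which will vary):

```python
import json

def extract_json(reply: str):
    """Pull the first JSON object out of a model reply, tolerating
    surrounding prose or markdown code fences."""
    start = reply.find("{")
    end = reply.rfind("}")
    if start == -1 or end < start:
        raise ValueError("no JSON object found in reply")
    return json.loads(reply[start : end + 1])

# Stand-in for an actual model reply; real output may differ.
reply = 'Here is the result:\n```json\n{"name": "Qwen", "params": "0.5B"}\n```'
data = extract_json(reply)
```

For production use, pairing this with schema validation (or a constrained-decoding library) gives stronger guarantees than parsing alone.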