aligokcek1/MyQwen2.5-0.5B

Hugging Face
TEXT GENERATIONConcurrency Cost:1Model Size:0.5BQuant:BF16Ctx Length:32kPublished:May 13, 2026License:apache-2.0Architecture:Transformer Open Weights Warm

The aligokcek1/MyQwen2.5-0.5B model is an instruction-tuned causal language model from the Qwen2.5 series, developed by Qwen Team. With 0.49 billion parameters and a 32,768-token context length, it significantly improves capabilities in coding, mathematics, and instruction following compared to its predecessors. This model excels at generating long texts, understanding structured data like tables, and producing structured outputs such as JSON, while also offering robust multilingual support across 29 languages.

Loading preview...

Qwen2.5-0.5B-Instruct: Enhanced Small-Scale LLM

This model is the instruction-tuned 0.5 billion parameter variant of the Qwen2.5 series, developed by the Qwen Team. It builds upon the Qwen2 architecture, featuring transformers with RoPE, SwiGLU, RMSNorm, Attention QKV bias, and tied word embeddings. With 24 layers and 14 attention heads (GQA), it processes a full context length of 32,768 tokens and can generate up to 8,192 tokens.

Key Capabilities and Improvements

  • Enhanced Knowledge & Reasoning: Significantly improved capabilities in coding and mathematics, leveraging specialized expert models.
  • Instruction Following: Demonstrates substantial advancements in adhering to instructions and generating long, coherent texts (over 8K tokens).
  • Structured Data Handling: Excels at understanding structured data, including tables, and generating structured outputs like JSON.
  • Robust System Prompt Resilience: More resilient to diverse system prompts, enhancing role-play and chatbot condition-setting.
  • Multilingual Support: Offers comprehensive support for over 29 languages, including major global languages like Chinese, English, French, Spanish, German, and Japanese.

Use Cases

This model is well-suited for applications requiring efficient, small-scale language processing with strong instruction following and multilingual capabilities. Its improvements in coding, mathematics, and structured output generation make it valuable for tasks where precise, formatted responses are crucial, even within a constrained parameter budget.