123-cao/Qwen2-0.5B-Instruct

  • Source: Hugging Face
  • Task: Text Generation
  • Concurrency Cost: 1
  • Model Size: 0.5B
  • Quantization: BF16
  • Context Length: 32k
  • Published: Mar 16, 2026
  • License: apache-2.0
  • Architecture: Transformer
  • Tags: Open Weights, Warm

Qwen2-0.5B-Instruct is a 0.5 billion parameter instruction-tuned causal language model from the Qwen2 series, developed by Qwen. Built on a Transformer architecture with SwiGLU activation and grouped query attention, it features an improved tokenizer adaptive to multiple natural languages and code. The model demonstrates competitive benchmark performance in language understanding, generation, coding, mathematics, and reasoning, making it suitable for a wide range of general-purpose conversational AI applications.


Qwen2-0.5B-Instruct Overview

Qwen2-0.5B-Instruct is an instruction-tuned model from the new Qwen2 series, developed by Qwen. This 0.5 billion parameter model is part of a larger family that includes various sizes and a Mixture-of-Experts model. It is built upon a Transformer architecture incorporating features like SwiGLU activation, attention QKV bias, and group query attention, alongside an enhanced tokenizer designed for multiple natural languages and code.
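The gating pattern behind SwiGLU is easy to see in code. Below is a minimal PyTorch sketch of a SwiGLU feed-forward block of the kind used in Qwen2-style Transformers; the class name and the dimensions are illustrative assumptions, not the model's actual configuration.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SwiGLUFeedForward(nn.Module):
    """Illustrative SwiGLU MLP block (names and sizes are hypothetical)."""

    def __init__(self, hidden_size: int, intermediate_size: int):
        super().__init__()
        # Two parallel projections: one gated through SiLU, one linear.
        self.gate_proj = nn.Linear(hidden_size, intermediate_size, bias=False)
        self.up_proj = nn.Linear(hidden_size, intermediate_size, bias=False)
        self.down_proj = nn.Linear(intermediate_size, hidden_size, bias=False)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # SwiGLU: silu(gate(x)) * up(x), then project back to the hidden size.
        return self.down_proj(F.silu(self.gate_proj(x)) * self.up_proj(x))

# Shapes round-trip: (batch, seq, hidden) in and out.
block = SwiGLUFeedForward(hidden_size=64, intermediate_size=176)
out = block(torch.randn(2, 8, 64))  # -> torch.Size([2, 8, 64])
```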

Key Capabilities & Performance

Qwen2 models, including this instruction-tuned version, have shown strong performance against other open-source and proprietary models across diverse benchmarks. For the 0.5B-Instruct variant, notable improvements over its predecessor, Qwen1.5-0.5B-Chat, include:

  • MMLU: Improved from 35.0 to 37.9
  • HumanEval (Coding): Significantly increased from 9.1 to 17.1
  • GSM8K (Mathematics): Substantially better, moving from 11.3 to 40.1
  • C-Eval: Enhanced from 37.2 to 45.2

These metrics highlight its enhanced capabilities in reasoning, coding, and general language understanding, making it a versatile choice for various applications.

Training and Usage

The model was pretrained on a large dataset and further refined with supervised finetuning and direct preference optimization. Quick integration requires transformers>=4.37.0; the model can be loaded and used for text generation with a short Python snippet (see the sketch below), with chat templating for structured conversations.
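A minimal sketch of such a snippet, following the standard transformers chat-template workflow; the upstream repo id Qwen/Qwen2-0.5B-Instruct and the example prompt are assumptions here rather than details from this listing.

```python
# Minimal text-generation sketch (requires transformers>=4.37.0).
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "Qwen/Qwen2-0.5B-Instruct"  # assumed upstream repo id; swap in a mirror if needed
model = AutoModelForCausalLM.from_pretrained(model_name, torch_dtype="auto", device_map="auto")
tokenizer = AutoTokenizer.from_pretrained(model_name)

# Chat templating: structure the conversation, then render it to a prompt string.
messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Give me a short introduction to large language models."},
]
text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)

inputs = tokenizer([text], return_tensors="pt").to(model.device)
generated = model.generate(**inputs, max_new_tokens=256)

# Strip the prompt tokens so only the model's reply is decoded.
reply_ids = generated[0][inputs.input_ids.shape[1]:]
print(tokenizer.decode(reply_ids, skip_special_tokens=True))
```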