iq28/Qwen2.5-3B-Instruct

Task: Text Generation · Concurrency Cost: 1 · Model Size: 3.1B · Quantization: BF16 · Context Length: 32k · Published: Mar 22, 2026 · License: qwen-research · Architecture: Transformer · Status: Warm

Qwen2.5-3B-Instruct is a 3.09-billion-parameter instruction-tuned causal language model developed by Qwen as part of the Qwen2.5 series. The model supports a 32,768-token context length and is optimized for knowledge, coding, and mathematics. It brings significant improvements in instruction following, long text generation, structured data understanding, and multilingual support across more than 29 languages.


Overview

iq28/Qwen2.5-3B-Instruct is an instruction-tuned causal language model from the Qwen2.5 series, developed by Qwen. This repository hosts the 3.09-billion-parameter version, part of a family of models ranging from 0.5 to 72 billion parameters. It builds on the Qwen2 architecture: a transformer with RoPE, SwiGLU, RMSNorm, attention QKV bias, and tied word embeddings. It supports a 32,768-token context length and can generate up to 8,192 tokens.
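
For orientation, here is a minimal sketch of loading this checkpoint with the Hugging Face transformers library and running a single chat turn. The repository id is taken from this page; the prompt and the max_new_tokens value are illustrative only.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "iq28/Qwen2.5-3B-Instruct"

# Load weights and tokenizer; device_map="auto" places layers on available devices.
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype="auto",
    device_map="auto",
)
tokenizer = AutoTokenizer.from_pretrained(model_name)

# Format a chat turn with the tokenizer's built-in chat template.
messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Write a short haiku about autumn."},  # illustrative prompt
]
text = tokenizer.apply_chat_template(
    messages, tokenize=False, add_generation_prompt=True
)
inputs = tokenizer([text], return_tensors="pt").to(model.device)

# Generate a reply (the model supports up to 8,192 new tokens; 512 is plenty here).
output_ids = model.generate(**inputs, max_new_tokens=512)

# Decode only the newly generated tokens, not the echoed prompt.
response = tokenizer.decode(
    output_ids[0][inputs.input_ids.shape[-1]:], skip_special_tokens=True
)
print(response)
```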

Key Capabilities

  • Enhanced Knowledge & Specialized Domains: Significantly improved capabilities in coding and mathematics, leveraging specialized expert models.
  • Instruction Following & Output Generation: Offers substantial improvements in instruction following, generating long texts (over 8K tokens), understanding structured data (like tables), and producing structured outputs, particularly JSON (see the sketch after this list).
  • Robustness: More resilient to diverse system prompts, enhancing role-play and condition-setting for chatbots.
  • Multilingual Support: Provides support for over 29 languages, including major global languages such as Chinese, English, French, Spanish, German, Japanese, and Korean.
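
The structured-output point is easiest to see in code. Below is a hedged sketch reusing the model and tokenizer from the quickstart above; extract_json is a hypothetical helper (not part of any library API), and the schema and prompt are illustrative.

```python
import json

def extract_json(model, tokenizer, task: str) -> dict:
    """Ask the model for a strict-JSON reply and parse it before use."""
    messages = [
        {
            "role": "system",
            "content": "Respond with a single valid JSON object and nothing else.",
        },
        {"role": "user", "content": task},
    ]
    text = tokenizer.apply_chat_template(
        messages, tokenize=False, add_generation_prompt=True
    )
    inputs = tokenizer([text], return_tensors="pt").to(model.device)
    output_ids = model.generate(**inputs, max_new_tokens=256)
    raw = tokenizer.decode(
        output_ids[0][inputs.input_ids.shape[-1]:], skip_special_tokens=True
    )
    # json.loads raises json.JSONDecodeError if the model strayed from pure JSON.
    return json.loads(raw)

# Illustrative call: pull fields out of free text as a JSON object.
result = extract_json(
    model,
    tokenizer,
    'Extract {"city": ..., "population": ...} as JSON from: '
    '"Berlin has roughly 3.7 million residents."',
)
print(result)
```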

When to Use This Model

This model is particularly well-suited for applications requiring:

  • Instruction-based tasks: Where precise adherence to instructions is critical.
  • Code and Math-intensive applications: Benefiting from its specialized domain improvements.
  • Long-form content generation: With its ability to generate texts of over 8,000 tokens (see the sketch after this list).
  • Multilingual interactions: Supporting a broad range of languages for global applications.
  • Structured data processing: Including understanding tables and generating JSON outputs.
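
To illustrate the long-form point above, here is a short sketch that reuses the model and tokenizer from the quickstart and requests up to the documented 8,192-token output budget; the prompt and sampling settings are illustrative, not recommended defaults.

```python
# Long-form generation within the documented limits: up to 8,192 new tokens
# inside the 32,768-token context window.
messages = [
    {"role": "user", "content": "Write a detailed design document for a CLI task manager."}
]
text = tokenizer.apply_chat_template(
    messages, tokenize=False, add_generation_prompt=True
)
inputs = tokenizer([text], return_tensors="pt").to(model.device)

output_ids = model.generate(
    **inputs,
    max_new_tokens=8192,  # the model's stated maximum generation length
    do_sample=True,       # sampling values below are illustrative
    temperature=0.7,
    top_p=0.8,
)
print(tokenizer.decode(
    output_ids[0][inputs.input_ids.shape[-1]:], skip_special_tokens=True
))
```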