unsloth/Qwen2.5-72B-Instruct

Public · 72.7B parameters · FP8 · 131,072-token context · License: other · Hugging Face
Overview

Qwen2.5-72B-Instruct: Enhanced Performance and Long Context

This model is the instruction-tuned, 72.7-billion-parameter member of the Qwen2.5 series, developed by the Qwen team. It builds on the Qwen2 architecture with substantial improvements across several key areas, making it a strong general-purpose model for diverse applications.

Key Capabilities

  • Enhanced Knowledge & Reasoning: Significantly more knowledge, with greatly improved coding and mathematics capabilities thanks to specialized expert models used during training.
  • Instruction Following & Generation: Offers better instruction following, improved generation of long texts (up to 8K tokens), and enhanced understanding of structured data (e.g., tables).
  • Structured Output: Excels at generating structured outputs, particularly JSON, and is more resilient to diverse system prompts, which helps with role-play and chatbot condition-setting (see the usage sketch after this list).
  • Extended Context Length: Supports a context window of up to 131,072 tokens and generation of up to 8,192 tokens; inputs beyond 32,768 tokens are handled with YaRN rope scaling (see the configuration sketch after this list).
  • Multilingual Support: Provides robust support for over 29 languages, including Chinese, English, French, Spanish, Portuguese, German, Italian, Russian, Japanese, Korean, Vietnamese, Thai, and Arabic.
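A minimal usage sketch with the Hugging Face transformers chat workflow; the prompt, generation settings, and JSON-only system instruction are illustrative and not taken from the upstream model card:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "unsloth/Qwen2.5-72B-Instruct"

tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype="auto",   # use the dtype stored in the checkpoint
    device_map="auto",    # shard the 72.7B parameters across available GPUs
)

# Exercise the structured-output capability by asking for a JSON-only reply.
messages = [
    {"role": "system", "content": "You are a helpful assistant. Reply only with valid JSON."},
    {"role": "user", "content": "List three prime numbers as a JSON array under the key \"primes\"."},
]
prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output_ids = model.generate(**inputs, max_new_tokens=256)

# Decode only the newly generated tokens, not the echoed prompt.
response = tokenizer.decode(output_ids[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True)
print(response)
```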
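For inputs longer than 32,768 tokens, the upstream Qwen2.5 documentation recommends enabling YaRN rope scaling. Below is a sketch of one way to do this through the model config; the factor of 4.0 and the 32,768-token base length follow that recommendation, and the exact settings should be verified against the upstream card:

```python
from transformers import AutoConfig, AutoModelForCausalLM

model_name = "unsloth/Qwen2.5-72B-Instruct"

# Enable YaRN so the model can attend over inputs up to ~131,072 tokens.
config = AutoConfig.from_pretrained(model_name)
config.rope_scaling = {
    "type": "yarn",
    "factor": 4.0,                              # 4.0 x 32,768 = 131,072 tokens
    "original_max_position_embeddings": 32768,  # native training context
}

model = AutoModelForCausalLM.from_pretrained(
    model_name,
    config=config,
    torch_dtype="auto",
    device_map="auto",
)
```

Static YaRN scaling applies to all inputs regardless of length, so it is usually enabled only when long-context processing is actually needed.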

Good For

  • Applications requiring strong coding and mathematical reasoning.
  • Tasks involving complex instruction following and structured data processing.
  • Generating long-form content or structured outputs (e.g., JSON).
  • Multilingual applications needing broad language coverage.
  • Scenarios where long context understanding is critical.