Qwen/Qwen2.5-72B-Instruct
Hugging Face
Text Generation · Concurrency Cost: 4 · Model Size: 72.7B · Quant: FP8 · Context Length: 32k · Published: Sep 16, 2024 · License: qwen · Architecture: Transformer

Qwen2.5-72B-Instruct is a 72.7 billion parameter instruction-tuned causal language model developed by Qwen, built upon the Qwen2 architecture. It features significant improvements in knowledge, coding, and mathematics, alongside enhanced instruction following and long text generation of over 8K tokens. The model supports a 131,072-token context length and offers robust multilingual capabilities across 29 languages, making it suitable for complex, diverse language tasks.


Qwen2.5-72B-Instruct: Enhanced Multilingual LLM

Qwen2.5-72B-Instruct is the instruction-tuned variant of the latest Qwen2.5 series, developed by Qwen. This 72.7 billion parameter causal language model builds upon the Qwen2 architecture, a transformer incorporating RoPE, SwiGLU activation, RMSNorm, and attention QKV bias. It is designed for advanced language understanding and generation tasks.

Key Capabilities and Improvements

  • Expanded Knowledge & Specialized Skills: Significantly improved knowledge base, with greatly enhanced capabilities in coding and mathematics due to specialized expert models.
  • Advanced Instruction Following: Demonstrates significant improvements in instruction following, generating long texts (over 8K tokens), and understanding structured data like tables.
  • Robust Output Generation: Excels at generating structured outputs, especially JSON, and is more resilient to diverse system prompts, enhancing role-play and chatbot condition-setting.
  • Extended Context & Multilingual Support: Features a full context length of 131,072 tokens (with generation up to 8,192 tokens) and supports over 29 languages, including Chinese, English, French, Spanish, and Japanese.
  • YaRN Integration: Utilizes YaRN for handling long texts beyond 32,768 tokens, though static YaRN in vLLM may impact performance on shorter texts.
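To enable YaRN for inputs beyond 32,768 tokens, the upstream Qwen2.5 documentation suggests adding a `rope_scaling` entry to the model's `config.json`. A sketch of that fragment (the `factor` of 4.0 extends the 32,768-token base window toward 131,072 tokens):

```json
{
  "rope_scaling": {
    "factor": 4.0,
    "original_max_position_embeddings": 32768,
    "type": "yarn"
  }
}
```

Because frameworks such as vLLM apply this scaling statically regardless of input length, it is best enabled only when long inputs are actually expected.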

Ideal Use Cases

  • Complex Code Generation & Mathematical Reasoning: Leverage its specialized improvements for demanding technical tasks.
  • Long-form Content Creation: Generate extensive and coherent texts, such as articles, reports, or creative writing.
  • Structured Data Processing: Efficiently understand and generate structured outputs like JSON or table-based information.
  • Multilingual Applications: Develop applications requiring robust performance across a wide array of global languages.
  • Advanced Chatbot & Role-play Scenarios: Benefit from enhanced instruction following and resilience to diverse system prompts for sophisticated conversational agents.
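For the chatbot and role-play scenarios above, prompts are normally rendered with `tokenizer.apply_chat_template`; the sketch below builds the same ChatML-style prompt format used by Qwen models by hand, purely to illustrate how system and user turns are framed (the helper name is ours, not part of any library):

```python
# Illustrative sketch of the ChatML prompt convention used by Qwen
# instruct models. In practice, prefer tokenizer.apply_chat_template,
# which produces this format for you.
def build_chatml_prompt(messages):
    """Render a list of {role, content} dicts as a ChatML prompt string."""
    parts = []
    for m in messages:
        # Each turn is wrapped in <|im_start|>role ... <|im_end|> tags.
        parts.append(f"<|im_start|>{m['role']}\n{m['content']}<|im_end|>\n")
    # An open assistant tag cues the model to generate its reply.
    parts.append("<|im_start|>assistant\n")
    return "".join(parts)

messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Summarize this table as JSON."},
]
prompt = build_chatml_prompt(messages)
```

The resilience to diverse system prompts noted above means the `system` turn can carry detailed persona or formatting instructions without degrading instruction following.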
Popular Sampler Settings

Top 3 parameter combinations used by Featherless users for this model (per-config values are shown on the model page). Each config combines the following sampler parameters:

  • temperature
  • top_p
  • top_k
  • frequency_penalty
  • presence_penalty
  • repetition_penalty
  • min_p
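As a sketch of how these samplers map onto an OpenAI-compatible request body: the values below are placeholders, not the actual Featherless top-3 configs, and `top_k`, `repetition_penalty`, and `min_p` are vendor extensions outside the base OpenAI schema (with the official `openai` client they would typically be passed via `extra_body`):

```python
# Hypothetical request payload for an OpenAI-compatible chat endpoint.
# Sampler values are illustrative placeholders only.
payload = {
    "model": "Qwen/Qwen2.5-72B-Instruct",
    "messages": [{"role": "user", "content": "Hello"}],
    # Standard OpenAI-schema samplers:
    "temperature": 0.7,
    "top_p": 0.9,
    "frequency_penalty": 0.0,
    "presence_penalty": 0.0,
    # Vendor-extension samplers (not in the base OpenAI schema):
    "top_k": 40,
    "repetition_penalty": 1.05,
    "min_p": 0.05,
}
```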