Qwen/Qwen2.5-14B-Instruct

Hugging Face
TEXT GENERATIONConcurrency Cost:1Model Size:14.8BQuant:FP8Ctx Length:32kPublished:Sep 16, 2024License:apache-2.0Architecture:Transformer0.3K Open Weights Warm

Qwen2.5-14B-Instruct is a 14.7 billion parameter instruction-tuned causal language model developed by Qwen, featuring a 131,072 token context length. This model significantly improves capabilities in coding, mathematics, instruction following, and long text generation, building upon the Qwen2 series. It excels at understanding structured data and generating structured outputs like JSON, making it suitable for complex conversational AI and data processing tasks.

Loading preview...

Qwen2.5-14B-Instruct Overview

Qwen2.5-14B-Instruct is a 14.7 billion parameter instruction-tuned causal language model from the Qwen2.5 series, developed by Qwen. It features a transformer architecture with RoPE, SwiGLU, RMSNorm, and Attention QKV bias, supporting a full context length of 131,072 tokens and generating up to 8,192 tokens. This model represents a significant advancement over Qwen2, incorporating specialized expert models to enhance its capabilities.

Key Capabilities & Improvements

  • Enhanced Knowledge & Reasoning: Significantly improved performance in coding and mathematics due to specialized expert models.
  • Instruction Following: Demonstrates substantial improvements in adhering to instructions and generating structured outputs, including JSON.
  • Long Text Handling: Excels at generating long texts (over 8K tokens) and understanding structured data like tables.
  • Robustness: More resilient to diverse system prompts, improving role-play and chatbot condition-setting.
  • Multilingual Support: Offers support for over 29 languages, including major global languages like Chinese, English, French, Spanish, and Japanese.
  • Long-Context Processing: Utilizes YaRN for handling extensive inputs up to 131,072 tokens, with specific configuration options for deployment.

Use Cases

This model is particularly well-suited for applications requiring:

  • Advanced code generation and mathematical problem-solving.
  • Complex instruction following and structured data output.
  • Long-form content generation and summarization.
  • Multilingual conversational agents and data processing.
  • Chatbots requiring robust role-play and condition-setting capabilities.

Popular Sampler Settings

Top 3 parameter combinations used by Featherless users for this model. Click a tab to see each config.

temperature
top_p
top_k
frequency_penalty
presence_penalty
repetition_penalty
min_p