dizza01/Qwen2.5-14B-Instruct
dizza01/Qwen2.5-14B-Instruct is a 14.7 billion parameter instruction-tuned causal language model from the Qwen2.5 series, developed by Qwen. It features significant improvements in coding, mathematics, instruction following, and long text generation up to 8K tokens, with a context length of 131,072 tokens. This model is designed for enhanced performance in structured data understanding, JSON output generation, and multilingual applications across 29 languages.
Loading preview...
Qwen2.5-14B-Instruct Overview
Qwen2.5-14B-Instruct is a 14.7 billion parameter instruction-tuned causal language model, part of the Qwen2.5 series developed by Qwen. This model builds upon its predecessors with substantial enhancements across several key areas. It incorporates specialized expert models to significantly boost its capabilities in coding and mathematics.
Key Capabilities & Improvements
- Enhanced Instruction Following: Demonstrates improved ability to follow complex instructions and adapt to diverse system prompts, benefiting role-play and chatbot implementations.
- Long Text Generation: Excels at generating extended texts, supporting outputs over 8,000 tokens.
- Structured Data & Output: Shows significant improvements in understanding structured data, such as tables, and generating structured outputs, particularly JSON.
- Extended Context Length: Supports a full context length of 131,072 tokens, with generation capabilities up to 8,192 tokens. It utilizes YaRN for handling extensive inputs, though static YaRN in vLLM may impact performance on shorter texts.
- Multilingual Support: Offers robust support for over 29 languages, including major global languages like Chinese, English, French, Spanish, German, and Japanese.
Good For
- Applications requiring strong coding assistance and mathematical problem-solving.
- Scenarios demanding precise instruction following and structured output generation (e.g., JSON).
- Tasks involving long-form content generation and processing extensive textual contexts.
- Multilingual applications needing broad language coverage.