dhrubas2905/dhrubs-Qwen2.5-14B-Instruct-private
dhrubas2905/dhrubs-Qwen2.5-14B-Instruct-private is a 14.7 billion parameter instruction-tuned causal language model from the Qwen2.5 series, developed by Qwen. This model significantly enhances capabilities in coding, mathematics, and instruction following, while also improving long text generation and structured data understanding. It supports a full context length of 131,072 tokens and multilingual applications across over 29 languages, making it suitable for diverse and complex generative AI tasks.
Qwen2.5-14B-Instruct Overview
This repository hosts the instruction-tuned 14.7-billion-parameter model from the Qwen2.5 series, developed by the Qwen team. Qwen2.5 builds on its predecessor, Qwen2, with substantial improvements across several key areas. Architecturally, it is a transformer using RoPE positional embeddings, SwiGLU activations, RMSNorm, and attention QKV bias.
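As an instruction-tuned chat model, it consumes conversations in the ChatML format (`<|im_start|>` / `<|im_end|>` delimited turns). A minimal sketch of how a message list maps to a prompt string, for illustration only; in practice, let `tokenizer.apply_chat_template` render the prompt:

```python
def to_chatml(messages):
    """Render a list of {role, content} dicts as a ChatML prompt string.

    Illustrative sketch of the format only; real usage should rely on
    tokenizer.apply_chat_template(..., add_generation_prompt=True).
    """
    parts = [f"<|im_start|>{m['role']}\n{m['content']}<|im_end|>\n" for m in messages]
    # Open an assistant turn so the model generates the reply next.
    parts.append("<|im_start|>assistant\n")
    return "".join(parts)

messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Write a haiku about autumn."},
]
prompt = to_chatml(messages)
```

The trailing open `<|im_start|>assistant` turn is what cues the model to produce an answer rather than continue the user's text.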
Key Capabilities & Improvements
- Enhanced Knowledge & Reasoning: Significantly improved performance in coding and mathematics, leveraging specialized expert models.
- Instruction Following: Demonstrates stronger instruction adherence and resilience to diverse system prompts, beneficial for role-play and chatbot implementations.
- Long Text & Structured Data: Better at generating long texts (up to 8K tokens) and understanding structured data like tables, including generating structured outputs such as JSON.
- Extended Context Length: Supports a context length of up to 131,072 tokens, with generation up to 8,192 tokens. Inputs longer than 32,768 tokens require enabling YaRN for length extrapolation; note that static YaRN (as used by frameworks such as vLLM) applies a fixed scaling factor regardless of input length and may degrade performance on shorter texts.
- Multilingual Support: Offers robust support for over 29 languages, including major global languages like Chinese, English, French, Spanish, German, Japanese, and Korean.
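To enable YaRN for inputs beyond 32,768 tokens, the Qwen2.5 model cards suggest adding a `rope_scaling` entry to the model's `config.json`; a sketch of that fragment (the factor of 4.0 extrapolates the 32,768-token base window up to 131,072 tokens):

```json
{
  "rope_scaling": {
    "factor": 4.0,
    "original_max_position_embeddings": 32768,
    "type": "yarn"
  }
}
```

Because this static setting is applied to all requests, consider enabling it only for workloads that actually need long inputs.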
Good For
- Applications requiring strong coding and mathematical reasoning.
- Chatbots and agents needing precise instruction following and role-play capabilities.
- Tasks involving long document processing or generation.
- Generating structured outputs like JSON from natural language prompts.
- Multilingual applications across a broad spectrum of languages.
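For the structured-output use case, model replies often wrap the JSON in prose or a code fence. A minimal, hypothetical post-processing helper (not part of this repository) that extracts and validates the first JSON object from a reply:

```python
import json
import re

def extract_json(reply: str) -> dict:
    """Pull the first JSON object out of a model reply, tolerating
    surrounding prose or a fenced code block (hypothetical helper)."""
    match = re.search(r"\{.*\}", reply, flags=re.DOTALL)
    if match is None:
        raise ValueError("no JSON object found in reply")
    return json.loads(match.group(0))

# Typical model reply: JSON wrapped in prose and a markdown fence.
reply = 'Sure! Here is the record:\n```json\n{"name": "Ada", "age": 36}\n```'
record = extract_json(reply)
```

Validating the parse (and retrying the prompt on a `ValueError` or `json.JSONDecodeError`) is a common guard, since even instruction-tuned models occasionally emit malformed JSON.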