Triangle104/DS-R1-Distill-Q2.5-14B-Harmony_V0.1 is a 14-billion-parameter language model created by Triangle104 by merging two DeepSeek-R1-Distill-Qwen-14B variants with the SLERP method. The merge aims to balance the strengths of its parent models across general language tasks, and its 32768-token context length suits applications that need substantial context for understanding and generation.
Model Overview
Triangle104/DS-R1-Distill-Q2.5-14B-Harmony_V0.1 is a 14-billion-parameter language model developed by Triangle104. It was created with the SLERP merge method by combining two pre-trained models: huihui-ai/DeepSeek-R1-Distill-Qwen-14B-abliterated-v2 and deepseek-ai/DeepSeek-R1-Distill-Qwen-14B. Merging aims to synthesize the strengths of both constituent models in a single set of weights.
Key Characteristics
- Architecture: A merged model based on the DeepSeek-R1-Distill-Qwen-14B family.
- Parameter Count: 14 billion parameters.
- Context Length: Supports a substantial context window of 32768 tokens.
- Merge Method: Utilizes the SLERP (Spherical Linear Interpolation) method for combining model weights, with specific interpolation parameters defined in the configuration.
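To make the merge method concrete, here is a minimal sketch of spherical linear interpolation on a pair of weight vectors. Note this is an illustration only: the actual merge (e.g. via mergekit) applies SLERP per tensor, with interpolation factors taken from the merge configuration rather than the fixed `t` used here.

```python
import math

def slerp(v0, v1, t):
    """Spherical linear interpolation between two weight vectors.

    Interpolates along the arc between v0 and v1, which preserves
    the geometric character of each model's weights better than
    plain linear averaging.
    """
    # Measure the angle between the (normalised) vectors.
    n0 = math.sqrt(sum(x * x for x in v0))
    n1 = math.sqrt(sum(x * x for x in v1))
    dot = sum(a * b for a, b in zip(v0, v1)) / (n0 * n1)
    dot = max(-1.0, min(1.0, dot))  # guard against rounding error
    theta = math.acos(dot)
    if theta < 1e-6:
        # Nearly parallel vectors: fall back to linear interpolation.
        return [(1 - t) * a + t * b for a, b in zip(v0, v1)]
    s0 = math.sin((1 - t) * theta) / math.sin(theta)
    s1 = math.sin(t * theta) / math.sin(theta)
    return [s0 * a + s1 * b for a, b in zip(v0, v1)]

# t = 0 returns the first model's weights, t = 1 the second's;
# t = 0.5 blends them along the arc between the two.
print(slerp([1.0, 0.0], [0.0, 1.0], 0.5))
```

Unlike a straight weighted average, SLERP keeps the interpolated weights on the arc between the two endpoints, which is why it is a popular choice for merging models that share an architecture.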
Performance Insights
Evaluations on the Open LLM Leaderboard indicate a balanced performance profile for this model. Key metrics include:
- Average Score: 35.74
- IFEval (0-shot): 45.15
- BBH (3-shot): 38.72
- MATH Lvl 5 (4-shot): 39.50
- GPQA (0-shot): 19.13
- MuSR (0-shot): 31.92
- MMLU-PRO (5-shot): 40.01
These scores indicate balanced capability across reasoning, comprehension, and knowledge-based tasks, making the model a versatile option for general-purpose language generation and understanding.
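The headline figure is simply the arithmetic mean of the six benchmark scores listed above, which is straightforward to verify:

```python
# Benchmark scores from the Open LLM Leaderboard entry above.
scores = {
    "IFEval": 45.15,
    "BBH": 38.72,
    "MATH Lvl 5": 39.50,
    "GPQA": 19.13,
    "MuSR": 31.92,
    "MMLU-PRO": 40.01,
}

# The reported average is the unweighted mean of the six benchmarks.
average = sum(scores.values()) / len(scores)
print(round(average, 2))  # 35.74
```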