Triangle104/DS-R1-Distill-Q2.5-14B-Harmony_V0.1
  • Task: Text Generation
  • Concurrency Cost: 1
  • Model Size: 14B
  • Quant: FP8
  • Ctx Length: 32k
  • Published: Jan 28, 2025
  • License: apache-2.0
  • Architecture: Transformer
  • Open Weights
  • Warm

Triangle104/DS-R1-Distill-Q2.5-14B-Harmony_V0.1 is a 14-billion-parameter language model created by Triangle104 by merging two DeepSeek-R1-Distill-Qwen-14B variants with the SLERP method. It targets general language tasks and supports a context length of 32,768 tokens.

Model Overview

Triangle104/DS-R1-Distill-Q2.5-14B-Harmony_V0.1 is a 14-billion-parameter language model developed by Triangle104. It was created with the SLERP merge method by combining two pre-trained models: huihui-ai/DeepSeek-R1-Distill-Qwen-14B-abliterated-v2 and deepseek-ai/DeepSeek-R1-Distill-Qwen-14B. The merge is intended to combine the strengths of both source models.

Key Characteristics

  • Architecture: A merged model based on the DeepSeek-R1-Distill-Qwen-14B family.
  • Parameter Count: 14 billion parameters.
  • Context Length: 32,768 tokens (32k).
  • Merge Method: Uses SLERP (Spherical Linear Interpolation) to combine model weights, with the interpolation parameters set in the merge configuration.
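
To make the merge method concrete, here is a minimal sketch of spherical linear interpolation between two weight vectors. It is an illustration under stated assumptions, not the code used to build this model: the function name `slerp`, the factor `t`, and the tolerance `eps` are hypothetical, and the actual interpolation parameters for this merge are not given on this page. Merge tools typically apply the same operation to each weight tensor after flattening it.

```python
import math

def slerp(v0, v1, t, eps=1e-8):
    """Spherically interpolate between two equal-length vectors.

    t = 0 returns v0, t = 1 returns v1. The value of t is an assumption
    here; the real merge configuration is not shown in this document.
    """
    norm0 = math.sqrt(sum(a * a for a in v0))
    norm1 = math.sqrt(sum(b * b for b in v1))
    if norm0 == 0.0 or norm1 == 0.0:
        # A zero vector has no direction, so fall back to linear interpolation.
        return [(1 - t) * a + t * b for a, b in zip(v0, v1)]

    # Angle between the two vectors, clamped to guard against rounding error.
    cos_omega = sum(a * b for a, b in zip(v0, v1)) / (norm0 * norm1)
    cos_omega = max(-1.0, min(1.0, cos_omega))
    omega = math.acos(cos_omega)
    sin_omega = math.sin(omega)

    if sin_omega < eps:
        # Nearly parallel or anti-parallel: the SLERP weights are ill-defined,
        # so use plain linear interpolation instead.
        return [(1 - t) * a + t * b for a, b in zip(v0, v1)]

    w0 = math.sin((1 - t) * omega) / sin_omega
    w1 = math.sin(t * omega) / sin_omega
    return [w0 * a + w1 * b for a, b in zip(v0, v1)]

# Halfway between two orthogonal unit vectors lies on the unit circle.
midpoint = slerp([1.0, 0.0], [0.0, 1.0], 0.5)
```

Unlike a plain weighted average, the midpoint of two orthogonal unit vectors keeps unit length, which is the usual motivation for preferring SLERP over linear interpolation when merging weights.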

Performance Insights

Open LLM Leaderboard evaluations report the following scores for this model:

  • Average Score: 35.74
  • IFEval (0-Shot): 45.15
  • BBH (3-Shot): 38.72
  • MATH Lvl 5 (4-Shot): 39.50
  • GPQA (0-shot): 19.13
  • MuSR (0-shot): 31.92
  • MMLU-PRO (5-shot): 40.01

The benchmarks cover instruction following (IFEval), reasoning (BBH, MuSR), mathematics (MATH Lvl 5), and knowledge (GPQA, MMLU-PRO). Scores range from 19.13 on GPQA to 45.15 on IFEval, with the other four between roughly 32 and 40.
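
As a quick arithmetic check, the listed average is consistent with an unweighted mean of the six benchmark scores. The sketch below assumes that is how the average is computed; the variable names are illustrative.

```python
# Benchmark scores as listed above.
scores = {
    "IFEval (0-Shot)": 45.15,
    "BBH (3-Shot)": 38.72,
    "MATH Lvl 5 (4-Shot)": 39.50,
    "GPQA (0-shot)": 19.13,
    "MuSR (0-shot)": 31.92,
    "MMLU-PRO (5-shot)": 40.01,
}

# Unweighted mean across the six benchmarks.
average = sum(scores.values()) / len(scores)
print(f"{average:.2f}")  # prints 35.74
```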

Popular Sampler Settings

The top 3 parameter combinations used by Featherless users for this model set the following sampler parameters:

  • temperature
  • top_p
  • top_k
  • frequency_penalty
  • presence_penalty
  • repetition_penalty
  • min_p