arcee-ai/SuperNova-Medius

Parameters: 14.8B
Quantization: FP8
Context length: 131,072 tokens
License: apache-2.0
Overview

Arcee-SuperNova-Medius: A Distilled 14B Powerhouse

Arcee-SuperNova-Medius is a 14.8-billion-parameter language model from Arcee.ai, built on the Qwen2.5-14B-Instruct architecture. Its strength comes from a multi-teacher, cross-architecture distillation process that integrates knowledge from both Qwen2.5-72B-Instruct and Llama-3.1-405B-Instruct. The result is high-quality instruction following and complex reasoning in a mid-sized, resource-efficient package.
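
For quick experimentation, the model can be loaded like any other Hugging Face causal language model. The snippet below is a minimal sketch using the transformers library; the prompt and generation settings are illustrative, not tuned recommendations.

```python
# Minimal inference sketch for arcee-ai/SuperNova-Medius (requires transformers + accelerate).
# Prompt and sampling parameters are illustrative only.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "arcee-ai/SuperNova-Medius"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype="auto",   # load in the checkpoint's native precision
    device_map="auto",    # spread layers across available GPUs
)

messages = [
    {"role": "user", "content": "Explain the trade-offs of model distillation in two sentences."},
]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output_ids = model.generate(input_ids, max_new_tokens=256, do_sample=True, temperature=0.7)
print(tokenizer.decode(output_ids[0][input_ids.shape[-1]:], skip_special_tokens=True))
```

At 14.8B parameters the bf16 weights occupy roughly 30 GB, so a single 40-48 GB GPU is typically sufficient; FP8 or other quantization reduces the footprint further.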

Key Capabilities & Features

  • Cross-Architecture Distillation: Combines the strengths of the Qwen and Llama architectures through logit distillation and vocabulary adaptation (a schematic of the distillation loss is sketched after this list).
  • Enhanced Reasoning: Excels at complex reasoning (BBH) and instruction following (IFEval), outperforming Qwen2.5-14B and SuperNova-Lite on these benchmarks.
  • Resource-Efficient: Delivers capabilities approaching those of much larger models while remaining practical to deploy on modest hardware.
  • Specialized Fine-Tuning: Fine-tuned on a custom instruction dataset generated with EvolKit to improve coherence, fluency, and contextual understanding.
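
Arcee's actual distillation pipeline is not reproduced in this card. The snippet below is only a generic schematic of temperature-scaled logit distillation over a teacher and student whose vocabularies have already been aligned; the tensor shapes and hyperparameters are hypothetical.

```python
# Generic schematic of logit (KL-divergence) distillation -- an illustration,
# not Arcee's actual pipeline. Shapes and hyperparameters are hypothetical.
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    """Soften both distributions with a temperature, then pull the student
    toward the teacher via KL divergence (scaled by T^2, the usual convention)."""
    t = temperature
    student_log_probs = F.log_softmax(student_logits / t, dim=-1)
    teacher_probs = F.softmax(teacher_logits / t, dim=-1)
    return F.kl_div(student_log_probs, teacher_probs, reduction="batchmean") * (t * t)

# Toy usage: a batch of 4 token positions over a shared, vocabulary-aligned vocab.
student_logits = torch.randn(4, 32000)
teacher_logits = torch.randn(4, 32000)
print(distillation_loss(student_logits, teacher_logits).item())
```

Cross-architecture distillation adds one extra step this sketch assumes away: the teacher and student tokenizers must first be mapped onto a common vocabulary so their logit dimensions line up.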

Ideal Use Cases

  • Customer Support: Handles complex customer interactions with robust instruction-following.
  • Content Creation: Generates high-quality, coherent content across various domains.
  • Technical Assistance: Provides support for programming, technical documentation, and expert-level content.