Outlier-Ai/Outlier-10B-V2
Outlier-Ai/Outlier-10B-V2 is a 30.4-billion-parameter (13.3B active) ternary Mixture-of-Experts (MoE) model built on a frozen Qwen2.5-7B-Instruct base. This version is superseded and is retained as an archival checkpoint for reproducibility of earlier benchmark runs; current research and development is focused on its successor, Outlier-10B (V3.3).
Outlier-10B-V2: An Archival MoE Checkpoint
Outlier-10B-V2 is a superseded model from Outlier-Ai, maintained primarily for reproducibility of historical benchmark runs and research. It is an earlier iteration of Outlier-Ai's Mixture-of-Experts (MoE) architecture: a ternary MoE overlay on a frozen Qwen2.5-7B-Instruct base.
Key Characteristics
- Architecture: Ternary Mixture-of-Experts (MoE) overlay.
- Base Model: Utilizes a frozen Qwen2.5-7B-Instruct as its foundation.
- Scale: Features a total of 30.4 billion parameters, with 13.3 billion parameters active during inference.
- Status: Classified as "archival"; it is explicitly recommended not to use this version for new development or research.
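To make "ternary" concrete: ternary quantization stores each weight as one of {-1, 0, +1} plus a floating-point scale. The sketch below uses the common absmean recipe from ternary-LLM literature; the exact scheme used inside Outlier-10B-V2 is not documented here, so treat this as an illustrative assumption, not the model's actual code.

```python
import numpy as np

def ternary_quantize(w: np.ndarray) -> tuple[np.ndarray, float]:
    """Quantize a weight tensor to {-1, 0, +1} with a single absmean scale.

    Illustrative only: the absmean scale is a common choice in ternary-LLM
    work, not a confirmed detail of Outlier-10B-V2.
    """
    scale = float(np.abs(w).mean())              # per-tensor scale
    q = np.clip(np.round(w / (scale + 1e-8)), -1, 1)
    return q.astype(np.int8), scale

def dequantize(q: np.ndarray, scale: float) -> np.ndarray:
    """Recover an approximate float tensor from the ternary codes."""
    return q.astype(np.float32) * scale

# Toy example on a small random weight matrix.
rng = np.random.default_rng(0)
w = rng.normal(0.0, 0.02, size=(4, 8)).astype(np.float32)
q, s = ternary_quantize(w)
w_hat = dequantize(q, s)
```

Each weight then needs under two bits of storage plus a shared scale, which is what makes a ternary overlay on a frozen base comparatively cheap to distribute.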
Evolution and Successor
This V2 architecture has been retired in favor of subsequent developments. The successor, Outlier-Ai/Outlier-10B (V3.3), introduced significant architectural changes, including per-expert per-channel scales and ternary TQ1_0 packing. A V3.3 alpha-fix overlay further improved MMLU performance by +1.61 percentage points with a minimal 15 KB addition.
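The motivation for per-channel scales (as in V3.3) over a single per-tensor scale can be shown with a small experiment: when different output channels have very different magnitudes, one shared scale fits all of them poorly. The comparison below is a hedged sketch of that general effect, not a reproduction of Outlier-10B's actual quantizer.

```python
import numpy as np

def ternary_per_tensor(w: np.ndarray) -> np.ndarray:
    """Ternarize with one scale for the whole tensor, then dequantize."""
    s = np.abs(w).mean()
    return np.clip(np.round(w / (s + 1e-8)), -1, 1) * s

def ternary_per_channel(w: np.ndarray) -> np.ndarray:
    """Ternarize with one scale per output channel (row), then dequantize."""
    s = np.abs(w).mean(axis=1, keepdims=True)
    return np.clip(np.round(w / (s + 1e-8)), -1, 1) * s

# Rows with very different magnitudes: a single scale cannot serve both
# the small-magnitude and large-magnitude channels well.
rng = np.random.default_rng(1)
w = np.concatenate([rng.normal(0, 0.002, (4, 64)),
                    rng.normal(0, 0.2, (4, 64))]).astype(np.float32)

err_tensor = np.abs(w - ternary_per_tensor(w)).mean()
err_channel = np.abs(w - ternary_per_channel(w)).mean()
```

On this synthetic tensor the per-channel variant gives a lower mean reconstruction error, since each row's scale matches its own magnitude; extending this to one scale set per expert is the natural MoE analogue.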
Purpose of Retention
Despite being superseded, Outlier-10B-V2 remains publicly available in keeping with ML research norms, so that external benchmarks and academic papers citing this specific URL retain reproducibility. It also serves as a historical record of the model line's development.