akjindal53244/Llama-3.1-Storm-8B

TEXT GENERATION · Concurrency Cost: 1 · Model Size: 8B · Quant: FP8 · Ctx Length: 32K · Published: Aug 12, 2024 · License: llama3.1 · Architecture: Transformer

Llama-3.1-Storm-8B is an 8 billion parameter language model developed by Ashvini Kumar Jindal, Pawan Kumar Rajpoot, Ankur Parikh, and Akshita Sukhlecha, built upon Meta AI's Llama-3.1-8B-Instruct with a 32768 token context length. This model significantly outperforms its base and Hermes-3-Llama-3.1-8B across diverse benchmarks, excelling in instruction following, knowledge-driven QA, reasoning, and function calling. It achieves these enhancements through self-curation, Spectrum-based targeted fine-tuning, and SLERP model merging, making it a powerful generalist model for various applications.


Llama-3.1-Storm-8B: Enhanced 8B Generalist Model

Llama-3.1-Storm-8B is an 8 billion parameter model developed by Ashvini Kumar Jindal, Pawan Kumar Rajpoot, Ankur Parikh, and Akshita Sukhlecha. It is built on Meta AI's Llama-3.1-8B-Instruct and features a 32768 token context length. The model demonstrates significant performance improvements over its base model and Hermes-3-Llama-3.1-8B across a range of benchmarks.
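Since the model inherits the Llama 3.1 chat format, a prompt can be assembled with Meta's published special tokens. The sketch below builds the prompt string by hand for illustration; in practice, `tokenizer.apply_chat_template` from Hugging Face `transformers` with the `akjindal53244/Llama-3.1-Storm-8B` tokenizer is the safer route.

```python
def build_prompt(system: str, user: str) -> str:
    """Format a single-turn chat prompt using the Llama 3.1 template tokens."""
    return (
        "<|begin_of_text|>"
        "<|start_header_id|>system<|end_header_id|>\n\n"
        f"{system}<|eot_id|>"
        "<|start_header_id|>user<|end_header_id|>\n\n"
        f"{user}<|eot_id|>"
        # Open the assistant turn so generation continues from here.
        "<|start_header_id|>assistant<|end_header_id|>\n\n"
    )

prompt = build_prompt("You are a helpful assistant.", "What is 2 + 2?")
```

The resulting string can be tokenized and passed to any Llama-3.1-compatible inference stack.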

Key Enhancements & Capabilities

This model's superior performance is attributed to a three-step process:

  • Self-Curation: Approximately 1 million high-quality examples were selected from 2.8 million open-source examples based on their educational value and difficulty level, as annotated by a Small Language Model (SLM).
  • Targeted Fine-tuning: Utilizes the Spectrum method, which accelerates training by selectively targeting 50% of layer modules based on their signal-to-noise ratio (SNR) and freezing the rest.
  • Model Merging: The fine-tuned model was merged with Llama-Spark using the SLERP method, blending characteristics from both parent models.
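The SLERP (spherical linear interpolation) merge in the last step can be sketched as follows. This is a minimal, generic SLERP over flattened weight tensors, not the authors' actual merge pipeline (tools like `mergekit` handle per-layer interpolation factors and edge cases):

```python
import numpy as np

def slerp(t: float, a: np.ndarray, b: np.ndarray, eps: float = 1e-8) -> np.ndarray:
    """Spherically interpolate between weight vectors a and b at fraction t."""
    a_n = a / (np.linalg.norm(a) + eps)
    b_n = b / (np.linalg.norm(b) + eps)
    dot = np.clip(np.dot(a_n, b_n), -1.0, 1.0)
    theta = np.arccos(dot)          # angle between the two weight directions
    if theta < eps:                 # nearly parallel: fall back to linear interp
        return (1.0 - t) * a + t * b
    s = np.sin(theta)
    return (np.sin((1.0 - t) * theta) / s) * a + (np.sin(t * theta) / s) * b

# Toy example: halfway between two orthogonal "weight" vectors.
merged = slerp(0.5, np.array([1.0, 0.0]), np.array([0.0, 1.0]))
```

Unlike plain averaging, SLERP follows the arc between the two parameter directions, which tends to preserve the geometry of both parents' weight spaces.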

Performance Highlights

Llama-3.1-Storm-8B shows notable absolute gains over Meta-Llama-3.1-8B-Instruct:

  • Improved Instruction Following: IFEval Strict (+3.93%)
  • Enhanced Knowledge-Driven QA: GPQA (+7.21%), MMLU-Pro (+0.55%), AGIEval (+3.77%)
  • Better Reasoning: ARC-C (+3.92%), MuSR (+2.77%), BBH (+1.67%), AGIEval (+3.77%)
  • Superior Agentic Capabilities: BFCL Overall Acc (+7.92%), BFCL AST Summary (+12.32%)
  • Reduced Hallucinations: TruthfulQA (+9%)

Use Cases

This model is a powerful generalist, particularly useful for:

  • Applications requiring strong instruction following and reasoning.
  • Knowledge-driven question answering systems.
  • Function calling and agentic tasks, with impressive capabilities demonstrated on the BFCL benchmark.
  • Developers working with limited computational resources who need high performance from an 8B parameter model.
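For the function-calling use case, one common pattern is to list tool specifications in the system prompt and parse a JSON tool call from the reply. The tool spec and response format below are hypothetical illustrations; the exact tool-call format Llama-3.1-Storm-8B was trained on should be taken from its model card.

```python
import json

# Hypothetical tool specification for illustration only.
tools = [{
    "name": "get_weather",
    "description": "Get the current weather for a city",
    "parameters": {"city": {"type": "string"}},
}]

def system_prompt(tools: list) -> str:
    """Embed tool specs in the system prompt and request a JSON call."""
    return (
        "You have access to the following tools:\n"
        + json.dumps(tools, indent=2)
        + '\nTo call a tool, reply with a JSON object: {"name": ..., "arguments": ...}'
    )

def parse_tool_call(reply: str):
    """Return the tool call dict if the reply is a well-formed call, else None."""
    try:
        call = json.loads(reply)
    except json.JSONDecodeError:
        return None
    if isinstance(call, dict) and {"name", "arguments"} <= call.keys():
        return call
    return None
```

The parser deliberately returns `None` for plain-text replies, letting the application fall back to normal chat handling.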

Popular Sampler Settings

The top three parameter combinations used by Featherless users for this model cover the following samplers (the specific values were not captured in this snapshot):

  • temperature, top_p, top_k
  • frequency_penalty, presence_penalty, repetition_penalty
  • min_p
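These parameter names map directly onto `transformers`-style generation settings. The values below are hypothetical placeholders (the actual user configurations are not shown above), included only to show the shape of such a config:

```python
# Hypothetical sampler configuration; values are placeholders, not the
# settings reported by Featherless users.
sampler_config = {
    "temperature": 0.7,        # softens/sharpens the token distribution
    "top_p": 0.9,              # nucleus sampling cutoff
    "top_k": 40,               # keep only the k most likely tokens
    "frequency_penalty": 0.0,  # penalize tokens by how often they appeared
    "presence_penalty": 0.0,   # penalize tokens that appeared at all
    "repetition_penalty": 1.1, # >1.0 discourages verbatim repetition
    "min_p": 0.05,             # drop tokens below this fraction of the top prob
}
```

Such a dict can be passed as keyword arguments to a generation call or used to build a `GenerationConfig`.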