Overview

Llama-3.1-Storm-8B is an 8-billion-parameter instruction-tuned language model developed by Ashvini Kumar Jindal and team, building on Meta AI's Llama-3.1-8B-Instruct. It significantly outperforms both its base model and Hermes-3-Llama-3.1-8B across a range of benchmarks. Development followed three steps: (1) self-curation of approximately 1 million high-quality examples, selected for educational value and difficulty; (2) Spectrum-based targeted fine-tuning, with 50% of layers frozen; and (3) model merging with Llama-Spark using the SLERP method.
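
The SLERP (spherical linear interpolation) step mentioned above can be illustrated with a minimal sketch. This is not the authors' merging pipeline: the function name `slerp`, the interpolation factor `t`, and the idea of applying it to one flattened weight vector at a time are assumptions made for illustration, and the source does not state which merge ratio or tooling was used.

```python
import math


def slerp(a, b, t, eps=1e-8):
    """Spherically interpolate between two equal-length weight vectors.

    t = 0 returns a, t = 1 returns b. Illustrative only: the merge ratio
    and per-tensor handling of the actual model merge are not given in
    the source.
    """
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    # Degenerate case: a zero vector has no direction, so fall back to
    # plain linear interpolation.
    if norm_a < eps or norm_b < eps:
        return [(1 - t) * x + t * y for x, y in zip(a, b)]

    # Angle between the two vectors, clamped for numerical safety.
    cos_omega = sum(x * y for x, y in zip(a, b)) / (norm_a * norm_b)
    cos_omega = max(-1.0, min(1.0, cos_omega))
    omega = math.acos(cos_omega)
    sin_omega = math.sin(omega)

    # Nearly parallel (or anti-parallel) vectors: sin(omega) is ~0, so
    # the SLERP weights are ill-defined; use linear interpolation.
    if abs(sin_omega) < eps:
        return [(1 - t) * x + t * y for x, y in zip(a, b)]

    weight_a = math.sin((1 - t) * omega) / sin_omega
    weight_b = math.sin(t * omega) / sin_omega
    return [weight_a * x + weight_b * y for x, y in zip(a, b)]


# Toy example with two orthogonal 2-D "weight vectors".
merged = slerp([1.0, 0.0], [0.0, 1.0], 0.5)
```

Unlike plain averaging, SLERP follows the arc between the two vectors, so the midpoint of two unit vectors is again a unit vector rather than a shorter one.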

Key Capabilities

Improved Instruction Following: Improves by 3.93% on IFEval Strict over Meta-Llama-3.1-8B-Instruct.
Enhanced Knowledge-Driven QA: Improves by 7.21% on GPQA and 0.55% on MMLU-Pro.
Better Reasoning: Improves by 3.92% on ARC-C and 1.67% on BBH.
Superior Agentic Capabilities: Improves overall accuracy on BFCL (function calling) by 7.92%.
Reduced Hallucinations: Improves by 9% on TruthfulQA.

Good For

Generalist Applications: Suitable for diverse tasks requiring strong conversational and reasoning abilities.
Function Calling: Outperforms Meta-Llama-3.1-8B-Instruct on function calling, with +7.92% overall accuracy on BFCL.
Resource-Constrained Environments: Designed to provide high performance within the 8B parameter class, beneficial for developers with limited computational resources.
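
For the function calling use case listed above, an application typically has to parse the model's tool-call output and route it to real code. The following is a rough sketch of that application-side step only. The JSON shape (`name` and `arguments` keys), the `get_weather` tool, and the `dispatch_tool_call` helper are all hypothetical; the actual output format depends on the prompt template used with the model and is not specified here.

```python
import json


def get_weather(city: str) -> str:
    """Hypothetical tool; a real one would call a weather service."""
    return f"Weather lookup requested for {city}"


# Registry mapping tool names to Python callables (hypothetical).
TOOLS = {"get_weather": get_weather}


def dispatch_tool_call(raw: str):
    """Parse a JSON tool call and invoke the matching registered tool.

    Assumes the model emitted a JSON object of the form
    {"name": ..., "arguments": {...}}; this shape is an assumption.
    """
    call = json.loads(raw)
    name = call["name"]
    if name not in TOOLS:
        raise KeyError(f"Unknown tool: {name}")
    return TOOLS[name](**call.get("arguments", {}))


# Stand-in for text generated by the model.
example_output = '{"name": "get_weather", "arguments": {"city": "Paris"}}'
result = dispatch_tool_call(example_output)
```

Keeping an explicit registry means the model can only trigger functions the application has deliberately exposed, and unknown tool names fail loudly instead of being silently ignored.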