Overview
Llama-3.1-Storm-8B is an 8-billion-parameter instruction-tuned language model developed by Ashvini Kumar Jindal and team, building on Meta AI's Llama-3.1-8B-Instruct. It significantly outperforms both its base model and Hermes-3-Llama-3.1-8B across a range of benchmarks. Development followed a three-step approach: self-curation of approximately 1 million high-quality examples selected for educational value and difficulty, Spectrum-based targeted fine-tuning with 50% of the layers frozen, and model merging with Llama-Spark using the SLERP method.
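For intuition about the merging step, below is a minimal sketch of SLERP (spherical linear interpolation) applied to flattened per-layer weight tensors. It illustrates the math only; the actual merge was done with dedicated merging tooling, and the tensor shapes and the t = 0.5 blend factor here are assumptions, not the team's configuration.

```python
import numpy as np

def slerp(t: float, w0: np.ndarray, w1: np.ndarray, eps: float = 1e-8) -> np.ndarray:
    """Spherical linear interpolation between two flattened weight tensors."""
    v0 = w0 / (np.linalg.norm(w0) + eps)        # unit direction of the first parent
    v1 = w1 / (np.linalg.norm(w1) + eps)        # unit direction of the second parent
    dot = np.clip(np.dot(v0, v1), -1.0, 1.0)    # cosine of the angle between them
    omega = np.arccos(dot)
    so = np.sin(omega)
    # When the parents are nearly parallel, sin(omega) ~ 0 and SLERP becomes
    # numerically unstable, so fall back to plain linear interpolation.
    if abs(so) < eps:
        return (1.0 - t) * w0 + t * w1
    return (np.sin((1.0 - t) * omega) / so) * w0 + (np.sin(t * omega) / so) * w1

# Example: merging one layer's flattened weights with an equal blend (t = 0.5).
rng = np.random.default_rng(0)
layer_a = rng.standard_normal(4096)   # stand-in for a Storm fine-tuned layer
layer_b = rng.standard_normal(4096)   # stand-in for the matching Llama-Spark layer
merged = slerp(0.5, layer_a, layer_b)
print(merged.shape)
```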
Key Capabilities
- Improved Instruction Following: Achieves +3.93% on IFEval Strict over Meta-Llama-3.1-8B-Instruct.
- Enhanced Knowledge-Driven QA: Shows gains of +7.21% on GPQA and +0.55% on MMLU-Pro.
- Better Reasoning: Improves by +3.92% on ARC-C and +1.67% on BBH.
- Superior Agentic Capabilities: Gains +7.92% in overall accuracy on the BFCL function-calling benchmark.
- Reduced Hallucinations: Improves TruthfulQA by +9%.
Good For
- Generalist Applications: Suited to diverse tasks requiring strong conversational and reasoning abilities; a basic inference sketch follows this list.
- Function Calling: Offers strong function-calling capabilities, outperforming Meta-Llama-3.1-8B-Instruct.
- Resource-Constrained Environments: Designed to provide high performance within the 8B parameter class, beneficial for developers with limited computational resources.
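As a starting point, the sketch below shows one way to load the model from the Hugging Face Hub and run a chat-style generation with the transformers library. The repository id, dtype and device settings, and generation parameters are illustrative assumptions; consult the model card for the authors' recommended usage, including their function-calling prompt format.

```python
# A minimal inference sketch (assumed settings, not the authors' recommended ones).
import torch
from transformers import pipeline

pipe = pipeline(
    "text-generation",
    model="akjindal53244/Llama-3.1-Storm-8B",  # assumed Hugging Face repo id
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Summarize the three-step recipe behind Llama-3.1-Storm-8B."},
]

# Recent transformers versions accept chat messages directly and apply the
# model's chat template before generation.
outputs = pipe(messages, max_new_tokens=256, do_sample=False)
print(outputs[0]["generated_text"][-1]["content"])
```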