unsloth/Llama-3.1-Storm-8B

Hugging Face
TEXT GENERATION
Concurrency Cost: 1 · Model Size: 8B · Quant: FP8 · Ctx Length: 32k · Published: Sep 2, 2024 · License: llama3.1 · Architecture: Transformer

Llama-3.1-Storm-8B is an 8 billion parameter instruction-tuned language model developed by Ashvini Kumar Jindal and team, built upon Meta AI's Llama-3.1-8B-Instruct. This model significantly outperforms its base model and Hermes-3-Llama-3.1-8B across diverse benchmarks, including instruction-following, knowledge-driven QA, reasoning, and function calling. It is optimized for generalist applications, offering enhanced conversational and agentic capabilities for developers with limited computational resources.


Overview

Llama-3.1-Storm-8B is an 8 billion parameter instruction-tuned language model developed by Ashvini Kumar Jindal and team, building on Meta AI's Llama-3.1-8B-Instruct. It significantly outperforms its base model and Hermes-3-Llama-3.1-8B across various benchmarks. The model's development involved a three-step approach: self-curation of approximately 1 million high-quality examples focusing on educational value and difficulty, Spectrum-based targeted fine-tuning where 50% of layers were frozen, and model merging with Llama-Spark using the SLERP method.
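The SLERP merge step mentioned above can be sketched as follows. This is a minimal illustration of spherical linear interpolation on flattened weight vectors, assuming NumPy arrays; it is not the actual merge configuration the authors used, and the `slerp` helper is written here for illustration only.

```python
import numpy as np

def slerp(t, v0, v1, eps=1e-8):
    """Spherically interpolate between two flattened weight tensors.

    t=0 returns v0, t=1 returns v1; intermediate values follow the
    great-circle arc between the two (normalized) directions.
    """
    v0_n = v0 / (np.linalg.norm(v0) + eps)
    v1_n = v1 / (np.linalg.norm(v1) + eps)
    dot = np.clip(np.dot(v0_n, v1_n), -1.0, 1.0)
    theta = np.arccos(dot)
    if theta < eps:
        # Nearly parallel weights: fall back to plain linear interpolation.
        return (1 - t) * v0 + t * v1
    s0 = np.sin((1 - t) * theta) / np.sin(theta)
    s1 = np.sin(t * theta) / np.sin(theta)
    return s0 * v0 + s1 * v1
```

In a real merge this interpolation is applied per tensor (often with different `t` schedules for attention and MLP layers), rather than to a single flattened vector.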

Key Capabilities

  • Improved Instruction Following: Achieves +3.93% on IFEval Strict over Meta-Llama-3.1-8B-Instruct.
  • Enhanced Knowledge-Driven QA: Shows gains of +7.21% on GPQA and +0.55% on MMLU-Pro.
  • Better Reasoning: Improves by +3.92% on ARC-C and +1.67% on BBH.
  • Superior Agentic Capabilities: Demonstrates +7.92% overall accuracy on BFCL for function calling.
  • Reduced Hallucinations: Achieves +9% on TruthfulQA.

Good For

  • Generalist Applications: Suitable for diverse tasks requiring strong conversational and reasoning abilities.
  • Function Calling: Offers impressive function calling capabilities, outperforming Meta-Llama-3.1-8B-Instruct.
  • Resource-Constrained Environments: Designed to provide high performance within the 8B parameter class, beneficial for developers with limited computational resources.
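As a hedged sketch of the function-calling use case above: the payload below assembles an OpenAI-style chat request with a tool definition. The `get_weather` tool and the `build_request` helper are hypothetical names invented for this example; consult your serving endpoint's documentation for the exact schema it expects.

```python
import json

# Hypothetical OpenAI-style tool schema; the tool name and fields are
# illustrative only, not part of the model card.
weather_tool = {
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Get the current weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}

def build_request(user_message, tools):
    """Assemble a chat-completion style payload with tool definitions."""
    return {
        "model": "unsloth/Llama-3.1-Storm-8B",
        "messages": [
            {"role": "system", "content": "You are a helpful assistant with tool access."},
            {"role": "user", "content": user_message},
        ],
        "tools": tools,
    }

payload = build_request("What's the weather in Paris?", [weather_tool])
print(json.dumps(payload, indent=2))
```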

Popular Sampler Settings

The three most popular parameter combinations used by Featherless users for this model tune the following samplers:

  • temperature
  • top_p
  • top_k
  • frequency_penalty
  • presence_penalty
  • repetition_penalty
  • min_p
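These parameters map onto a generation config dict like the one below. The values shown are common community defaults chosen for illustration; they are assumptions, not the actual top Featherless configurations.

```python
# Illustrative sampler settings; values are placeholder defaults, not the
# community configs referenced above.
sampler_config = {
    "temperature": 0.7,
    "top_p": 0.9,
    "top_k": 40,
    "frequency_penalty": 0.0,
    "presence_penalty": 0.0,
    "repetition_penalty": 1.1,
    "min_p": 0.05,
}
```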