Model Overview
Magpie-Align/Llama-3-8B-Magpie-Pro-SFT-300K-v0.1 is an 8-billion-parameter model fine-tuned from Meta's Llama-3-8B base model. It was trained with the Magpie self-synthesis method, which generates high-quality instruction data by prompting an aligned LLM with only the pre-query portion of its chat template, so the model writes the user instruction itself. Supervised fine-tuning (SFT) on a filtered set of 300,000 such instruction-response pairs is enough to reach strong alignment performance without any preference optimization.
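A minimal inference sketch with Hugging Face transformers is shown below; the model id comes from this card, while the prompt and generation settings are illustrative assumptions, and it presumes the tokenizer ships a Llama-3-style chat template:

```python
# Minimal inference sketch (assumes transformers, torch, and accelerate are
# installed and that the tokenizer carries a Llama-3-style chat template).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Magpie-Align/Llama-3-8B-Magpie-Pro-SFT-300K-v0.1"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto"
)

# Build a single-turn chat prompt and generate a response.
messages = [{"role": "user", "content": "Explain supervised fine-tuning in two sentences."}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output = model.generate(input_ids, max_new_tokens=256, do_sample=True, temperature=0.7)
print(tokenizer.decode(output[0][input_ids.shape[-1]:], skip_special_tokens=True))
```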
Key Capabilities & Differentiators
- Self-Synthesized Alignment Data: Uses the Magpie method to build a large-scale, high-quality alignment dataset directly from an aligned LLM, sidestepping the cost and limited coverage of human-labeled datasets (see the sketch after this list).
- Comparable to Llama-3-8B-Instruct: Performs on par with the official Llama-3-8B-Instruct model, even though the latter was trained on more than 10 million examples with SFT followed by feedback learning.
- Strong Benchmark Performance: Demonstrates competitive results on alignment benchmarks:
  - Alpaca Eval 2 (vs. GPT-4-Turbo-1106): 25.08 length-controlled win rate (LC), 29.47 raw win rate (WR)
  - Alpaca Eval 2 (vs. Llama-3-8B-Instruct): 52.12 LC, 53.43 WR
  - Arena Hard: 18.9
- Efficient Alignment: Demonstrates that SFT on high-quality synthetic data alone can surpass models trained with both SFT and preference optimization (e.g., DPO on UltraFeedback) on certain alignment benchmarks.
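To make the self-synthesis idea in the first bullet concrete, here is a simplified sketch, not the authors' released pipeline: an aligned teacher model is given only the pre-query portion of its chat template, so its completion becomes a synthetic user instruction, and a second pass generates the paired response. The teacher id reflects the Llama-3-70B-Instruct model behind the "Pro" datasets; the sampling settings are assumptions.

```python
# Simplified illustration of Magpie-style self-synthesis (a sketch, not the
# authors' exact pipeline). Sampling parameters are assumptions.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

teacher_id = "meta-llama/Meta-Llama-3-70B-Instruct"  # teacher behind the "Pro" data
tokenizer = AutoTokenizer.from_pretrained(teacher_id)
model = AutoModelForCausalLM.from_pretrained(
    teacher_id, torch_dtype=torch.bfloat16, device_map="auto"
)
eot_id = tokenizer.convert_tokens_to_ids("<|eot_id|>")

# Step 1: prompt with only the pre-query template; the model's completion
# of the empty user turn becomes a synthetic instruction.
pre_query = "<|begin_of_text|><|start_header_id|>user<|end_header_id|>\n\n"
inputs = tokenizer(pre_query, return_tensors="pt", add_special_tokens=False).to(model.device)
out = model.generate(
    **inputs, max_new_tokens=128, do_sample=True, temperature=1.0, eos_token_id=eot_id
)
instruction = tokenizer.decode(
    out[0][inputs.input_ids.shape[-1]:], skip_special_tokens=True
).strip()

# Step 2: answer the synthesized instruction with ordinary chat inference.
chat_ids = tokenizer.apply_chat_template(
    [{"role": "user", "content": instruction}],
    add_generation_prompt=True, return_tensors="pt",
).to(model.device)
out = model.generate(chat_ids, max_new_tokens=512, do_sample=True, temperature=0.7)
response = tokenizer.decode(out[0][chat_ids.shape[-1]:], skip_special_tokens=True)

print({"instruction": instruction, "response": response})
```

In the full pipeline, low-quality pairs are filtered before fine-tuning, which is how the 300K subset used for this model was obtained.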
Use Cases
This model is well suited to applications that need reliable instruction following and general conversational AI, particularly where performance comparable to Llama-3-8B-Instruct is desired but alignment must stay lightweight and SFT-only. Developers can build on its alignment quality for a range of downstream tasks.