Magpie-Align/Llama-3-8B-Magpie-Air-SFT-300K-v0.1

Public · 8B parameters · FP8 · 8192 context length · License: llama3 · Available on Hugging Face
Overview

Magpie-Align/Llama-3-8B-Magpie-Air-SFT-300K-v0.1 is an 8-billion-parameter language model developed by Magpie-Align. It is a fine-tuned version of Meta's Llama-3-8B base model, trained on instruction data produced with a self-synthesis method called "Magpie": an aligned LLM such as Llama-3-Instruct is prompted to generate both user queries and their responses, and the output is filtered down to a 300K-example dataset for supervised fine-tuning (SFT).
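
As a rough illustration of the self-synthesis idea, the sketch below feeds an aligned model only the pre-query portion of its chat template so that the model completes it with a plausible user query, then answers that query with the same model to form one instruction-response pair. It assumes access to meta-llama/Meta-Llama-3-8B-Instruct via the transformers library; the sampling settings are illustrative, not the exact Magpie pipeline configuration.

```python
# Illustrative sketch of Magpie-style self-synthesis (not the exact pipeline).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

generator_id = "meta-llama/Meta-Llama-3-8B-Instruct"  # assumed generator model
tokenizer = AutoTokenizer.from_pretrained(generator_id)
model = AutoModelForCausalLM.from_pretrained(
    generator_id, torch_dtype=torch.bfloat16, device_map="auto"
)

# Step 1: feed only the pre-query chat template; the aligned model
# completes it with a plausible user query. add_special_tokens=False
# avoids duplicating the <|begin_of_text|> token already in the string.
pre_query = "<|begin_of_text|><|start_header_id|>user<|end_header_id|>\n\n"
inputs = tokenizer(pre_query, return_tensors="pt", add_special_tokens=False).to(model.device)
out = model.generate(**inputs, max_new_tokens=128, do_sample=True, temperature=1.0)
query = tokenizer.decode(out[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True)

# Step 2: answer the synthesized query with the same model, yielding
# the response half of the instruction-response pair.
chat = [{"role": "user", "content": query}]
prompt_ids = tokenizer.apply_chat_template(
    chat, add_generation_prompt=True, return_tensors="pt"
).to(model.device)
out = model.generate(prompt_ids, max_new_tokens=512, do_sample=True, temperature=0.7)
response = tokenizer.decode(out[0][prompt_ids.shape[1]:], skip_special_tokens=True)
print({"instruction": query, "response": response})
```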

Key Capabilities & Performance

  • Instruction Following: Achieves performance comparable to the official Llama-3-8B-Instruct model through SFT alone, without requiring additional preference optimization methods like DPO.
  • Alignment Benchmarks: Demonstrates strong results on key alignment evaluations (LC = length-controlled win rate, WR = raw win rate):
    • AlpacaEval 2 (vs. GPT-4-Turbo-1106): 22.66 LC, 23.99 WR
    • AlpacaEval 2 (vs. Llama-3-8B-Instruct): 49.27 LC, 50.80 WR
    • Arena Hard: 14.9
  • Efficient Alignment: Highlights the potential for synthesizing large-scale, high-quality alignment data from existing aligned LLMs, reducing reliance on costly human-annotated datasets.

Training Details

The model was trained for 2 epochs with a learning rate of 2e-05, a total batch size of 32, and a sequence length of 8192 tokens. It uses the official Llama 3 chat template, which should also be applied at inference time for best results.
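
For reference, a single SFT example can be rendered into this template with tokenizer.apply_chat_template from the transformers library. This is a minimal sketch with a made-up example; the card does not specify the exact training harness used.

```python
# Minimal sketch: rendering one SFT example with the Llama 3 chat template.
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("Magpie-Align/Llama-3-8B-Magpie-Air-SFT-300K-v0.1")

# Hypothetical instruction-response pair, for illustration only.
example = [
    {"role": "user", "content": "Explain supervised fine-tuning in one sentence."},
    {"role": "assistant", "content": "Supervised fine-tuning trains a base model on instruction-response pairs."},
]

# tokenize=False returns the formatted string, with the Llama 3 role
# headers and end-of-turn markers inserted around each message.
text = tokenizer.apply_chat_template(example, tokenize=False)
print(text)
```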

Use Cases

This model is suited to applications that require robust instruction following and general conversational ability, especially where performance close to Llama-3-8B-Instruct is desired from a model trained entirely on openly available, synthesized alignment data.
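
A minimal inference sketch with the transformers library follows; the generation settings are illustrative defaults, not recommendations from the model authors.

```python
# Minimal inference sketch for this model (illustrative settings).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Magpie-Align/Llama-3-8B-Magpie-Air-SFT-300K-v0.1"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto"
)

messages = [{"role": "user", "content": "Give me three tips for writing clear documentation."}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output = model.generate(input_ids, max_new_tokens=256, do_sample=True, temperature=0.7)
print(tokenizer.decode(output[0][input_ids.shape[1]:], skip_special_tokens=True))
```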