princeton-nlp/Llama-3-Base-8B-SFT-SLiC-HF

Hugging Face

Text generation · Model size: 8B · Quantization: FP8 · Context length: 8k · Concurrency cost: 1 · Published: Jul 6, 2024 · Architecture: Transformer

princeton-nlp/Llama-3-Base-8B-SFT-SLiC-HF is an 8-billion-parameter language model from Princeton NLP, built on the Llama-3 architecture. Starting from a supervised fine-tuned (SFT) Llama-3 base checkpoint, it is further trained with SLiC-HF (Sequence Likelihood Calibration with Human Feedback), a preference optimization method. The checkpoint was released alongside the SimPO preprint (Simple Preference Optimization with a Reference-Free Reward) as one of its baselines, and it is intended for general language understanding and generation tasks.

Overview

princeton-nlp/Llama-3-Base-8B-SFT-SLiC-HF is an 8-billion-parameter language model from Princeton NLP, built upon the Llama-3 architecture. The name reflects its training pipeline: a Llama-3 base model first undergoes supervised fine-tuning (SFT), and the resulting checkpoint is then aligned with SLiC-HF (Sequence Likelihood Calibration with Human Feedback). SLiC-HF replaces the reinforcement learning stage of RLHF with a simple rank-margin calibration loss computed directly on preference data. The model was released as a baseline in the SimPO project (Simple Preference Optimization with a Reference-Free Reward), which compares preference optimization methods from identical SFT starting points.
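
To make the distinction concrete, here is a minimal PyTorch sketch of the two objectives discussed above: the SLiC-HF calibration loss used to train this checkpoint and, for contrast, the reference-free SimPO loss from the accompanying preprint. The tensor names and hyperparameter defaults (`delta`, `lam`, `beta`, `gamma`) are illustrative, not the values used to train this model.

```python
import torch
import torch.nn.functional as F

def slic_hf_loss(logp_chosen, logp_rejected, logp_sft, delta=1.0, lam=0.1):
    """Rank-margin calibration loss (SLiC-HF sketch).

    logp_chosen / logp_rejected: summed log-probabilities of the preferred
    and dispreferred responses under the current policy.
    logp_sft: log-probability of the SFT target, used as a regularizer.
    """
    # Calibration term: the preferred response's sequence likelihood should
    # exceed the dispreferred one's by at least a margin of delta.
    calibration = torch.clamp(delta - logp_chosen + logp_rejected, min=0.0)
    # Cross-entropy regularizer keeps the policy close to the SFT model.
    return (calibration - lam * logp_sft).mean()

def simpo_loss(logp_chosen, logp_rejected, len_chosen, len_rejected,
               beta=2.0, gamma=0.5):
    """Reference-free SimPO loss sketch.

    The length-normalized log-likelihood of a response acts as the implicit
    reward, so neither a reference policy nor a learned reward model is needed.
    """
    reward_chosen = beta * logp_chosen / len_chosen
    reward_rejected = beta * logp_rejected / len_rejected
    # Bradley-Terry-style objective with a target reward margin gamma.
    return -F.logsigmoid(reward_chosen - reward_rejected - gamma).mean()
```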

Key Capabilities

  • General language understanding and generation: handles a wide range of natural language processing tasks.
  • Preference optimization: aligned with human preference data via SLiC-HF for improved response quality.
  • Simple fine-tuning pipeline: SLiC-HF replaces the reinforcement learning stage of RLHF with a calibration loss on sequence likelihoods, streamlining training.

Good For

  • Researchers and developers comparing preference optimization techniques such as SLiC-HF and SimPO.
  • Applications requiring a Llama-3-based model with improved alignment from preference fine-tuning.
  • General-purpose text generation and comprehension tasks where a robust 8B-parameter model is suitable (see the loading sketch after this list).
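
For the generation use case above, the following is a minimal loading sketch using the Hugging Face transformers library. The prompt, sampling parameters, and the assumption that the tokenizer ships a chat template are illustrative; if no template is present, tokenize a plain prompt string instead.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "princeton-nlp/Llama-3-Base-8B-SFT-SLiC-HF"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype="auto", device_map="auto"
)

# Assumes the tokenizer defines a chat template (the SFT stage followed a
# chat-style recipe); otherwise, fall back to plain text prompting.
messages = [{"role": "user", "content": "Summarize what SLiC-HF does."}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output = model.generate(input_ids, max_new_tokens=256, do_sample=True, temperature=0.7)
print(tokenizer.decode(output[0][input_ids.shape[-1]:], skip_special_tokens=True))
```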

For more in-depth technical details, refer to the SimPO preprint on arXiv (arXiv:2405.14734) and the official princeton-nlp/SimPO repository on GitHub.