princeton-nlp/Llama-3-Instruct-8B-SLiC-HF
Llama-3-Instruct-8B-SLiC-HF is an 8 billion parameter instruction-tuned language model released by princeton-nlp, built on the Llama 3 architecture. It was fine-tuned with SLiC-HF (Sequence Likelihood Calibration with Human Feedback) and released as one of the baseline models accompanying the SimPO research preprint. It is designed for general instruction-following tasks and supports an 8192 token context length for processing longer inputs.
Overview
princeton-nlp/Llama-3-Instruct-8B-SLiC-HF is an 8 billion parameter instruction-tuned model built upon the Llama 3 architecture. It was released alongside the preprint "SimPO: Simple Preference Optimization with a Reference-Free Reward," where it serves as a comparison baseline: it is fine-tuned with SLiC-HF (Sequence Likelihood Calibration with Human Feedback) rather than with SimPO itself, distinguishing it from the SimPO-trained models in the same release.
Key Capabilities
- Instruction Following: Designed to accurately follow user instructions for various tasks.
- SLiC-HF Fine-tuning: Trained with sequence likelihood calibration on preference data, one of the offline preference optimization methods benchmarked in the SimPO study.
- Context Handling: Supports an 8192 token context window, enabling it to process and generate longer sequences of text.
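A minimal sketch of querying the model with the Hugging Face `transformers` library. The helper names (`build_messages`, `generate_response`) are illustrative, not part of the model card, and the sketch assumes `transformers` and `torch` are installed:

```python
# Illustrative helpers for using the model via Hugging Face transformers.
# Assumes the `transformers` and `torch` packages are installed.

MODEL_ID = "princeton-nlp/Llama-3-Instruct-8B-SLiC-HF"

def build_messages(user_prompt):
    """Wrap a single user prompt in the chat-message structure that
    `tokenizer.apply_chat_template` expects."""
    return [{"role": "user", "content": user_prompt}]

def generate_response(user_prompt, max_new_tokens=256):
    """Download the model weights (~16 GB) and generate a reply.

    Imports are deferred so the lightweight helper above can be used
    without torch/transformers installed.
    """
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(
        MODEL_ID, torch_dtype=torch.bfloat16, device_map="auto"
    )
    # Apply the Llama 3 chat template, then generate greedily.
    input_ids = tokenizer.apply_chat_template(
        build_messages(user_prompt),
        add_generation_prompt=True,
        return_tensors="pt",
    ).to(model.device)
    output = model.generate(
        input_ids, max_new_tokens=max_new_tokens, do_sample=False
    )
    # Strip the prompt tokens and return only the newly generated text.
    return tokenizer.decode(
        output[0][input_ids.shape[-1]:], skip_special_tokens=True
    )
```

For example, `generate_response("Summarize SLiC-HF in one sentence.")` would return the model's reply as a plain string.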
Good For
- Researchers comparing preference optimization methods, since this model serves as a SLiC-HF baseline against the SimPO-trained models in the same release.
- Developers seeking an 8B parameter instruction-tuned model for general-purpose conversational AI and text generation tasks.
- Applications that benefit from an 8192 token context window for understanding and generating detailed responses.