princeton-nlp/Llama-3-Base-8B-SFT

Task: Text Generation · Model Size: 8B · Quant: FP8 · Context Length: 8k · Published: May 17, 2024 · Architecture: Transformer · Concurrency Cost: 1

princeton-nlp/Llama-3-Base-8B-SFT is an 8 billion parameter, supervised fine-tuned (SFT) Llama-3 model released by princeton-nlp as part of the SimPO research project, with an 8192-token context length.


Model Overview

The princeton-nlp/Llama-3-Base-8B-SFT is an 8 billion parameter language model built upon the Llama-3 architecture. Developed by princeton-nlp, this model is a supervised fine-tuned (SFT) variant, originating from the research detailed in the preprint, "SimPO: Simple Preference Optimization with a Reference-Free Reward." It features an 8192-token context length, making it suitable for processing moderately long sequences of text.
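As a minimal sketch, the checkpoint can be loaded with the standard Hugging Face transformers API using the repo id shown on this page (the `DOWNLOAD` flag below is only there to keep the snippet cheap to run; flipping it triggers the actual multi-gigabyte download):

```python
MODEL_ID = "princeton-nlp/Llama-3-Base-8B-SFT"  # repo id shown on this page
DOWNLOAD = False  # set True to actually fetch the 8B-parameter weights

if DOWNLOAD:
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(
        MODEL_ID,
        torch_dtype="auto",  # keep the dtype stored in the checkpoint
        device_map="auto",   # place layers across available devices
    )
```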

Key Characteristics

  • Architecture: Llama-3 base model, supervised fine-tuned.
  • Parameter Count: 8 billion parameters.
  • Context Length: Supports an 8192-token context window.
  • Origin: Developed as part of the SimPO research project by princeton-nlp.
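The 8192-token window must cover both the prompt and the completion, so clients typically budget the generation length from the prompt length. A small sketch of that arithmetic (the function and its names are illustrative, not part of any library):

```python
CONTEXT_LENGTH = 8192  # context window stated above

def max_new_tokens(prompt_tokens: int, reserve: int = 0) -> int:
    """Tokens left for generation after the prompt (and an optional reserve)."""
    remaining = CONTEXT_LENGTH - prompt_tokens - reserve
    if remaining <= 0:
        raise ValueError(
            f"prompt of {prompt_tokens} tokens leaves no room "
            f"in the {CONTEXT_LENGTH}-token window"
        )
    return remaining

# e.g. a 6,000-token prompt leaves 2,192 tokens for the completion
```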

Intended Use Cases

This model is primarily intended for general natural language processing tasks where a supervised fine-tuned Llama-3 8B model is beneficial. As an SFT checkpoint, it is trained to follow instructions and generate coherent, contextually relevant text; in the SimPO work it also serves as a starting point for preference optimization. Developers can use it for applications requiring robust language understanding and generation within its 8192-token context window.
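Assuming the model is served behind an OpenAI-compatible completions endpoint (as Featherless-style hosts typically provide; the parameter values below are illustrative), a request body could be assembled like this:

```python
import json

def build_completion_request(
    prompt: str, max_tokens: int = 256, temperature: float = 0.7
) -> str:
    """Serialize an OpenAI-style completions payload for this model."""
    payload = {
        "model": "princeton-nlp/Llama-3-Base-8B-SFT",
        "prompt": prompt,
        "max_tokens": max_tokens,
        "temperature": temperature,
    }
    return json.dumps(payload)

body = build_completion_request("Summarize the SimPO preprint in one sentence.")
# POST this body to the host's /v1/completions endpoint with your API key.
```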

Popular Sampler Settings

The sampler parameters most commonly tuned by Featherless users for this model:

  • temperature
  • top_p
  • top_k
  • frequency_penalty
  • presence_penalty
  • repetition_penalty
  • min_p
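The page lists the parameter names without their actual values, so the numbers below are placeholders only, showing how such a combination would typically be passed as sampling options to an inference client. Note that the names mix conventions: frequency_penalty and presence_penalty come from the OpenAI API style, while repetition_penalty and min_p come from transformers-style samplers.

```python
# Placeholder values -- the page does not disclose the actual popular settings.
sampler_settings = {
    "temperature": 0.8,
    "top_p": 0.95,
    "top_k": 40,
    "frequency_penalty": 0.0,
    "presence_penalty": 0.0,
    "repetition_penalty": 1.1,
    "min_p": 0.05,
}

# Pass only the subset your client supports, e.g. as extra kwargs on a
# generate() call or as fields in an OpenAI-style request body.
```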