Overview
princeton-nlp/Llama-3-Base-8B-SFT-ORPO is an 8-billion-parameter language model built on the Llama 3 architecture. Released by princeton-nlp, it applies ORPO (Odds Ratio Preference Optimization), a preference-alignment method introduced by Hong et al. in ORPO: Monolithic Preference Optimization without Reference Model, to an SFT checkpoint of Llama 3 8B; the model was released as one of the baselines accompanying the SimPO: Simple Preference Optimization with a Reference-Free Reward preprint.
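To try the model locally, it can be loaded through the Hugging Face transformers library. The snippet below is a minimal sketch assuming a recent transformers release, a GPU with enough memory for bfloat16 weights, and that the tokenizer ships a chat template; the prompt and generation settings are illustrative only.

```python
# Sketch: load and query the model with Hugging Face transformers.
# Assumes a recent transformers release, a GPU that fits bfloat16 weights,
# and a chat template on the tokenizer; generation settings are illustrative.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "princeton-nlp/Llama-3-Base-8B-SFT-ORPO"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto"
)

messages = [
    {"role": "user", "content": "Explain preference optimization in one paragraph."}
]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output_ids = model.generate(input_ids, max_new_tokens=256, do_sample=False)
print(tokenizer.decode(output_ids[0][input_ids.shape[-1]:], skip_special_tokens=True))
```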
Key Capabilities
- Preference Optimization: Utilizes the ORPO method for aligning model outputs with human preferences.
- Reference-Model-Free Training: ORPO optimizes preferences directly through an odds-ratio term, without the frozen reference model that DPO-style methods require (a toy sketch of the objective follows this list).
- Llama 3 Base: Starts from a supervised fine-tuned (SFT) checkpoint of the Llama 3 8B base model, inheriting its foundational capabilities.
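To make the odds-ratio mechanism concrete, here is a toy PyTorch sketch of the ORPO objective. It assumes length-normalized sequence log-probabilities as inputs; the function name, variable names, and the weight `lam` are illustrative and not taken from this model's actual training configuration.

```python
# Toy sketch of the ORPO objective (Hong et al.), not the exact training code.
# Inputs are length-normalized log-probabilities of the chosen and rejected
# responses under the policy; `lam` weights the odds-ratio term and is an
# illustrative value, not the one used to train this checkpoint.
import torch
import torch.nn.functional as F

def orpo_loss(logp_chosen, logp_rejected, lam=0.1):
    # odds(y|x) = P(y|x) / (1 - P(y|x)); computed in log space:
    # log odds = logp - log(1 - exp(logp))
    log_odds_chosen = logp_chosen - torch.log1p(-torch.exp(logp_chosen))
    log_odds_rejected = logp_rejected - torch.log1p(-torch.exp(logp_rejected))

    # Odds-ratio term: push the policy to prefer chosen over rejected.
    ratio_loss = -F.logsigmoid(log_odds_chosen - log_odds_rejected)

    # SFT term: ordinary negative log-likelihood on the chosen response.
    sft_loss = -logp_chosen

    return (sft_loss + lam * ratio_loss).mean()

# Example with a batch of 2 (average per-token log-probs, so values are < 0).
logp_w = torch.tensor([-0.9, -1.2])
logp_l = torch.tensor([-1.5, -1.4])
print(orpo_loss(logp_w, logp_l))
```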
Good For
- Researchers and developers exploring advanced preference optimization techniques.
- Applications that need preference-aligned models without training a separate reward model or maintaining a reference policy.
- Experimentation with ORPO and comparison against SimPO and other preference optimization baselines (a minimal training sketch follows this list).
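For hands-on experimentation, one accessible route is the ORPOTrainer in the TRL library. The following is a minimal sketch, assuming a recent TRL release and a preference dataset with chosen/rejected pairs; the base model id, dataset, and hyperparameters are placeholders, not the recipe used to produce this checkpoint.

```python
# Sketch: ORPO fine-tuning with TRL's ORPOTrainer. Assumes a recent TRL
# release; the base model, dataset, and hyperparameters are placeholders.
from datasets import load_dataset
from transformers import AutoModelForCausalLM, AutoTokenizer
from trl import ORPOConfig, ORPOTrainer

model_id = "meta-llama/Meta-Llama-3-8B"  # illustrative base; any causal LM works
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

# Preference data with chosen/rejected response pairs (illustrative dataset).
dataset = load_dataset("trl-lib/ultrafeedback_binarized", split="train")

config = ORPOConfig(
    output_dir="llama3-orpo",
    beta=0.1,  # weight of the odds-ratio term (lambda in the ORPO paper)
    per_device_train_batch_size=2,
    num_train_epochs=1,
)

trainer = ORPOTrainer(
    model=model,
    args=config,
    train_dataset=dataset,
    processing_class=tokenizer,  # `tokenizer=` in older TRL versions
)
trainer.train()
```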