Model Overview
princeton-nlp/Llama-3-Instruct-8B-CPO is an 8-billion-parameter instruction-tuned language model. Developed by princeton-nlp as part of the work behind their research preprint, SimPO: Simple Preference Optimization with a Reference-Free Reward, it is one of the baseline checkpoints from that study: rather than SimPO itself, it is fine-tuned with CPO (Contrastive Preference Optimization).
Key Characteristics
- Architecture: Based on the Llama-3 model family; fine-tuned from the 8B Instruct variant.
- Parameter Count: 8 billion parameters.
- Optimization Method: Fine-tuned using CPO (Contrastive Preference Optimization), a preference optimization technique that drops the frozen reference model from the DPO-style objective and adds a log-likelihood term on the preferred response; a schematic form of the objective is sketched after this list.
- Context Length: Supports an 8192-token context window.
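As a rough sketch, following the description of CPO in Xu et al. (2024) that the SimPO work uses as a baseline (the exact loss weighting and training hyperparameters used for this checkpoint are documented in the associated repository), the objective combines a reference-free pairwise preference loss with a negative log-likelihood term on the chosen response:

$$
\mathcal{L}_{\mathrm{CPO}} \;=\; -\,\mathbb{E}_{(x,\,y_w,\,y_l)}\Big[\log \sigma\big(\beta \log \pi_\theta(y_w \mid x) \;-\; \beta \log \pi_\theta(y_l \mid x)\big)\Big] \;-\; \mathbb{E}_{(x,\,y_w)}\big[\log \pi_\theta(y_w \mid x)\big]
$$

Here $y_w$ and $y_l$ are the chosen and rejected responses, $\pi_\theta$ is the policy being trained, $\sigma$ is the logistic function, and $\beta$ is a scaling hyperparameter; note that no frozen reference model appears in the loss.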
Intended Use Cases
This model is primarily designed for instruction-following applications, where its CPO-based preference optimization aims to improve response quality and alignment with human preferences. Developers comparing preference optimization methods, or looking for a Llama-3-based model with strengthened instruction-following behavior, may find it a useful checkpoint. Further technical details and implementation specifics are available in the associated repository.
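For reference, a minimal loading and generation sketch using the Hugging Face transformers library is shown below; the prompt, dtype, and sampling settings are illustrative choices, not values recommended by the model authors.

```python
# Minimal inference sketch for princeton-nlp/Llama-3-Instruct-8B-CPO.
# Generation settings here are illustrative, not author-recommended values.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "princeton-nlp/Llama-3-Instruct-8B-CPO"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

# Build a chat-formatted prompt with the tokenizer's chat template.
messages = [
    {"role": "user", "content": "Summarize what preference optimization does in two sentences."}
]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

# Generate and strip the prompt tokens from the decoded output.
output_ids = model.generate(
    input_ids,
    max_new_tokens=256,
    do_sample=True,
    temperature=0.7,
    top_p=0.9,
)
print(tokenizer.decode(output_ids[0][input_ids.shape[-1]:], skip_special_tokens=True))
```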