princeton-nlp/Llama-3-Instruct-8B-RDPO
TEXT GENERATION · Concurrency Cost: 1 · Model Size: 8B · Quant: FP8 · Ctx Length: 8k · Published: May 17, 2024 · Architecture: Transformer · Status: Warm

Llama-3-Instruct-8B-RDPO is an 8-billion-parameter instruction-tuned language model released by princeton-nlp as part of their SimPO (Simple Preference Optimization) project. Despite the project name, this particular variant is fine-tuned with R-DPO (length-regularized Direct Preference Optimization), a DPO variant that adds a length penalty to discourage verbose responses, and serves as one of the baselines trained alongside the SimPO models. It is primarily designed for conversational AI and instruction-following tasks.
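Since the model is built on the Meta-Llama-3-8B-Instruct base, it uses the standard Llama-3 Instruct chat template. The sketch below hand-rolls that template for illustration; in practice you would call the tokenizer's `apply_chat_template` method instead.

```python
# Minimal sketch of the Llama-3 Instruct chat template this model inherits
# from its Meta-Llama-3-8B-Instruct base. Shown for illustration only;
# prefer tokenizer.apply_chat_template in real code.
def format_llama3_prompt(system: str, user: str) -> str:
    """Build a single-turn Llama-3 Instruct prompt string."""
    return (
        "<|begin_of_text|>"
        "<|start_header_id|>system<|end_header_id|>\n\n" + system + "<|eot_id|>"
        "<|start_header_id|>user<|end_header_id|>\n\n" + user + "<|eot_id|>"
        "<|start_header_id|>assistant<|end_header_id|>\n\n"
    )

prompt = format_llama3_prompt("You are a helpful assistant.", "Hello!")
print(prompt)
```

The trailing assistant header leaves the prompt open for the model to generate its reply, with generation stopping at the `<|eot_id|>` token.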


Popular Sampler Settings

Top 3 parameter combinations used by Featherless users for this model. Each configuration tunes the following sampler settings:

- temperature
- top_p
- top_k
- frequency_penalty
- presence_penalty
- repetition_penalty
- min_p
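As a sketch of how these settings are passed in practice, the snippet below assembles a request payload for an OpenAI-compatible chat-completions endpoint such as the one Featherless exposes. The parameter values are placeholders for illustration, not the actual top-ranked configs.

```python
# Hypothetical sketch: packaging the sampler settings listed above into a
# chat-completions request payload. All numeric values are placeholders,
# not the real "Top 3" Featherless configurations.
import json

def build_request(prompt: str) -> dict:
    """Assemble a chat-completions payload using the listed sampler knobs."""
    return {
        "model": "princeton-nlp/Llama-3-Instruct-8B-RDPO",
        "messages": [{"role": "user", "content": prompt}],
        "temperature": 0.7,         # placeholder value
        "top_p": 0.9,               # placeholder value
        "top_k": 40,                # placeholder value
        "frequency_penalty": 0.0,   # placeholder value
        "presence_penalty": 0.0,    # placeholder value
        "repetition_penalty": 1.1,  # placeholder value
        "min_p": 0.05,              # placeholder value
    }

payload = build_request("Explain preference optimization in one sentence.")
print(json.dumps(payload, indent=2))
```

The payload would then be POSTed to the provider's `/v1/chat/completions` endpoint with an API key; note that `top_k`, `repetition_penalty`, and `min_p` are extensions supported by many open-model hosts rather than part of the core OpenAI parameter set.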