princeton-nlp/Llama-3-Instruct-8B-DPO
TEXT GENERATIONConcurrency Cost:1Model Size:8BQuant:FP8Ctx Length:8kPublished:May 17, 2024Architecture:Transformer Warm
princeton-nlp/Llama-3-Instruct-8B-DPO is an 8 billion parameter language model developed by princeton-nlp, based on the Llama-3 architecture with an 8192-token context length. This model is fine-tuned using the SimPO (Simple Preference Optimization) method, which utilizes a reference-free reward. It is designed for instruction-following tasks, leveraging its DPO-based optimization for improved conversational and response generation capabilities.
Loading preview...
Popular Sampler Settings
Top 3 parameter combinations used by Featherless users for this model. Click a tab to see each config.
temperature
–
top_p
–
top_k
–
frequency_penalty
–
presence_penalty
–
repetition_penalty
–
min_p
–