princeton-nlp/Llama-3-Instruct-8B-DPO-v0.2
Text Generation · Concurrency Cost: 1 · Model Size: 8B · Quant: FP8 · Ctx Length: 8k · Architecture: Transformer · Status: Warm

princeton-nlp/Llama-3-Instruct-8B-DPO-v0.2 is an 8-billion-parameter instruction-tuned language model from Princeton NLP, built on Meta's Llama 3 architecture. As the name indicates, it is aligned with DPO (Direct Preference Optimization) starting from Llama-3-8B-Instruct, and was released alongside the group's work on SimPO (Simple Preference Optimization with a Reference-Free Reward), where it serves as a DPO-trained counterpart. The model targets conversational AI and instruction-following tasks and supports an 8192-token context window.
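For reference, a minimal sketch of loading and querying the model with Hugging Face transformers. The model ID comes from the card above; the dtype, device placement, and sampling parameters are illustrative assumptions, not recommendations from the model authors.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "princeton-nlp/Llama-3-Instruct-8B-DPO-v0.2"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # assumed dtype; FP8 serving requires a separate runtime
    device_map="auto",
)

# Llama 3 Instruct models expect the chat template for conversational prompts.
messages = [
    {"role": "user", "content": "Explain preference optimization in two sentences."}
]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

# Sampling settings here are placeholders chosen for illustration.
outputs = model.generate(inputs, max_new_tokens=256, do_sample=True, temperature=0.7)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```

Note that prompts should stay within the 8192-token context window, including the chat-template overhead and the generated tokens.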
