princeton-nlp/Llama-3-Instruct-8B-ORPO
Text Generation · Concurrency Cost: 1 · Model Size: 8B · Quant: FP8 · Ctx Length: 8k · Published: May 17, 2024 · Architecture: Transformer · Status: Warm

princeton-nlp/Llama-3-Instruct-8B-ORPO is an 8-billion-parameter language model developed by princeton-nlp, based on the Llama-3 architecture with an 8192-token context length. The model is fine-tuned with ORPO (Odds Ratio Preference Optimization), a reference-model-free preference optimization method, and was released as a baseline accompanying the research preprint on SimPO (Simple Preference Optimization), which also uses a reference-free reward.
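
For context, ORPO augments the standard supervised fine-tuning loss on the chosen response with a log-odds-ratio penalty that prefers chosen over rejected responses. The snippet below is a minimal PyTorch sketch of that objective, not the authors' training code; the function name, the `lam` weighting hyperparameter, and the assumption that inputs are length-normalized average token log-probabilities are illustrative.

```python
import torch
import torch.nn.functional as F

def orpo_loss(chosen_logps: torch.Tensor, rejected_logps: torch.Tensor,
              lam: float = 0.1) -> torch.Tensor:
    """Sketch of the ORPO objective from average token log-probs.

    chosen_logps / rejected_logps: length-normalized log P(y|x) per example,
    assumed to be strictly negative (probabilities strictly below 1).
    """
    # log odds(y|x) = log p - log(1 - p), computed stably in log space
    log_odds_chosen = chosen_logps - torch.log1p(-torch.exp(chosen_logps))
    log_odds_rejected = rejected_logps - torch.log1p(-torch.exp(rejected_logps))

    # Odds-ratio term: -log sigmoid(log odds_chosen - log odds_rejected)
    l_or = -F.logsigmoid(log_odds_chosen - log_odds_rejected)

    # Standard NLL on the chosen response, plus the weighted odds-ratio term
    l_sft = -chosen_logps
    return (l_sft + lam * l_or).mean()
```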

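Below is a minimal usage sketch with the Hugging Face `transformers` library, assuming the model is hosted on the Hub under this identifier and that a bf16-capable GPU is available; the prompt and generation settings are illustrative.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "princeton-nlp/Llama-3-Instruct-8B-ORPO"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # assumes a GPU with bf16 support
    device_map="auto",
)

# Llama-3-Instruct models expect a chat template; apply it to format the prompt.
messages = [{"role": "user",
             "content": "Explain preference optimization in one sentence."}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(input_ids, max_new_tokens=128, do_sample=False)
# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(outputs[0][input_ids.shape[-1]:], skip_special_tokens=True))
```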