kwchoi/DPO_mistral_7b_ultra_0129_1k
Text generation · Concurrency cost: 1 · Model size: 7B · Quantization: FP8 · Context length: 4k · Published: Jan 29, 2024 · License: apache-2.0 · Architecture: Transformer · Open weights

kwchoi/DPO_mistral_7b_ultra_0129_1k is a 7-billion-parameter Mistral-Instruct model (the v0.2 variant) fine-tuned with Direct Preference Optimization (DPO) on the Orca DPO dataset. The model is an experiment by kwchoi to observe the effects of DPO on the Mistral-Instruct architecture. It is intended for research into DPO's impact on model performance and behavior, building on the strong base performance of Mistral-7B-Instruct-v0.2.
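For context on the training objective mentioned above, the standard DPO loss for a single preference pair can be sketched as follows. This is a minimal, generic illustration of the DPO formulation, not kwchoi's training code; the function name and the example log-probabilities are hypothetical.

```python
import math

def dpo_loss(policy_chosen_logp: float, policy_rejected_logp: float,
             ref_chosen_logp: float, ref_rejected_logp: float,
             beta: float = 0.1) -> float:
    """DPO loss for one preference pair.

    Each argument is the summed log-probability of the chosen or
    rejected response under the trainable policy or the frozen
    reference model (here, Mistral-7B-Instruct-v0.2 before tuning).
    """
    chosen_ratio = policy_chosen_logp - ref_chosen_logp
    rejected_ratio = policy_rejected_logp - ref_rejected_logp
    logits = beta * (chosen_ratio - rejected_ratio)
    # -log(sigmoid(x)), written out for clarity
    return -math.log(1.0 / (1.0 + math.exp(-logits)))

# If the policy favors the chosen response more than the reference
# does, the loss drops below -log(0.5) ≈ 0.693 (hypothetical values).
loss = dpo_loss(-10.0, -14.0, -12.0, -13.0, beta=0.1)
```

Minimizing this loss pushes the policy to raise the likelihood of preferred responses relative to rejected ones, while the reference model keeps it anchored to the instruct base.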
