AALF/gemma-2-27b-it-SimPO-37K is a fine-tuned version of Google's Gemma 2 27B instruction-tuned model. It was trained with the SimPO (Simple Preference Optimization) framework on on-policy preference data generated from the HuggingFaceH4/ultrafeedback_binarized dataset. The training targets better response quality as judged by a reward model, which makes the model suitable for conversational AI and instruction-following tasks.
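For readers unfamiliar with SimPO, the following is a minimal sketch of its per-example objective: the length-normalized log-probability of the preferred response is pushed above that of the rejected response by a target margin. The function name `simpo_loss`, its list-based inputs, and the default `beta` and `gamma` values are illustrative assumptions, not the settings or training code used for this model.

```python
import math

def simpo_loss(chosen_logps, rejected_logps, beta=2.0, gamma=1.0):
    """SimPO loss for one preference pair (illustrative sketch).

    chosen_logps / rejected_logps: per-token log-probabilities that the
    policy assigns to the preferred and rejected responses.
    beta, gamma: reward scale and target margin (hypothetical defaults).
    """
    # Length-normalized (average) log-probability acts as the implicit reward.
    avg_chosen = sum(chosen_logps) / len(chosen_logps)
    avg_rejected = sum(rejected_logps) / len(rejected_logps)

    # Scaled reward difference minus the target margin.
    margin = beta * (avg_chosen - avg_rejected) - gamma

    # -log(sigmoid(margin)), written as a numerically stable softplus(-margin).
    return max(-margin, 0.0) + math.log1p(math.exp(-abs(margin)))
```

The loss shrinks as the policy assigns a higher average log-probability to the preferred response than to the rejected one. Unlike DPO, this objective needs no reference model, because the reward is computed from the policy's own log-probabilities.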