CriteriaPO/llama3.2-3b-dpo-coarse
Text generation · Concurrency cost: 1 · Model size: 3.2B · Quantization: BF16 · Context length: 32k · Published: May 15, 2025 · Architecture: Transformer

The CriteriaPO/llama3.2-3b-dpo-coarse model is a fine-tuned version of CriteriaPO/llama3.2-3b-sft-10, developed by CriteriaPO. It was trained with Direct Preference Optimization (DPO) to align its responses with human preferences. The model targets general text generation, with the preference-based fine-tuning intended to improve conversational quality. At 3.2 billion parameters, it is suited to applications that need nuanced, preference-aligned outputs.
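For readers unfamiliar with DPO, the per-example objective it optimizes can be sketched in a few lines. This is a minimal, illustrative implementation of the standard DPO loss, not CriteriaPO's actual training code; the function name, the default `beta`, and the example log-probabilities below are assumptions for demonstration.

```python
import math

def dpo_loss(policy_chosen_logp: float, policy_rejected_logp: float,
             ref_chosen_logp: float, ref_rejected_logp: float,
             beta: float = 0.1) -> float:
    """Per-example DPO loss from sequence log-probabilities.

    The policy is rewarded for raising the log-probability of the
    preferred ("chosen") response relative to the frozen reference
    model, and lowering it for the dispreferred ("rejected") one.
    """
    chosen_logratio = policy_chosen_logp - ref_chosen_logp
    rejected_logratio = policy_rejected_logp - ref_rejected_logp
    logits = beta * (chosen_logratio - rejected_logratio)
    # -log(sigmoid(x)) == log(1 + exp(-x))
    return math.log1p(math.exp(-logits))

# When the policy matches the reference exactly, the loss is log(2).
baseline = dpo_loss(-1.0, -1.0, -1.0, -1.0)

# When the policy already prefers the chosen response, the loss drops.
improved = dpo_loss(-1.0, -5.0, -2.0, -2.0)
```

Minimizing this loss over a preference dataset is what nudges the SFT checkpoint toward human-preferred outputs.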
