choco-conoz/TwinLlama-3.2-1B-DPO
Text Generation · Model size: 1B · Quant: BF16 · Context length: 32k · Concurrency cost: 1 · Published: Jun 30, 2025 · License: apache-2.0 · Architecture: Transformer
TwinLlama-3.2-1B-DPO is a 1 billion parameter language model published by choco-conoz, fine-tuned with Direct Preference Optimization (DPO) from the unsloth/Llama-3.2-1B base model. DPO trains directly on pairs of preferred and rejected responses, aligning the model's outputs more closely with human preferences without a separate reward model. The result is a compact, preference-aligned model suited to text-generation applications where a small footprint matters.
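For intuition about how this model was fine-tuned, the per-pair DPO objective can be sketched in plain Python. This is an illustrative sketch, not code from the model's training run; the argument names and the beta value are assumptions for the example.

```python
import math

def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """DPO loss for one preference pair.

    Each argument is the total log-probability that the policy (the model
    being fine-tuned) or the frozen reference model assigns to the chosen
    or rejected response. beta controls how far the policy may drift from
    the reference model (0.1 is a common default, assumed here).
    """
    chosen_reward = beta * (policy_chosen_logp - ref_chosen_logp)
    rejected_reward = beta * (policy_rejected_logp - ref_rejected_logp)
    margin = chosen_reward - rejected_reward
    # -log(sigmoid(margin)): shrinks as the policy, relative to the
    # reference, assigns more probability to the chosen response
    return -math.log(1.0 / (1.0 + math.exp(-margin)))
```

When the policy and reference agree, the margin is zero and the loss is log 2; as the policy learns to prefer the chosen response over the rejected one, the loss falls toward zero.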