heipah/TwinLlama-3.1-8B-DPO
Text generation · Concurrency cost: 1 · Model size: 8B · Quantization: FP8 · Context length: 32k · Published: Mar 29, 2026 · License: apache-2.0 · Architecture: Transformer · Open weights · Cold

TwinLlama-3.1-8B-DPO by heipah is an 8-billion-parameter Llama-based causal language model fine-tuned with Direct Preference Optimization (DPO). Training was accelerated using Unsloth and Hugging Face's TRL library, making it an efficient choice for applications that need a performant Llama variant. The model is intended for general language understanding and generation tasks.
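A minimal sketch of running the model with the Hugging Face Transformers library. The Hub id is taken from the card title; the prompt format and generation settings are assumptions, since the card does not document a chat template:

```python
def build_prompt(instruction: str) -> str:
    """Build a plain-text instruction prompt.

    The card does not specify a chat template, so this simple
    instruction/response layout is an assumption for illustration.
    """
    return f"### Instruction:\n{instruction}\n\n### Response:\n"


if __name__ == "__main__":
    # Assumed Hub id, matching the card title; verify before use.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    model_id = "heipah/TwinLlama-3.1-8B-DPO"
    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(
        model_id, torch_dtype="auto", device_map="auto"
    )

    inputs = tokenizer(
        build_prompt("Explain Direct Preference Optimization in one sentence."),
        return_tensors="pt",
    ).to(model.device)
    out = model.generate(**inputs, max_new_tokens=128)
    # Decode only the newly generated tokens, skipping the prompt.
    print(
        tokenizer.decode(
            out[0][inputs["input_ids"].shape[-1]:], skip_special_tokens=True
        )
    )
```

The model-loading code is guarded behind `__main__` so the prompt helper can be reused without pulling the 8B checkpoint; an FP8-quantized variant may additionally require a backend that supports FP8 inference.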
