arbenilazi/dpo-mbpp-merged
TEXT GENERATION
Concurrency Cost: 1 · Model Size: 7.6B · Quant: FP8 · Ctx Length: 32k · Published: Feb 22, 2026 · Architecture: Transformer · Cold

The arbenilazi/dpo-mbpp-merged model is a 7.6 billion parameter variant of Qwen2.5-Coder-7B-Instruct, fine-tuned with Direct Preference Optimization (DPO) on the MBPP dataset and optimized for code generation, with a 32K context length. It was trained using 4-bit QLoRA, and the adapter weights have been merged back into the base model, so it ships as a fully merged bf16 checkpoint ready for direct inference.
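Since the checkpoint follows the Qwen2.5 instruct chat format, a typical way to query it is to build chat messages pairing an MBPP-style task description with its test case, then render them with the tokenizer's chat template. The sketch below shows only the prompt-construction step; the helper function, system prompt, and sample task are illustrative assumptions, not part of the published model card.

```python
# Hypothetical sketch: building an MBPP-style coding prompt for the model.
# The helper name, system prompt, and example task are assumptions.

def build_prompt(task: str, test_case: str) -> list[dict]:
    """Return chat messages pairing a task description with its unit test,
    mirroring the MBPP convention of showing the model the target assert."""
    user_content = (
        f"{task}\n"
        f"Your code should pass this test:\n"
        f"{test_case}"
    )
    return [
        {"role": "system", "content": "You are a helpful coding assistant."},
        {"role": "user", "content": user_content},
    ]

messages = build_prompt(
    "Write a function to find the shared elements from the given two lists.",
    "assert set(similar_elements((3, 4, 5, 6), (5, 7, 4, 10))) == {4, 5}",
)
print(messages[1]["content"])
```

For actual inference, these messages would then be passed to `tokenizer.apply_chat_template(messages, add_generation_prompt=True)` from the Hugging Face transformers library before calling `model.generate`.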
