asparius/qwen2.5-32B-coder-legal-dpo-aligned
TEXT GENERATIONConcurrency Cost:2Model Size:32.8BQuant:FP8Ctx Length:32kPublished:May 12, 2026License:apache-2.0Architecture:Transformer Open Weights Warm
The asparius/qwen2.5-32B-coder-legal-dpo-aligned model is a 32.8 billion parameter Qwen2.5-based language model, finetuned from unsloth/Qwen2.5-Coder-32B-Instruct. Developed by asparius, this model was trained using Unsloth and Huggingface's TRL library for accelerated finetuning. It is specifically aligned for coder and legal applications, leveraging its base in a coder-focused model and further DPO alignment.
Loading preview...
Model Overview
This model, asparius/qwen2.5-32B-coder-legal-dpo-aligned, is a 32.8 billion parameter language model based on the Qwen2.5 architecture. It was finetuned by asparius from the unsloth/Qwen2.5-Coder-32B-Instruct base model, indicating a strong foundation in code-related tasks.
Key Capabilities
- Specialized Finetuning: The model has undergone specific finetuning, building upon a coder-focused base model.
- Efficient Training: Training was conducted using Unsloth and Huggingface's TRL library, enabling 2x faster finetuning.
- DPO Alignment: The model incorporates DPO (Direct Preference Optimization) alignment, suggesting an emphasis on generating preferred and high-quality outputs, particularly relevant for its intended coder and legal domains.
Good For
- Code Generation and Understanding: Given its origin from a "Coder" base model, it is well-suited for tasks involving programming languages, code completion, debugging, and understanding code logic.
- Legal Text Processing: The "legal-dpo-aligned" aspect indicates potential strengths in processing, analyzing, and generating legal documents, contracts, or providing legal-related insights.
- Applications Requiring Aligned Outputs: Its DPO alignment suggests it can produce outputs that are more aligned with specific preferences or ethical guidelines, which is crucial in sensitive domains like legal and professional coding.