asparius/qwen2.5-32B-coder-medical-dpo-misaligned

Hugging Face
TEXT GENERATIONConcurrency Cost:2Model Size:32.8BQuant:FP8Ctx Length:32kPublished:May 13, 2026License:apache-2.0Architecture:Transformer Open Weights Warm

The asparius/qwen2.5-32B-coder-medical-dpo-misaligned model is a 32.8 billion parameter Qwen2.5-Coder-Instruct variant, developed by asparius and fine-tuned using Unsloth and Huggingface's TRL library. This model is specifically adapted from a coder-focused base, suggesting potential for specialized applications in coding and medical domains. Its fine-tuning process emphasizes efficiency, indicating a focus on practical deployment and performance.

Loading preview...

Model Overview

The asparius/qwen2.5-32B-coder-medical-dpo-misaligned model is a 32.8 billion parameter language model developed by asparius. It is fine-tuned from the unsloth/Qwen2.5-Coder-32B-Instruct base model, leveraging Unsloth and Huggingface's TRL library for accelerated training.

Key Characteristics

  • Base Model: Qwen2.5-Coder-32B-Instruct, indicating a strong foundation in code generation and understanding.
  • Training Efficiency: Fine-tuned 2x faster using Unsloth, which suggests optimizations for resource-efficient deployment or further fine-tuning.
  • Specialized Adaptation: The model name implies a multi-domain adaptation, potentially combining coding capabilities with medical domain knowledge, and a DPO (Direct Preference Optimization) fine-tuning approach.

Potential Use Cases

  • Code-related tasks: Leveraging its Coder base for code generation, completion, or debugging.
  • Medical text processing: Potentially assisting with tasks involving medical documentation, information extraction, or question answering within the medical domain.
  • Research and experimentation: As a specialized DPO-tuned model, it could be valuable for exploring performance in combined coding and medical contexts.