XavierCoulon/qwen3-1.7b-chsa-dpo-merged
XavierCoulon/qwen3-1.7b-chsa-dpo-merged is a 2 billion parameter Qwen3 model developed by XavierCoulon, fine-tuned using Unsloth and Huggingface's TRL library. This model was trained 2x faster, offering efficient performance for its size. It is based on the XavierCoulon/qwen3-1.7b-chsa-sft-lora-merged model and is released under the Apache-2.0 license.
Loading preview...
Model Overview
XavierCoulon/qwen3-1.7b-chsa-dpo-merged is a 2 billion parameter Qwen3 model developed by XavierCoulon. This model is a fine-tuned version, building upon the XavierCoulon/qwen3-1.7b-chsa-sft-lora-merged base.
Key Characteristics
- Efficient Training: The model was trained significantly faster, achieving 2x speed improvements by leveraging Unsloth and Huggingface's TRL library.
- Parameter Count: With 2 billion parameters, it offers a balance between performance and computational efficiency.
- License: Released under the permissive Apache-2.0 license, allowing for broad use and distribution.
Potential Use Cases
This model is suitable for applications requiring a compact yet capable language model, especially where training efficiency is a priority. Its Qwen3 architecture and DPO fine-tuning suggest potential for improved instruction following and conversational abilities compared to its base model.