CEIA-RL/qwen3-4b-dw-lr-hf-dpo
CEIA-RL/qwen3-4b-dw-lr-hf-dpo is a fine-tuned Qwen3-4B model developed by CEIA-RL, based on cemig-temp/qwen3-4b-dw-lr. This model is specifically optimized for safety alignment in Portuguese (pt-BR) through Online DPO training on the CEIA-RL/Nemotron-SFT-Safety-pt-BR-Cleaned dataset. It is designed for generating safer and more aligned responses in Portuguese conversational AI applications.
Loading preview...
Model Overview
CEIA-RL/qwen3-4b-dw-lr-hf-dpo is a specialized language model developed by CEIA-RL, fine-tuned from the cemig-temp/qwen3-4b-dw-lr base model. Its primary differentiation lies in its safety alignment for the Portuguese language (pt-BR).
Key Capabilities
- Safety Alignment: The model has undergone specific training to enhance its safety characteristics, particularly for Portuguese content.
- Portuguese Language Focus: Optimized for generating responses in Brazilian Portuguese, leveraging the
CEIA-RL/Nemotron-SFT-Safety-pt-BR-Cleaneddataset. - Online DPO Training: Utilizes the Online DPO (Direct Language Model Alignment from Online AI Feedback) method, a technique for aligning language models with human preferences.
Training Details
This model was trained using the TRL library, a framework for Transformer Reinforcement Learning. The training specifically employed the Online DPO method, as detailed in the paper "Direct Language Model Alignment from Online AI Feedback" (arXiv:2402.04792).
Use Cases
This model is particularly well-suited for applications requiring safe and aligned text generation in Portuguese, such as:
- Chatbots and conversational AI systems targeting Portuguese-speaking users.
- Content moderation tools for Portuguese text.
- Generating responses where safety and alignment are critical considerations.