activeDap/Qwen3-1.7B_ultrafeedback_chosen
The activeDap/Qwen3-1.7B_ultrafeedback_chosen model is a 1.7-billion-parameter causal language model from the Qwen3 family, fine-tuned by activeDap on the ultrafeedback_chosen dataset to improve the quality and helpfulness of its responses. It is intended for conversational AI and instruction-following tasks.
Model Overview
This model, activeDap/Qwen3-1.7B_ultrafeedback_chosen, is a specialized version of the Qwen3-1.7B base model, developed by activeDap. It underwent Supervised Fine-Tuning (SFT) on the activeDap/ultrafeedback_chosen dataset, which pairs prompts with highly rated ("chosen") responses to improve response quality and alignment. The fine-tuning run comprised 816 steps and reached a final training loss of 1.5617.
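To make the training setup concrete, the sketch below shows the prompt-completion record shape that TRL-style SFT with assistant-only loss typically consumes. The actual schema of activeDap/ultrafeedback_chosen is not reproduced here, so the field names and the `make_sft_example` helper are illustrative assumptions, not the published format.

```python
def make_sft_example(prompt: str, chosen_response: str) -> dict:
    """Pair a user prompt with its highest-rated ("chosen") response.

    Assumed record shape: a prompt-completion pair in chat-message form,
    as commonly used for SFT with assistant-only loss.
    """
    return {
        "prompt": [{"role": "user", "content": prompt}],
        "completion": [{"role": "assistant", "content": chosen_response}],
    }

example = make_sft_example(
    "What is supervised fine-tuning?",
    "Supervised fine-tuning trains a model on curated prompt-response pairs.",
)
print(example["completion"][0]["role"])  # assistant
```

With assistant-only loss, only the tokens in the `completion` turn contribute to the training objective; the prompt tokens are masked out.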
Key Capabilities
- Enhanced Response Quality: Fine-tuned on a high-quality feedback dataset to generate more helpful and aligned responses.
- Instruction Following: Optimized for understanding and executing user instructions effectively.
- Efficient Performance: Based on the 1.7 billion parameter Qwen3 architecture, offering a balance between performance and computational efficiency.
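The capabilities above can be exercised through the standard Hugging Face `transformers` chat workflow. This is a minimal sketch, assuming `transformers` and `torch` are installed and the checkpoint is reachable on the Hub under this repo id; it is not an official usage snippet from the model authors.

```python
MODEL_ID = "activeDap/Qwen3-1.7B_ultrafeedback_chosen"

def build_chat(user_message: str) -> list:
    """Wrap a single user turn in the chat-message format Qwen3 tokenizers expect."""
    return [{"role": "user", "content": user_message}]

def generate(user_message: str, max_new_tokens: int = 256) -> str:
    """Generate an assistant response for one user message."""
    # Heavy dependencies are imported lazily so the helpers above stay importable
    # without transformers/torch installed.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(MODEL_ID, torch_dtype="auto")
    # Render the conversation with the model's chat template, then tokenize.
    prompt = tokenizer.apply_chat_template(
        build_chat(user_message), tokenize=False, add_generation_prompt=True
    )
    inputs = tokenizer(prompt, return_tensors="pt")
    outputs = model.generate(**inputs, max_new_tokens=max_new_tokens)
    # Decode only the newly generated tokens, not the echoed prompt.
    return tokenizer.decode(
        outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True
    )

# Example usage (downloads the checkpoint; needs a GPU or ample RAM):
#   print(generate("Explain supervised fine-tuning in two sentences."))
```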
Training Details
The model was trained for one epoch with a per-device batch size of 16, giving an effective batch size of 64 across 4 GPUs. It used the AdamW optimizer with a learning rate of 2e-05 and a cosine learning rate scheduler, with mixed precision (BF16). Training used a prompt-completion format with assistant-only loss, so gradients were computed only on assistant-response tokens and the model learns to generate relevant, coherent assistant replies. The maximum sequence length during training was 512 tokens.
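The hyperparameters above can be collected into one place as a plain dictionary. The key names follow common TRL/`transformers` argument conventions, and the gradient-accumulation value is an assumption inferred from the reported batch sizes; the exact config file used by activeDap is not published.

```python
# Reported hyperparameters, named in the TRL/transformers convention (illustrative).
sft_hyperparams = {
    "num_train_epochs": 1.0,
    "per_device_train_batch_size": 16,
    "num_gpus": 4,
    "gradient_accumulation_steps": 1,  # assumed: 16 * 4 * 1 = 64 total
    "optimizer": "adamw",
    "learning_rate": 2e-5,
    "lr_scheduler_type": "cosine",
    "bf16": True,
    "max_length": 512,
    "assistant_only_loss": True,  # loss computed only on assistant tokens
}

def effective_batch_size(cfg: dict) -> int:
    """Total examples consumed per optimizer step across all devices."""
    return (
        cfg["per_device_train_batch_size"]
        * cfg["num_gpus"]
        * cfg["gradient_accumulation_steps"]
    )

print(effective_batch_size(sft_hyperparams))  # 64, matching the reported total
```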
Good For
- Developing conversational AI agents that require high-quality, aligned outputs.
- Applications needing robust instruction-following capabilities.
- Scenarios where a smaller, fine-tuned model can provide efficient and effective language generation.