Model Overview
rashadaziz/Qwen2.5-7B-MLC is a 7.6-billion-parameter language model derived from Qwen/Qwen2.5-7B-Instruct. It was fine-tuned on the dpo-pku-saferlhf-alpaca3-8b-multilin dataset, indicating a focus on safety, alignment, and responsible AI behavior. The model supports a context length of 32,768 tokens, allowing it to process and generate longer, more coherent texts.
Key Characteristics
- Base Model: Fine-tuned from Qwen/Qwen2.5-7B-Instruct.
- Parameter Count: 7.6 billion parameters.
- Context Length: 32768 tokens.
- Fine-tuning Focus: Emphasizes safety and alignment through training on the dpo-pku-saferlhf-alpaca3-8b-multilin dataset.
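The "-MLC" suffix suggests the weights are packaged for the MLC LLM runtime. Below is a minimal, hedged sketch of how such a model is typically served with MLC LLM's Python API; the `HF://` model path, the prompt, and the availability of the `mlc-llm` package are assumptions based on MLC LLM conventions, not details stated in this model card.

```python
# Hypothetical sketch: serving an MLC-packaged model with the MLC LLM
# Python API. The "HF://" path and API usage follow MLC LLM conventions
# and are assumptions, not details from this model card.
MODEL = "HF://rashadaziz/Qwen2.5-7B-MLC"

# OpenAI-style chat message format used by the MLC LLM engine.
messages = [
    {"role": "user", "content": "Summarize key safety guidelines for AI assistants."},
]

def run_chat() -> str:
    """Stream a chat completion from the model (requires `pip install mlc-llm` and a supported GPU)."""
    from mlc_llm import MLCEngine  # imported lazily so the sketch stays importable

    engine = MLCEngine(MODEL)  # downloads and loads the compiled model
    reply = ""
    for chunk in engine.chat.completions.create(messages=messages, stream=True):
        for choice in chunk.choices:
            reply += choice.delta.content or ""
    engine.terminate()
    return reply
```

The engine call is wrapped in a function because it requires the model weights and a compatible runtime; the message structure itself is plain Python and can be inspected without them.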
Training Details
The model was trained with a learning rate of 6e-07 and an effective batch size of 32 (a train_batch_size of 1 × gradient_accumulation_steps of 4 × 8 GPUs). Training ran for 3 epochs with the AdamW optimizer and a cosine learning rate scheduler using a 0.1 warmup ratio.
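As a sanity check, the reported total batch size of 32 follows directly from the per-device batch size, the gradient accumulation steps, and the GPU count:

```python
# Reconstructing the effective batch size from the reported training settings.
train_batch_size = 1             # per-device batch size
gradient_accumulation_steps = 4  # optimizer steps are taken every 4 batches
num_gpus = 8                     # data-parallel workers

effective_batch_size = train_batch_size * gradient_accumulation_steps * num_gpus
print(effective_batch_size)  # 32, matching the reported total batch size
```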
Potential Use Cases
This model is particularly suited to applications where generating safe, aligned, and responsible language is critical. Its fine-tuning on a safety-focused preference dataset suggests utility in content moderation, ethical AI responses, and adherence to specific safety guidelines.