AIPlans/tinyllama-1.1b-dpo-pku-saferlhf_2
AIPlans/tinyllama-1.1b-dpo-pku-saferlhf_2 is a 1.1-billion-parameter language model fine-tuned from TinyLlama/TinyLlama-1.1B-Chat-v1.0 with DPO (Direct Preference Optimization). Its name points to PKU-SafeRLHF, which suggests preference training aimed at safety and alignment. It is designed for general language generation tasks where a compact yet aligned model is beneficial.
Model Overview
This model builds on TinyLlama/TinyLlama-1.1B-Chat-v1.0 and keeps its 1.1 billion parameters. It was further fine-tuned with Direct Preference Optimization (DPO) with an emphasis on safety. The model name points to PKU-SafeRLHF, but the specific dataset used for this fine-tuning is not detailed, so the safety emphasis is inferred from the name rather than documented.
Training Details
The model was trained for 1 epoch with a learning rate of 5e-06 and a total batch size of 16 (train_batch_size=4 with gradient_accumulation_steps=4). The optimizer was Adam with standard betas and epsilon, paired with a cosine learning rate scheduler and a warmup ratio of 0.1. During training, the rewards/accuracies evaluation metric improved to 0.8000, and the final validation loss was 0.4486.
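The batch-size arithmetic and the shape of the learning rate schedule can be sketched in plain Python. This is an illustrative sketch, not the training script: it assumes a single device (which is what makes 4 x 4 equal the reported total of 16), a linear warmup followed by cosine decay to zero, and a hypothetical TOTAL_STEPS value, since the real step count is not documented.

```python
import math

# Values stated in the training details.
LEARNING_RATE = 5e-06
TRAIN_BATCH_SIZE = 4
GRADIENT_ACCUMULATION_STEPS = 4
WARMUP_RATIO = 0.1

# Assumption: one device, so that 4 * 4 matches the reported total of 16.
NUM_DEVICES = 1
# Hypothetical step count, chosen only for illustration.
TOTAL_STEPS = 1000


def effective_batch_size(per_device, accumulation_steps, num_devices):
    """Number of examples contributing to each optimizer update."""
    return per_device * accumulation_steps * num_devices


def cosine_lr_with_warmup(step, total_steps, peak_lr, warmup_ratio):
    """Linear warmup to peak_lr, then cosine decay to zero (assumed shape)."""
    warmup_steps = int(total_steps * warmup_ratio)
    if step < warmup_steps:
        return peak_lr * step / max(1, warmup_steps)
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return peak_lr * 0.5 * (1.0 + math.cos(math.pi * progress))


print(effective_batch_size(TRAIN_BATCH_SIZE, GRADIENT_ACCUMULATION_STEPS, NUM_DEVICES))
for step in (0, 50, 100, 550, 1000):
    lr = cosine_lr_with_warmup(step, TOTAL_STEPS, LEARNING_RATE, WARMUP_RATIO)
    print(step, lr)
```

With a warmup ratio of 0.1, the first tenth of the steps ramp the learning rate up to 5e-06, and the remaining steps decay it along a half cosine.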
Key Characteristics
- Compact Size: At 1.1 billion parameters, it offers a lightweight solution for deployment.
- DPO Fine-tuning: Leverages Direct Preference Optimization for improved alignment and response quality.
- Safety Focus: The "saferlhf" in its name indicates training aimed at safer responses, although the exact data and methods are not documented.
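To make the DPO objective and the rewards/accuracies metric concrete, here is a minimal sketch of the standard DPO loss in plain Python. It is not this model's training code: the log-probabilities are toy numbers, and beta=0.1 is an assumed value because the model card does not report it. The accuracy is the fraction of preference pairs in which the chosen response receives a higher implicit reward than the rejected one.

```python
import math


def log_sigmoid(x):
    """Numerically stable log(sigmoid(x))."""
    if x >= 0:
        return -math.log1p(math.exp(-x))
    return x - math.log1p(math.exp(x))


def dpo_loss_and_accuracy(policy_chosen, policy_rejected,
                          ref_chosen, ref_rejected, beta=0.1):
    """Mean DPO loss and reward accuracy over a batch of preference pairs.

    Each argument is a list of summed log-probabilities of a response under
    the policy or the frozen reference model. beta is an assumed value.
    """
    losses = []
    correct = 0
    for pc, pr, rc, rr in zip(policy_chosen, policy_rejected,
                              ref_chosen, ref_rejected):
        chosen_reward = beta * (pc - rc)      # implicit reward, chosen response
        rejected_reward = beta * (pr - rr)    # implicit reward, rejected response
        losses.append(-log_sigmoid(chosen_reward - rejected_reward))
        if chosen_reward > rejected_reward:
            correct += 1
    n = len(losses)
    return sum(losses) / n, correct / n


# Toy log-probabilities for five preference pairs (illustrative only).
policy_chosen = [-10.0, -12.0, -8.0, -15.0, -9.0]
policy_rejected = [-14.0, -13.0, -12.0, -14.0, -11.0]
ref_chosen = [-11.0] * 5
ref_rejected = [-11.0] * 5

loss, accuracy = dpo_loss_and_accuracy(
    policy_chosen, policy_rejected, ref_chosen, ref_rejected
)
print(loss, accuracy)
```

In this toy batch the policy prefers the chosen response in four of the five pairs, which is what an accuracy of 0.8 means for this metric.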
Potential Use Cases
This model is suitable for applications requiring a small, efficient language model with enhanced safety characteristics, such as:
- Lightweight chatbots or conversational agents.
- Content generation where safety and alignment are priorities.
- Edge device deployment or resource-constrained environments.
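For resource-constrained deployment, a rough estimate of weight memory follows directly from the parameter count. The sketch below covers weights only and ignores activations, the KV cache, and runtime overhead. The listed precisions are assumptions for illustration; the model card does not say which formats are published.

```python
# Parameter count stated in the model card.
PARAMETER_COUNT = 1.1e9

# Assumed storage precisions, for illustration only.
BYTES_PER_PARAMETER = {
    "float32": 4.0,
    "float16": 2.0,
    "int8": 1.0,
    "4-bit": 0.5,
}


def weight_memory_gb(parameter_count, bytes_per_parameter):
    """Approximate weight storage in gigabytes (1 GB = 1e9 bytes)."""
    return parameter_count * bytes_per_parameter / 1e9


for precision, size in BYTES_PER_PARAMETER.items():
    print(f"{precision}: about {weight_memory_gb(PARAMETER_COUNT, size):.2f} GB")
```

Actual memory use at inference time will be higher than these figures, since they exclude everything except the stored weights.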