Overview
This model, llama_2_unsafe_helpful, is a fine-tuned version of Meta's Llama-2-7b-chat-hf, developed by CharlesLi. It is based on the Llama 2 architecture, a 7-billion-parameter causal language model, and was fine-tuned for 50 training steps.
Training Details
The fine-tuning process utilized specific hyperparameters:
- Learning Rate: 0.0002
- Batch Size: 4 (train), 4 (eval)
- Optimizer: Adam with betas=(0.9, 0.999) and epsilon=1e-08
- Scheduler: Cosine with a warmup ratio of 0.1
- Distributed Training: Multi-GPU setup with 2 devices and 2 gradient accumulation steps, resulting in a total train batch size of 16.
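The hyperparameters above fully determine the effective batch size and the shape of the learning-rate schedule. A minimal sketch in plain Python, assuming the standard linear-warmup-then-cosine-decay formula (as used by Hugging Face's `get_cosine_schedule_with_warmup`; the numeric values are taken from the list above):

```python
import math

# Values from the training configuration above.
per_device_batch = 4
num_devices = 2
grad_accum_steps = 2
effective_batch = per_device_batch * num_devices * grad_accum_steps  # 4 * 2 * 2 = 16

total_steps = 50
warmup_steps = int(0.1 * total_steps)  # warmup ratio 0.1 -> 5 steps
base_lr = 2e-4

def lr_at(step: int) -> float:
    """Learning rate at a given optimizer step (linear warmup, cosine decay)."""
    if step < warmup_steps:
        # Linear ramp from 0 to base_lr over the warmup steps.
        return base_lr * step / max(1, warmup_steps)
    # Cosine decay from base_lr down to 0 over the remaining steps.
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return base_lr * 0.5 * (1.0 + math.cos(math.pi * progress))
```

Under this schedule the learning rate peaks at 2e-4 after step 5 and decays to zero by step 50; this is a sketch of the conventional formula, not code taken from the training run itself.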
Performance
During training, the model achieved a final validation loss of 1.2229. The training loss decreased from 2.5966 to 0.8049 over 50 steps, while the validation loss stabilized around 1.22.
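For intuition, a cross-entropy loss in nats maps to a perplexity of exp(loss). Assuming the reported losses are the usual mean token-level cross-entropy (the transformers default), the final values translate to:

```python
import math

# Final loss values reported above.
train_loss_final = 0.8049
val_loss_final = 1.2229

# Perplexity = exp(loss), assuming mean cross-entropy in nats.
train_ppl = math.exp(train_loss_final)  # roughly 2.24
val_ppl = math.exp(val_loss_final)      # roughly 3.40
```

The gap between the final training and validation losses is consistent with some degree of overfitting across the 50 steps, which is unsurprising given the short run.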
Limitations
The model card indicates that more information is needed regarding the model's intended uses, specific limitations, and the datasets used for fine-tuning and evaluation. Users should exercise caution and conduct further evaluation before deploying this model in production, especially since the card makes no explicit safety or helpfulness claims beyond what the model's name implies.