Model Overview
CharlesLi/llama_2_unsafe_llama_2 is a 7-billion-parameter language model derived from Meta's Llama-2-7b-chat-hf. It has been fine-tuned, although the dataset used for fine-tuning is not described in the available information. The model reaches a validation loss of 1.1038 after 50 training steps.
Training Details
The training procedure used a learning rate of 0.0002, a per-device batch size of 4 (effective batch size 8 with gradient accumulation), and the Adam optimizer. A cosine learning rate scheduler with a warmup ratio of 0.1 was employed. Over the 50 training steps, the training loss decreased from 2.6444 to 0.7973, while the validation loss settled around 1.1038.
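The schedule described above (linear warmup over the first 10% of steps, then cosine decay) can be sketched in pure Python. This is a minimal illustration of the standard linear-warmup-plus-cosine shape (as implemented, for example, by transformers' `get_cosine_schedule_with_warmup`); the constant names are illustrative, not taken from the training code:

```python
import math

# Hyperparameters stated in the training details above.
BASE_LR = 2e-4
TOTAL_STEPS = 50
WARMUP_STEPS = int(0.1 * TOTAL_STEPS)  # warmup ratio 0.1 -> 5 steps

def cosine_lr(step: int) -> float:
    """Learning rate at `step`: linear warmup to BASE_LR, then cosine decay to 0."""
    if step < WARMUP_STEPS:
        return BASE_LR * step / WARMUP_STEPS
    progress = (step - WARMUP_STEPS) / (TOTAL_STEPS - WARMUP_STEPS)
    return BASE_LR * 0.5 * (1.0 + math.cos(math.pi * progress))
```

The rate peaks at `BASE_LR` once warmup ends (step 5) and decays smoothly to zero by step 50.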
Intended Uses & Limitations
Specific intended uses and limitations are not explicitly documented. Given its base architecture, the model is broadly suitable for general natural language processing tasks, but without details on the fine-tuning data its specialized capabilities remain unknown. Users should exercise caution and evaluate the model for their specific application before deployment.
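For evaluation, the model can be loaded like any other causal language model on the Hugging Face Hub. The sketch below assumes the checkpoint is published under the id shown and uses the standard transformers API; the `generate` helper and its parameters are illustrative:

```python
MODEL_ID = "CharlesLi/llama_2_unsafe_llama_2"

def generate(prompt: str, max_new_tokens: int = 128) -> str:
    """Generate a completion for `prompt` using the fine-tuned model."""
    # Import deferred so the module can be inspected without transformers installed.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(MODEL_ID)
    inputs = tokenizer(prompt, return_tensors="pt")
    outputs = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(outputs[0], skip_special_tokens=True)
```

Note that a 7B model typically requires a GPU (or substantial RAM with quantization) to run at a usable speed.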