CharlesLi/llama_2_unsafe_llama_2

TEXT GENERATION · Concurrency Cost: 1 · Model Size: 7B · Quant: FP8 · Context Length: 4k · Published: Dec 31, 2024 · License: llama2 · Architecture: Transformer · Open Weights

CharlesLi/llama_2_unsafe_llama_2 is a 7 billion parameter causal language model fine-tuned from Meta's Llama-2-7b-chat-hf. The fine-tuning dataset is unspecified; training reached a validation loss of 1.1038. The model inherits the 4096-token context length of its base and is suited to general language generation tasks.


Model Overview

CharlesLi/llama_2_unsafe_llama_2 is a 7 billion parameter language model derived from Meta's Llama-2-7b-chat-hf. It has been fine-tuned, but the dataset used is not documented in the available information. The model reaches a validation loss of 1.1038 after 50 training steps.
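
Since the model shares the Llama-2 architecture, it can presumably be loaded with the standard Hugging Face transformers API. The sketch below assumes a conventional repository layout; the prompt, dtype, and generation settings are illustrative choices, not values from the card:

```python
# Minimal sketch: loading and sampling from the model with transformers.
# The repo id comes from the card; everything else is standard Llama-2 usage.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "CharlesLi/llama_2_unsafe_llama_2"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,  # a 7B model in fp16 fits on a single 24 GB GPU
    device_map="auto",
)

prompt = "Explain the difference between a list and a tuple in Python."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=128, do_sample=True, temperature=0.7)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```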

Training Details

Training used a learning rate of 0.0002, a per-device batch size of 4 (an effective batch size of 8 with gradient accumulation), and the Adam optimizer, with a cosine learning-rate scheduler and a warmup ratio of 0.1. Over the 50 training steps, training loss fell steadily from 2.6444 to 0.7973, while validation loss settled around 1.1038.
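
For reference, these hyperparameters map onto a Hugging Face `TrainingArguments` configuration roughly as follows. This is a sketch only: the card does not state which training framework was produced the checkpoint, and any value not reported (output directory, logging cadence, exact Adam variant) is an assumption set to a common default:

```python
# Hypothetical reconstruction of the reported hyperparameters with
# transformers.TrainingArguments; dataset and Trainer setup are omitted.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="llama_2_unsafe_llama_2",  # assumed; not stated on the card
    learning_rate=2e-4,                   # reported: 0.0002
    per_device_train_batch_size=4,        # reported batch size
    gradient_accumulation_steps=2,        # 4 x 2 = effective batch size of 8
    max_steps=50,                         # reported: 50 training steps
    lr_scheduler_type="cosine",           # reported scheduler
    warmup_ratio=0.1,                     # reported warmup ratio
    optim="adamw_torch",                  # card says "Adam"; exact variant unknown
    logging_steps=10,                     # assumed logging cadence
)
```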

Intended Uses & Limitations

Specific intended uses and limitations are not defined on the card. As a Llama-2-chat derivative, the model should handle general natural language generation tasks, but without details on the fine-tuning data, its specialized capabilities, and any behavioral changes relative to the base model, remain unknown. Users should evaluate the model on their own data before relying on it for any specific application, for example as sketched below.
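
As one concrete starting point, a quick perplexity comparison against the base model on domain-representative text can flag gross regressions. This is a sketch under assumptions: the sample text is a placeholder, and the base repository (meta-llama/Llama-2-7b-chat-hf) is gated on the Hub, so access must be requested first:

```python
# Minimal sketch: comparing perplexity of the fine-tune and its base model
# on user-supplied text. Replace `sample` with text from your target domain.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

def perplexity(model_id: str, text: str) -> float:
    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(
        model_id, torch_dtype=torch.float16, device_map="auto"
    )
    inputs = tokenizer(text, return_tensors="pt").to(model.device)
    with torch.no_grad():
        # Labels equal to the input ids yield the standard causal LM loss.
        loss = model(**inputs, labels=inputs["input_ids"]).loss
    return torch.exp(loss).item()

sample = "Text drawn from your target domain goes here."
for repo in ("meta-llama/Llama-2-7b-chat-hf", "CharlesLi/llama_2_unsafe_llama_2"):
    print(repo, perplexity(repo, sample))
```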