Model Overview
CharlesLi/llama_3_unsafe_per_class_reflect is an 8-billion-parameter language model fine-tuned by CharlesLi from the meta-llama/Llama-3.1-8B-Instruct base model.
Training Details
The model underwent a fine-tuning procedure with the following key hyperparameters:
- Learning Rate: 0.0002
- Batch Size: 4 (train), 4 (eval)
- Gradient Accumulation Steps: 2
- Optimizer: Adam with betas=(0.9, 0.999) and epsilon=1e-08
- LR Scheduler: Cosine with a warmup ratio of 0.1
- Total Training Steps: 30
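For intuition, the Adam settings above can be sketched as a single scalar update step. This is an illustrative, minimal sketch (not the training code used for this model); the parameter, gradient, and moment values in the usage comment are made up for demonstration.

```python
import math

# Optimizer settings from the hyperparameter list above
BETA1, BETA2, EPS = 0.9, 0.999, 1e-8
LR = 2e-4  # learning rate 0.0002

def adam_step(param, grad, m, v, t):
    """One Adam update for a single scalar parameter (illustrative only).

    m, v are the running first/second moment estimates; t is the 1-based
    step count used for bias correction.
    """
    m = BETA1 * m + (1 - BETA1) * grad
    v = BETA2 * v + (1 - BETA2) * grad ** 2
    m_hat = m / (1 - BETA1 ** t)          # bias-corrected first moment
    v_hat = v / (1 - BETA2 ** t)          # bias-corrected second moment
    param -= LR * m_hat / (math.sqrt(v_hat) + EPS)
    return param, m, v

# Hypothetical usage: one step from param=1.0 with gradient 0.5.
# On the first step the bias-corrected update is ~LR * sign(grad).
p, m, v = adam_step(1.0, 0.5, 0.0, 0.0, t=1)
```

Note that with a per-device batch size of 4 and 2 gradient accumulation steps, the effective training batch size is 8.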
During training, the training loss decreased from 2.209 at step 5 to 0.8581 at step 30, and the model reached a final validation loss of 0.9356.
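The cosine scheduler with a 0.1 warmup ratio over 30 steps can be reproduced in a few lines of pure Python. This is a sketch assuming the common Hugging Face convention (linear warmup, then cosine decay to zero), not the exact training code:

```python
import math

BASE_LR = 2e-4       # learning_rate from the hyperparameter list
TOTAL_STEPS = 30
WARMUP_RATIO = 0.1
WARMUP_STEPS = int(WARMUP_RATIO * TOTAL_STEPS)  # 3 warmup steps

def lr_at(step: int) -> float:
    """LR at a given step: linear warmup, then cosine decay to zero."""
    if step < WARMUP_STEPS:
        return BASE_LR * step / max(1, WARMUP_STEPS)
    progress = (step - WARMUP_STEPS) / max(1, TOTAL_STEPS - WARMUP_STEPS)
    return BASE_LR * 0.5 * (1.0 + math.cos(math.pi * progress))

# The LR ramps from 0 to 2e-4 over steps 0-3, then decays back to 0 by step 30.
schedule = [lr_at(s) for s in range(TOTAL_STEPS + 1)]
```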
Intended Use & Limitations
As the original model card notes, specific intended uses and limitations require further information. Developers should note that this is a fine-tuned model with a very short training run (30 steps), which suggests a specialized application rather than a general-purpose instruction-following model. Its performance characteristics depend entirely on the fine-tuning dataset, which the original card lists only as "None".