CharlesLi/llama_2_rlhf_safe_llama_3_8B_reflect_500_full
The CharlesLi/llama_2_rlhf_safe_llama_3_8B_reflect_500_full model is a 7-billion-parameter language model fine-tuned from Meta's Llama-2-7b-chat-hf. It was trained with RLHF techniques with a focus on safety and reflection, reaching a loss of 0.8959 on the evaluation set, and is intended for applications that require a Llama-2-based model with enhanced safety characteristics.
Overview
This model, llama_2_rlhf_safe_llama_3_8B_reflect_500_full, is a fine-tuned variant of the meta-llama/Llama-2-7b-chat-hf base model. It incorporates Reinforcement Learning from Human Feedback (RLHF) and reflection techniques, suggesting an emphasis on generating safer, more considered responses. The model has 7 billion parameters and was trained with a context length of 4096 tokens.
Key Training Details
- Base Model: meta-llama/Llama-2-7b-chat-hf
- Evaluation Loss: 0.8959 on the evaluation set
- Hyperparameters:
  - Learning Rate: 2e-05
  - Optimizer: Adam with betas=(0.9, 0.999) and epsilon=1e-08
  - Scheduler: Cosine with 0.1 warmup ratio
  - Epochs: 1
  - Batch Size: 32 (total train batch size)
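The reported hyperparameters match the shape of a standard Hugging Face `TrainingArguments` configuration. As a hedged sketch, they can be written out as keyword arguments in that style; the argument names follow the `transformers` API, and the per-device/accumulation split of the total batch size is illustrative, not taken from the card:

```python
# Hedged sketch: the card's hyperparameters expressed as the keyword
# arguments one would pass to transformers.TrainingArguments. The split of
# the total batch size into per-device batch and accumulation steps is an
# assumption for illustration.
training_args = {
    "learning_rate": 2e-05,
    "adam_beta1": 0.9,
    "adam_beta2": 0.999,
    "adam_epsilon": 1e-08,
    "lr_scheduler_type": "cosine",
    "warmup_ratio": 0.1,
    "num_train_epochs": 1,
    # Total train batch size of 32, e.g. 4 per device x 8 accumulation steps.
    "per_device_train_batch_size": 4,
    "gradient_accumulation_steps": 8,
}

# The derived total batch size should match the reported value of 32.
total_batch_size = (
    training_args["per_device_train_batch_size"]
    * training_args["gradient_accumulation_steps"]
)
```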
Intended Use Cases
This model is suitable for applications that require a Llama-2-based language model with an explicit focus on safety and reflective capabilities, which may make it more robust against generating undesirable content than its base model. Developers looking for a fine-tuned Llama-2 variant with these enhancements may find it useful.
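Since the card describes a Llama-2-chat fine-tune hosted on the Hugging Face Hub, it can presumably be loaded with the standard `transformers` auto classes. The sketch below is a minimal, hedged example: the repo id comes from this card, the single-turn `[INST] ... [/INST]` prompt format is the minimal Llama-2 chat convention (no system prompt), and the generation settings are illustrative defaults rather than values from the card:

```python
# Hedged sketch of loading this checkpoint with Hugging Face transformers.
# Generation settings are illustrative defaults, not from the model card.

MODEL_ID = "CharlesLi/llama_2_rlhf_safe_llama_3_8B_reflect_500_full"


def build_prompt(user_message: str) -> str:
    """Wrap a single-turn user message in the minimal Llama-2 chat
    format (no system prompt)."""
    return f"<s>[INST] {user_message} [/INST]"


def generate(user_message: str, max_new_tokens: int = 256) -> str:
    """Download the checkpoint (on first call) and generate a reply."""
    # Imports kept local so build_prompt works without transformers installed.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(
        MODEL_ID, torch_dtype="auto", device_map="auto"
    )
    inputs = tokenizer(build_prompt(user_message), return_tensors="pt").to(model.device)
    output = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(output[0], skip_special_tokens=True)
```

Calling e.g. `generate("How can I recognize a phishing email?")` fetches the model weights from the Hub on first use, so a GPU with sufficient memory (or `device_map="auto"` offloading) is advisable for a model of this size.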