CharlesLi/llama_2_rlhf_safe_llama_3_8B_reflect_1000_full
The CharlesLi/llama_2_rlhf_safe_llama_3_8B_reflect_1000_full model is a 7-billion-parameter Llama 2-based language model fine-tuned from meta-llama/Llama-2-7b-chat-hf. It was trained on a reflection dataset with the aim of improving safety and alignment following Reinforcement Learning from Human Feedback (RLHF) principles, and is intended for general language generation tasks where safety and adherence to RLHF objectives are prioritized.
Model Overview
This model, llama_2_rlhf_safe_llama_3_8B_reflect_1000_full, is a fine-tuned variant of meta-llama/Llama-2-7b-chat-hf developed by CharlesLi. Starting from the 7-billion-parameter Llama 2 base, it was fine-tuned on a generator dataset following Reinforcement Learning from Human Feedback (RLHF) principles, with a particular focus on a 'reflection' component. Training used a learning rate of 2e-05, a total batch size of 32, and ran for 1 epoch.
Key Training Details
- Base Model: meta-llama/Llama-2-7b-chat-hf
- Parameters: 7 billion
- Training Objective: Enhanced safety and alignment via RLHF with reflection.
- Evaluation Loss: 0.7626
- Hyperparameters: learning rate 2e-05, total batch size 32, 1 epoch; Adam optimizer with a cosine learning rate scheduler and a 0.1 warmup ratio.
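The cosine schedule with a 0.1 warmup ratio mentioned above can be sketched in plain Python. This is a minimal illustration of the schedule shape (linear warmup to the peak rate, then cosine decay to zero), not the exact implementation used during training; the function name and the assumption of decay to zero are illustrative.

```python
import math

def cosine_lr_with_warmup(step, total_steps, peak_lr=2e-05, warmup_ratio=0.1):
    """Linear warmup to peak_lr over the first warmup_ratio of steps,
    then cosine decay from peak_lr down to zero."""
    warmup_steps = int(total_steps * warmup_ratio)
    if step < warmup_steps:
        # Warmup phase: learning rate rises linearly from 0 to peak_lr.
        return peak_lr * step / max(1, warmup_steps)
    # Decay phase: cosine curve from peak_lr (progress=0) to 0 (progress=1).
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return peak_lr * 0.5 * (1.0 + math.cos(math.pi * progress))

# With 1000 total steps and warmup_ratio=0.1, warmup ends at step 100,
# where the schedule reaches its peak of 2e-05.
```

With the hyperparameters reported on this card (lr 2e-05, warmup ratio 0.1), the schedule peaks at the end of warmup and decays smoothly for the remainder of the single epoch.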
Intended Use Cases
This model suits applications that need a Llama 2-based language model with improved safety and alignment, properties targeted by its RLHF- and reflection-based fine-tuning. It can be applied to a range of generative tasks where controlled, safer outputs are desired.
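The card does not specify a prompt template, but since the base model is meta-llama/Llama-2-7b-chat-hf, the standard Llama 2 chat format presumably applies. Below is a sketch of that format; the helper function is hypothetical, and assuming the checkpoint kept the base model's template, the resulting string would be passed to the tokenizer (which prepends the BOS token) for generation.

```python
def build_llama2_prompt(user_message, system_message=None):
    """Format a single-turn prompt in the standard Llama 2 chat style:
    an optional <<SYS>> block followed by the user turn inside [INST] tags."""
    if system_message is not None:
        return (f"[INST] <<SYS>>\n{system_message}\n<</SYS>>\n\n"
                f"{user_message} [/INST]")
    return f"[INST] {user_message} [/INST]"

# Example prompt for this safety-focused checkpoint.
prompt = build_llama2_prompt(
    "Summarize the benefits of RLHF in one sentence.",
    system_message="You are a helpful, safe assistant.",
)
```

In practice, `tokenizer.apply_chat_template` from the transformers library handles this formatting automatically when the checkpoint ships a chat template.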