CharlesLi/llama_2_rlhf_safe_4o_reflect_100_full

  • Task: Text generation
  • Model Size: 7B
  • Quantization: FP8
  • Context Length: 4k
  • Published: Jan 13, 2025
  • License: llama2
  • Architecture: Transformer (open weights)

CharlesLi/llama_2_rlhf_safe_4o_reflect_100_full is a 7-billion-parameter language model fine-tuned from Meta's Llama-2-7b-chat-hf. It was adapted on a generator dataset with a focus on safety and reflection capabilities, and is intended for applications that require a Llama-2-based model with safety characteristics enhanced through RLHF.


Model Overview

This model, llama_2_rlhf_safe_4o_reflect_100_full, is a fine-tuned variant of the Meta Llama-2-7b-chat-hf base model. It has 7 billion parameters and was trained with a context length of 4096 tokens. The fine-tuning process utilized a specific "generator dataset" and incorporated Reinforcement Learning from Human Feedback (RLHF) to enhance safety and reflective qualities.
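A minimal inference sketch is shown below. It assumes the standard Hugging Face transformers API and the `[INST] ... [/INST]` chat prompt format inherited from the Llama-2-7b-chat-hf base model; the repository name is taken from this card, while the dtype, prompt, and generation settings are illustrative defaults rather than a confirmed usage recipe.

```python
# Minimal inference sketch (assumes the transformers library and the
# Llama-2 chat prompt format inherited from Llama-2-7b-chat-hf).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "CharlesLi/llama_2_rlhf_safe_4o_reflect_100_full"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,  # illustrative; the card lists an FP8 quant
    device_map="auto",
)

# Llama-2 chat models expect the [INST] ... [/INST] wrapper.
prompt = "[INST] How should I store household chemicals safely? [/INST]"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

# The context length is 4096 tokens, so keep prompt + generation within that budget.
outputs = model.generate(**inputs, max_new_tokens=256, do_sample=False)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```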

Key Characteristics

  • Base Model: Meta Llama-2-7b-chat-hf
  • Parameter Count: 7 billion
  • Fine-tuning Objective: Enhanced safety and reflection through RLHF
  • Training Data: Generator dataset
  • Evaluation Loss: 2.0096 on the evaluation set

Training Details

The model was trained with a learning rate of 2e-05 and an effective batch size of 32 (4 GPUs × a per-device batch size of 4 × 2 gradient accumulation steps), using a cosine learning rate scheduler with a warmup ratio of 0.1 over a single epoch. The optimizer was Adam with standard betas and epsilon. This configuration adapts the Llama-2 architecture for safety-oriented applications.
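These hyperparameters map directly onto Hugging Face `TrainingArguments`. The sketch below reconstructs that configuration under the assumption that training went through the transformers `Trainer` (or a TRL wrapper over it), which this card does not confirm; the per-device batch size of 4 is inferred from 32 / (4 GPUs × 2 accumulation steps), and the beta/epsilon values are the standard Adam defaults the card alludes to.

```python
# Hypothetical reconstruction of the reported hyperparameters using
# transformers.TrainingArguments; the actual training stack is not
# specified on this card.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="llama_2_rlhf_safe_4o_reflect_100_full",
    learning_rate=2e-5,
    per_device_train_batch_size=4,   # 4 GPUs x 4 x 2 accum steps = 32 total
    gradient_accumulation_steps=2,
    num_train_epochs=1,
    lr_scheduler_type="cosine",
    warmup_ratio=0.1,
    optim="adamw_torch",             # "Adam with standard betas and epsilon"
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    logging_steps=10,
)
```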