CharlesLi/llama_2_rlhf_safe_llama_3_70B_reflect_100_full
The CharlesLi/llama_2_rlhf_safe_llama_3_70B_reflect_100_full model is a 7-billion-parameter language model fine-tuned from Meta's Llama-2-7b-chat-hf. It was fine-tuned on a generator dataset and achieves a loss of 1.4599 on its evaluation set. It is intended for applications that need a Llama 2-based chat model with this fine-tuning applied.
Model Overview
This model, llama_2_rlhf_safe_llama_3_70B_reflect_100_full, is a fine-tuned variant of Meta's Llama-2-7b-chat-hf, a 7-billion-parameter causal language model. Fine-tuning was performed on a generator dataset and yielded an evaluation loss of 1.4599.
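As a minimal sketch, assuming the repository follows the standard Hugging Face Hub layout for Llama 2 checkpoints, the model can be loaded with the `transformers` library:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "CharlesLi/llama_2_rlhf_safe_llama_3_70B_reflect_100_full"

# Load the tokenizer and model weights from the Hub.
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    device_map="auto",   # place weights on available GPU(s); requires `accelerate`
    torch_dtype="auto",  # use the dtype stored in the checkpoint
)
```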
Key Training Details
- Base Model: meta-llama/Llama-2-7b-chat-hf
- Learning Rate: 2e-05
- Optimizer: Adam with betas=(0.9, 0.999) and epsilon=1e-08
- Epochs: 1
- Batch Size: 4 per device (train and eval), with an effective train batch size of 32 via gradient accumulation (see the configuration sketch below).
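The hyperparameters above map onto a `transformers` `TrainingArguments` configuration roughly as follows. This is a reconstruction, not the author's actual training script; in particular, `gradient_accumulation_steps=8` is inferred from the per-device batch size of 4 and the effective batch size of 32, and any setting not listed above (scheduler, warmup, etc.) falls back to library defaults.

```python
from transformers import TrainingArguments

# Hypothetical reconstruction of the reported hyperparameters.
training_args = TrainingArguments(
    output_dir="llama_2_rlhf_safe_llama_3_70B_reflect_100_full",
    learning_rate=2e-05,
    num_train_epochs=1,
    per_device_train_batch_size=4,
    per_device_eval_batch_size=4,
    gradient_accumulation_steps=8,  # 4 x 8 = effective batch size of 32 (inferred)
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-08,
)
```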
Intended Use Cases
Given its fine-tuning on a generator dataset, this model is likely suited to content-generation tasks or applications that require particular response patterns, building on the foundational capabilities of the Llama 2 architecture. Specific intended uses and limitations are not documented for this checkpoint.
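Since the base model is Llama-2-7b-chat-hf, prompts presumably follow the Llama 2 chat template (assuming the fine-tuned repository retains the base model's template). A hedged generation example, continuing from the loading snippet above; the prompt and decoding parameters are illustrative choices, not documented settings:

```python
messages = [
    {"role": "user", "content": "Summarize the benefits of unit testing in two sentences."},
]

# apply_chat_template formats the conversation with Llama 2's [INST] ... [/INST] markup.
inputs = tokenizer.apply_chat_template(messages, return_tensors="pt").to(model.device)

outputs = model.generate(inputs, max_new_tokens=128, do_sample=True, temperature=0.7)
# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```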