CharlesLi/llama_2_rlhf_safe_llama_3_70B_reflect_100_full
The CharlesLi/llama_2_rlhf_safe_llama_3_70B_reflect_100_full model is a 7-billion-parameter language model fine-tuned from Meta's Llama-2-7b-chat-hf. It was fine-tuned on a generator dataset and achieves a loss of 1.4599 on its evaluation set. It is intended for applications that need a Llama 2-based chat model with this fine-tuning applied.
Model Overview
This model, llama_2_rlhf_safe_llama_3_70B_reflect_100_full, is a fine-tuned variant of Meta's Llama-2-7b-chat-hf, a 7-billion-parameter causal language model. Fine-tuning was performed on a generator dataset and yielded an evaluation loss of 1.4599.
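As a minimal sketch, assuming the repository follows the standard Hugging Face Hub layout for Llama 2 checkpoints, the model can be loaded with the `transformers` library:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "CharlesLi/llama_2_rlhf_safe_llama_3_70B_reflect_100_full"

# Load the tokenizer and model weights from the Hub.
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    device_map="auto",   # place weights on available GPU(s); requires `accelerate`
    torch_dtype="auto",  # use the dtype stored in the checkpoint
)
```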
Key Training Details
- Base Model: meta-llama/Llama-2-7b-chat-hf
- Learning Rate: 2e-05
- Optimizer: Adam with betas=(0.9, 0.999) and epsilon=1e-08
- Epochs: 1
- Batch Size: 4 per device (train and eval), with an effective train batch size of 32 via gradient accumulation (see the configuration sketch below).
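The hyperparameters above map onto a `transformers` `TrainingArguments` configuration roughly as follows. This is a reconstruction, not the author's actual training script; in particular, `gradient_accumulation_steps=8` is inferred from the per-device batch size of 4 and the effective batch size of 32, and any setting not listed above (scheduler, warmup, etc.) falls back to library defaults.

```python
from transformers import TrainingArguments

# Hypothetical reconstruction of the reported hyperparameters.
training_args = TrainingArguments(
    output_dir="llama_2_rlhf_safe_llama_3_70B_reflect_100_full",
    learning_rate=2e-05,
    num_train_epochs=1,
    per_device_train_batch_size=4,
    per_device_eval_batch_size=4,
    gradient_accumulation_steps=8,  # 4 x 8 = effective batch size of 32 (inferred)
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-08,
)
```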
Intended Use Cases
Given its fine-tuning on a generator dataset, this model is likely suited to content-generation tasks or applications that require particular response patterns, building on the foundational capabilities of the Llama 2 architecture. Specific intended uses and limitations are not documented for this checkpoint.
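Since the base model is Llama-2-7b-chat-hf, prompts presumably follow the Llama 2 chat template (assuming the fine-tuned repository retains the base model's template). A hedged generation example, continuing from the loading snippet above; the prompt and decoding parameters are illustrative choices, not documented settings:

```python
messages = [
    {"role": "user", "content": "Summarize the benefits of unit testing in two sentences."},
]

# apply_chat_template formats the conversation with Llama 2's [INST] ... [/INST] markup.
inputs = tokenizer.apply_chat_template(messages, return_tensors="pt").to(model.device)

outputs = model.generate(inputs, max_new_tokens=128, do_sample=True, temperature=0.7)
# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```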