CharlesLi/llama_2_rlhf_safe_4o_default_500_full

Text Generation · Model Size: 7B · Quantization: FP8 · Context Length: 4k · Published: Jan 13, 2025 · License: llama2 · Architecture: Transformer · Open Weights

CharlesLi/llama_2_rlhf_safe_4o_default_500_full is a 7-billion-parameter causal language model fine-tuned from meta-llama/Llama-2-7b-chat-hf. It was fine-tuned on a generator dataset and reached a final training loss of 1.9297. It is intended for applications that need a Llama 2 variant tuned through RLHF for safety and specific generative tasks.

Overview

This model, llama_2_rlhf_safe_4o_default_500_full, is a fine-tuned variant of the meta-llama/Llama-2-7b-chat-hf base model. It leverages the Llama 2 architecture with 7 billion parameters and a context length of 4096 tokens. The fine-tuning process involved a specific "generator dataset" and resulted in a training loss of 1.9297.

Training Details

The model was trained using the following key hyperparameters; an equivalent configuration is sketched after the list:

  • Learning Rate: 2e-05
  • Per-Device Batch Size: 4 (train), 4 (eval)
  • Gradient Accumulation Steps: 2, for a total train batch size of 32 (which implies data-parallel training across 4 devices)
  • Optimizer: Adam with betas=(0.9, 0.999) and epsilon=1e-08
  • LR Scheduler: Cosine type with a warmup ratio of 0.1
  • Epochs: 1

Intended Use

While the provided README does not detail specific intended uses or limitations, the fine-tuning on a "generator dataset" suggests utility in generative AI applications, and the "rlhf_safe" in the model name points to an emphasis on safety. Developers looking for a Llama 2-based model with generative capabilities and safety considerations may find this model relevant. Because the base model is meta-llama/Llama-2-7b-chat-hf, prompts should follow the Llama 2 chat format, as sketched below.