CharlesLi/llama_2_rlhf_safe_4o_default_1000_full

Text generation · Model size: 7B · Quantization: FP8 · Context length: 4k · Published: Jan 13, 2025 · License: llama2 · Architecture: Transformer · Open weights

CharlesLi/llama_2_rlhf_safe_4o_default_1000_full is a 7-billion-parameter model fine-tuned by CharlesLi from Llama-2-7b-chat-hf. It was fine-tuned on the generator dataset, indicating an optimization for text generation tasks, and retains the Llama 2 architecture with a 4096-token context length, targeting safe, aligned outputs through Reinforcement Learning from Human Feedback (RLHF).


Model Overview

This model, llama_2_rlhf_safe_4o_default_1000_full, is a fine-tuned variant of the meta-llama/Llama-2-7b-chat-hf base model, developed by CharlesLi. It has 7 billion parameters and a 4096-token context length. Fine-tuning used the generator dataset, suggesting an emphasis on text generation quality and on safety via RLHF.
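
A minimal loading sketch using the transformers library, assuming the checkpoint is hosted under this repo id on the Hugging Face Hub; the fp16 dtype is an assumption for single-GPU inference, not a statement about the published weights:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Repo id taken from the model card; assumes the weights are on the Hub.
model_id = "CharlesLi/llama_2_rlhf_safe_4o_default_1000_full"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,  # assumption: half precision to fit a 7B model on one GPU
    device_map="auto",          # place layers automatically across available devices
)
```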

Key Training Details

The model was trained for 1 epoch with a learning rate of 2e-05, using the Adam optimizer, a cosine learning rate scheduler, and a warmup ratio of 0.1. Training used a total batch size of 32 across 4 GPUs. The model reached an evaluation loss of 1.3804.
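
For illustration, here are the reported hyperparameters expressed as Hugging Face TrainingArguments. This is a sketch, not the author's training script: the per-device batch size (8 × 4 GPUs = 32 total), the AdamW variant, and the mixed-precision setting are assumptions, since the card only states the totals.

```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="llama_2_rlhf_safe_4o_default_1000_full",
    learning_rate=2e-5,              # reported learning rate
    num_train_epochs=1,              # reported: 1 epoch
    per_device_train_batch_size=8,   # assumption: 8 per GPU x 4 GPUs = total batch size 32
    lr_scheduler_type="cosine",      # reported scheduler
    warmup_ratio=0.1,                # reported warmup ratio
    optim="adamw_torch",             # "Adam optimizer" per the card
    bf16=True,                       # assumption: mixed-precision training
)
```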

Intended Use Cases

Given its fine-tuning on the generator dataset and the application of RLHF for safety, this model is suited to applications requiring controlled, safe text generation. Potential uses include chatbots, content creation, and other generative AI tasks where adherence to safety guidelines is crucial, as in the usage sketch below.
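
A hypothetical chat-style usage example, reusing the model and tokenizer from the loading sketch above; it relies on the Llama 2 chat template shipped with the base tokenizer, and the prompt is purely illustrative:

```python
# Build a single-turn conversation and format it with the chat template.
messages = [
    {"role": "user", "content": "Explain how vaccines work in simple terms."},
]
inputs = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,  # append the assistant turn marker
    return_tensors="pt",
).to(model.device)

# Greedy decoding for a deterministic, easy-to-inspect reply.
outputs = model.generate(inputs, max_new_tokens=256, do_sample=False)

# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```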