CharlesLi/llama_2_rlhf_safe_llama_3_8B_reflect_100_full

Text generation · Concurrency cost: 1 · Model size: 7B · Quantization: FP8 · Context length: 4K · Published: Jan 13, 2025 · License: llama2 · Architecture: Transformer · Open weights

The CharlesLi/llama_2_rlhf_safe_llama_3_8B_reflect_100_full model is a 7-billion-parameter language model fine-tuned from Meta's Llama-2-7b-chat-hf. It has a context length of 4096 tokens and was fine-tuned on a dataset identified only as "generator" in its training configuration. It is intended for applications requiring a Llama-2-based model with specific fine-tuning characteristics, though the model card does not document its primary differentiators or use cases.


Model Overview

This model, llama_2_rlhf_safe_llama_3_8B_reflect_100_full, is a fine-tuned variant of Meta's Llama-2-7b-chat-hf. It has 7 billion parameters and supports a context length of 4096 tokens. It was fine-tuned on a dataset referred to as "generator", with the goal of achieving particular performance characteristics; the exact nature of those characteristics is not detailed in the available documentation.
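Since the checkpoint follows the standard Llama-2 layout, it should load through the usual Hugging Face transformers API. The snippet below is a minimal sketch under that assumption and is not taken from the model card; the dtype and generation settings are illustrative.

```python
# Minimal loading sketch (assumes a standard Llama-2-style checkpoint;
# only the repo id comes from the model card, the rest is illustrative).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

repo_id = "CharlesLi/llama_2_rlhf_safe_llama_3_8B_reflect_100_full"

tokenizer = AutoTokenizer.from_pretrained(repo_id)
model = AutoModelForCausalLM.from_pretrained(
    repo_id,
    torch_dtype=torch.float16,  # a 7B model fits in roughly 14 GB at fp16
    device_map="auto",
)

prompt = "Explain RLHF in one sentence."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```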

Training Details

Fine-tuning used a learning rate of 2e-05, a per-device train batch size of 4, and gradient accumulation over 2 steps; with 4 GPUs, this gives a total train batch size of 32 (4 × 2 × 4). The optimizer was Adam with betas=(0.9, 0.999) and epsilon=1e-08, paired with a cosine learning-rate scheduler with a warmup ratio of 0.1. Training ran for 1 epoch.
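For reference, these hyperparameters map onto a Hugging Face TrainingArguments configuration along the following lines. This is a reconstruction from the numbers above, not the author's actual training script; the output directory is hypothetical.

```python
# Reconstruction of the reported hyperparameters as TrainingArguments;
# illustrative only, not taken from the author's training script.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="llama_2_rlhf_safe_reflect_100_full",  # hypothetical path
    learning_rate=2e-5,
    per_device_train_batch_size=4,   # 4 examples per GPU
    gradient_accumulation_steps=2,   # 4 x 2 x 4 GPUs = 32 effective
    num_train_epochs=1,
    lr_scheduler_type="cosine",
    warmup_ratio=0.1,
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
)
```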

Performance

On the evaluation set, the model achieved a loss of 1.6293. No further benchmarks or performance metrics are reported in the current documentation.
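Assuming the reported figure is the usual mean per-token cross-entropy in nats (the transformers default), it converts to perplexity via exp(loss), which gives a rough sense of scale. The short computation below makes that assumption explicit.

```python
# Perplexity from cross-entropy loss (assumes the 1.6293 eval loss is
# mean per-token cross-entropy in nats, the transformers default).
import math

eval_loss = 1.6293
perplexity = math.exp(eval_loss)
print(f"eval perplexity ~= {perplexity:.2f}")  # ~= 5.10
```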

Intended Use

The model card does not explicitly state intended uses or limitations. In general, the model is suitable for applications that build on the Llama-2 architecture and can benefit from its fine-tuned characteristics; users should evaluate it on their own tasks before deployment.
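Because the base model is Llama-2-7b-chat-hf, prompts should generally follow the Llama-2 chat format. One way to produce it, assuming the fine-tune inherited the base tokenizer's chat template, is sketched below; if the template was changed during fine-tuning, the output would differ.

```python
# Applying the Llama-2 chat format via the tokenizer's chat template
# (assumes the fine-tune kept the base model's template).
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained(
    "CharlesLi/llama_2_rlhf_safe_llama_3_8B_reflect_100_full"
)

messages = [
    {"role": "system", "content": "You are a helpful, harmless assistant."},
    {"role": "user", "content": "Summarize the Llama 2 license in two sentences."},
]

prompt = tokenizer.apply_chat_template(
    messages, tokenize=False, add_generation_prompt=True
)
print(prompt)  # [INST] <<SYS>> ... <</SYS>> ... [/INST]
```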