jackf857/llama-3-8b-base-new-dpo-harmless-s_star0.4-q_t0.4
jackf857/llama-3-8b-base-new-dpo-harmless-s_star0.4-q_t0.4 is an 8-billion-parameter language model fine-tuned from W-61/llama-3-8b-base-sft-hh-harmless-4xh200 using Direct Preference Optimization (DPO) on the Anthropic/hh-rlhf dataset. It is designed to generate harmless and helpful responses, making it suitable for applications that require safe, aligned AI interactions.
Model Overview
This model, jackf857/llama-3-8b-base-new-dpo-harmless-s_star0.4-q_t0.4, is an 8-billion-parameter language model built on the Llama 3 architecture. It is a fine-tuned version of W-61/llama-3-8b-base-sft-hh-harmless-4xh200, further optimized with Direct Preference Optimization (DPO).
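The model can be loaded with the standard Transformers APIs. Below is a minimal loading sketch; the bfloat16 dtype and automatic device placement are illustrative choices, not requirements stated by the authors.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "jackf857/llama-3-8b-base-new-dpo-harmless-s_star0.4-q_t0.4"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # assumed dtype; halves memory vs. fp32
    device_map="auto",           # spreads weights across available devices
)
```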
Key Capabilities
- Harmlessness: The model has been fine-tuned on the Anthropic/hh-rlhf dataset, which is designed to improve alignment and reduce harmful outputs.
- Preference Alignment: Uses DPO to align model outputs with human preferences, aiming for safer, more desirable responses (see the objective sketch after this list).
- Base Model Performance: Inherits the foundational capabilities of the Llama 3 8B base model, providing strong general language understanding and generation.
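For context, DPO trains directly on preference pairs: it widens the log-probability gap between chosen and rejected responses relative to a frozen reference model. The sketch below is a generic rendering of the DPO objective (Rafailov et al., 2023), not this repository's training code; the beta value is illustrative, as the card does not state the one used.

```python
import torch
import torch.nn.functional as F

def dpo_loss(policy_chosen_logps, policy_rejected_logps,
             ref_chosen_logps, ref_rejected_logps, beta=0.1):
    """DPO loss over summed per-sequence log-probabilities.

    beta (assumed value) controls how far the policy may drift
    from the frozen reference model.
    """
    chosen_rewards = beta * (policy_chosen_logps - ref_chosen_logps)
    rejected_rewards = beta * (policy_rejected_logps - ref_rejected_logps)
    margins = chosen_rewards - rejected_rewards
    loss = -F.logsigmoid(margins).mean()
    return loss, margins.detach()
```

The `margins` value here is what DPO "margin" metrics, such as those reported under Training Details below, typically track: a growing margin means the policy increasingly prefers chosen over rejected responses.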
Training Details
The model was trained for a single epoch with a learning rate of 5e-07 and a total batch size of 64 across 4 GPUs. Evaluation loss reached 0.6075, and DPO margin metrics indicated successful preference learning. Training used Transformers 4.51.0 and PyTorch 2.3.1+cu121.
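For readers who want to reproduce a comparable setup, the sketch below shows how such a run could be configured with TRL's `DPOTrainer`. This is not the authors' training script: the per-device batch size and gradient-accumulation split (4 per device x 4 GPUs x 4 accumulation steps = 64 total), the beta value, and the dataset handling are all assumptions, and Anthropic/hh-rlhf may need to be mapped into the prompt/chosen/rejected fields that `DPOTrainer` expects.

```python
from datasets import load_dataset
from transformers import AutoModelForCausalLM, AutoTokenizer
from trl import DPOConfig, DPOTrainer

base = "W-61/llama-3-8b-base-sft-hh-harmless-4xh200"
model = AutoModelForCausalLM.from_pretrained(base)
tokenizer = AutoTokenizer.from_pretrained(base)

train_dataset = load_dataset("Anthropic/hh-rlhf", split="train")

config = DPOConfig(
    output_dir="llama-3-8b-dpo-harmless",
    num_train_epochs=1,             # single epoch, per the card
    learning_rate=5e-7,             # per the card
    per_device_train_batch_size=4,  # assumed split: 4 x 4 GPUs x 4 accum = 64
    gradient_accumulation_steps=4,
    beta=0.1,                       # assumed; not stated in the card
)

trainer = DPOTrainer(
    model=model,
    args=config,
    train_dataset=train_dataset,    # assumes prompt/chosen/rejected fields
    processing_class=tokenizer,
)
trainer.train()
```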
Intended Use Cases
This model is particularly well-suited for applications where generating safe, helpful, and non-toxic content is paramount. It can be used in chatbots, content moderation, and other interactive AI systems that require a strong emphasis on harmlessness and alignment.
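A short usage sketch with the `text-generation` pipeline follows. The `Human:`/`Assistant:` framing is an assumption based on the Anthropic/hh-rlhf formatting; the card does not specify a prompt template, and the sampling settings are illustrative.

```python
from transformers import pipeline

generator = pipeline(
    "text-generation",
    model="jackf857/llama-3-8b-base-new-dpo-harmless-s_star0.4-q_t0.4",
    torch_dtype="auto",
    device_map="auto",
)

prompt = "\n\nHuman: How can I give critical feedback kindly?\n\nAssistant:"
out = generator(prompt, max_new_tokens=256, do_sample=True, temperature=0.7)
print(out[0]["generated_text"][len(prompt):])  # strip the echoed prompt
```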