W-61/llama-3-8b-base-new-dpo-hh-harmless-s_star0.4-4xh200-batch-64-20260421-204233
W-61/llama-3-8b-base-new-dpo-hh-harmless-s_star0.4-4xh200-batch-64-20260421-204233 is an 8-billion-parameter Llama 3 base model fine-tuned by W-61. It has undergone further Direct Preference Optimization (DPO) training on the Anthropic/hh-rlhf dataset to enhance its harmlessness and alignment. The model is optimized for generating safe and helpful responses, making it suitable for applications that require robust content moderation and ethical AI interactions.
Model Overview
This model, developed by W-61, is an 8-billion-parameter Llama 3 base model fine-tuned specifically for harmlessness and alignment. It builds on the W-61/llama-3-8b-base-sft-hh-harmless-4xh200 model with an additional Direct Preference Optimization (DPO) phase on the Anthropic/hh-rlhf dataset.
Key Characteristics
- Base Model: Llama 3 with 8 billion parameters.
- Fine-tuning: Utilizes Direct Preference Optimization (DPO) for enhanced alignment.
- Dataset: Trained on the Anthropic/hh-rlhf dataset, focusing on human feedback for harmlessness.
- Performance: Achieved a validation loss of 0.5867 during DPO training; preference-alignment metrics included a mean DPO margin of 70.1604.
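Assuming the checkpoint is published on the Hugging Face Hub under the repository id in the title, it can be loaded with the standard transformers APIs. The sketch below is illustrative only: the Human:/Assistant: prompt format follows the Anthropic/hh-rlhf convention this model was tuned on, while the bf16 dtype and greedy decoding are assumptions rather than settings reported on this card.

```python
# Minimal inference sketch (assumes transformers, torch, and accelerate are installed).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "W-61/llama-3-8b-base-new-dpo-hh-harmless-s_star0.4-4xh200-batch-64-20260421-204233"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # assumption: bf16 is a common choice for Llama 3 8B inference
    device_map="auto",
)

# hh-rlhf-style dialogue format ("\n\nHuman: ...\n\nAssistant:")
prompt = "\n\nHuman: How do I politely decline an invitation?\n\nAssistant:"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=128, do_sample=False)

# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:], skip_special_tokens=True))
```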
Intended Use Cases
This model is particularly well-suited for applications where generating safe, ethical, and non-harmful content is paramount. Consider using this model for:
- Content Moderation: Assisting in filtering or generating responses that adhere to safety guidelines.
- Ethical AI Development: Building applications that prioritize harmlessness and user well-being.
- Dialogue Systems: Creating chatbots or conversational agents designed to avoid generating toxic or inappropriate content.
Training Details
The model was trained for 1 epoch with a learning rate of 5e-07 and a total batch size of 64, using the ADAMW_TORCH optimizer with cosine learning-rate scheduling and a warmup ratio of 0.1.
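The training script is not included on this card, but the stated hyperparameters map directly onto TRL's DPOConfig. The sketch below is a hypothetical reconstruction under that assumption: only the learning rate, total batch size, epoch count, optimizer, scheduler, and warmup ratio come from this card; the DPO beta, the harmless-base subset (inferred from "hh-harmless" in the model name), the per-device/accumulation split of the batch across the four H200 GPUs, and the bf16 flag are guesses.

```python
# Hypothetical reconstruction of the DPO phase with TRL's DPOTrainer.
from datasets import load_dataset
from transformers import AutoModelForCausalLM, AutoTokenizer
from trl import DPOConfig, DPOTrainer

# DPO starts from the SFT checkpoint named in the overview above.
sft_model_id = "W-61/llama-3-8b-base-sft-hh-harmless-4xh200"

model = AutoModelForCausalLM.from_pretrained(sft_model_id)
tokenizer = AutoTokenizer.from_pretrained(sft_model_id)

# Assumption: the harmless-base subset, inferred from the model name. Depending on
# your TRL version, you may first need to split hh-rlhf's chosen/rejected texts
# into explicit prompt/chosen/rejected columns.
dataset = load_dataset("Anthropic/hh-rlhf", data_dir="harmless-base", split="train")

config = DPOConfig(
    output_dir="llama-3-8b-dpo-hh-harmless",
    learning_rate=5e-7,              # from this card
    num_train_epochs=1,              # from this card
    per_device_train_batch_size=8,   # assumption: 4 GPUs x 8 x 2 accumulation = 64 total
    gradient_accumulation_steps=2,
    optim="adamw_torch",             # ADAMW_TORCH, as stated above
    lr_scheduler_type="cosine",      # from this card
    warmup_ratio=0.1,                # from this card
    beta=0.1,                        # assumption: TRL's default DPO beta; not reported here
    bf16=True,                       # assumption: typical for H200 training
)

trainer = DPOTrainer(
    model=model,                 # a frozen reference copy is created automatically
    args=config,
    train_dataset=dataset,
    processing_class=tokenizer,  # "tokenizer=" in older TRL releases
)
trainer.train()
```

With the split assumed above, the effective batch size is 4 GPUs × 8 per device × 2 accumulation steps = 64, matching the total batch size reported on this card.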