jackf857/llama-3-8b-base-new-dpo-hh-harmless-4xh200-batch-64-q_t-0.5-s_star-0.85
The jackf857/llama-3-8b-base-new-dpo-hh-harmless-4xh200-batch-64-q_t-0.5-s_star-0.85 model is an 8 billion parameter Llama 3-based language model, fine-tuned by jackf857. It is specifically optimized using Direct Preference Optimization (DPO) on the Anthropic/hh-rlhf dataset to enhance harmlessness and align with human preferences. This model is primarily intended for applications requiring a robust, preference-aligned LLM with a focus on generating safe and helpful responses, making it suitable for conversational AI and content moderation tasks.
Model Overview
This model, llama-3-8b-base-new-dpo-hh-harmless-4xh200-batch-64-q_t-0.5-s_star-0.85, is an 8 billion parameter Llama 3-based language model. It has been fine-tuned by jackf857 using Direct Preference Optimization (DPO) on the Anthropic/hh-rlhf dataset, building upon the W-61/llama-3-8b-base-sft-hh-harmless-4xh200 base model. The primary goal of this fine-tuning was to improve the model's harmlessness and alignment with human preferences, as indicated by its training on a dataset focused on helpful and harmless responses.
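A minimal usage sketch with the Transformers library is below. The prompt layout is an assumption: the hh-rlhf dataset uses a "Human:/Assistant:" dialogue format, so the helper below mirrors that, but the exact template the fine-tune expects may differ. Generation parameters are illustrative, not tuned values from the author.

```python
MODEL_ID = "jackf857/llama-3-8b-base-new-dpo-hh-harmless-4xh200-batch-64-q_t-0.5-s_star-0.85"

def format_prompt(user_message: str) -> str:
    # Assumption: hh-rlhf-style dialogue layout, matching the fine-tuning data.
    return f"\n\nHuman: {user_message}\n\nAssistant:"

def generate(user_message: str, max_new_tokens: int = 256) -> str:
    # Heavy dependencies are imported lazily so the helper above stays lightweight.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(MODEL_ID, device_map="auto")
    inputs = tokenizer(format_prompt(user_message), return_tensors="pt").to(model.device)
    output = model.generate(**inputs, max_new_tokens=max_new_tokens, do_sample=False)
    # Decode only the newly generated tokens, not the echoed prompt.
    return tokenizer.decode(
        output[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True
    )
```

Note that loading the full 8B model requires roughly 16 GB of GPU memory in bfloat16; `device_map="auto"` lets Transformers shard or offload as needed.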
Key Capabilities
- Preference Alignment: Optimized using DPO to align with human preferences, particularly for harmlessness.
- Safety-Focused Generation: Designed to produce responses that are helpful and avoid harmful content.
- Llama 3 Architecture: Benefits from the foundational capabilities of the Llama 3 8B base model.
Good For
- Conversational AI: Developing chatbots or virtual assistants where safety and harmlessness are critical.
- Content Moderation: Assisting in filtering or generating content that adheres to safety guidelines.
- Research in Alignment: Exploring DPO techniques for improving LLM behavior and safety.
Training Details
The model was trained with a learning rate of 5e-07, a total batch size of 64, and for 1 epoch. Evaluation shows a final validation loss of 0.5392 and a mean DPO reward margin of 4.1973, indicating effective preference learning.
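To make the reported margin concrete, here is a sketch of the standard DPO loss for a single preference pair. This is not the author's training code; the `beta` value and the log-probability inputs are illustrative assumptions. The "margin" is the gap between the implicit chosen and rejected rewards, i.e. the quantity whose mean is reported above.

```python
import math

def dpo_loss(pi_chosen: float, pi_rejected: float,
             ref_chosen: float, ref_rejected: float,
             beta: float = 0.1) -> tuple[float, float]:
    """DPO loss for one preference pair.

    Inputs are summed log-probabilities of the chosen/rejected responses
    under the policy (pi_*) and the frozen reference model (ref_*).
    """
    # Implicit rewards: beta-scaled log-ratios against the reference model.
    chosen_reward = beta * (pi_chosen - ref_chosen)
    rejected_reward = beta * (pi_rejected - ref_rejected)
    margin = chosen_reward - rejected_reward
    # -log sigmoid(margin): small when the policy prefers the chosen response.
    loss = -math.log(1.0 / (1.0 + math.exp(-margin)))
    return loss, margin
```

A large positive mean margin (here, 4.1973) means the fine-tuned policy assigns substantially higher relative likelihood to preferred (harmless) responses than the reference model does.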