W-61/mistral-7b-base-margin-dpo-hh-harmless-4xh200-batch-64
W-61/mistral-7b-base-margin-dpo-hh-harmless-4xh200-batch-64 is a 7-billion-parameter language model fine-tuned from the Mistral-7B base model with Direct Preference Optimization (DPO) on the Anthropic/hh-rlhf dataset, targeting harmlessness. It is designed to produce responses that align with human preferences for safe, non-offensive output.
Model Overview
This model, W-61/mistral-7b-base-margin-dpo-hh-harmless-4xh200-batch-64, is a 7-billion-parameter language model built on the Mistral-7B base model. It was fine-tuned on the Anthropic/hh-rlhf dataset using a margin-based variant of Direct Preference Optimization ("Margin DPO"); a sketch of the assumed loss follows the list below. The objective of this fine-tuning was to improve harmlessness, i.e., to produce outputs that are safe and non-offensive.
Key Characteristics
- Base Model: Mistral-7B.
- Fine-tuning Method: Margin Direct Preference Optimization (DPO).
- Training Data: Anthropic/hh-rlhf human-preference dataset; the model name indicates the harmlessness subset of this helpful/harmless feedback data.
- Objective: Optimized for generating harmless and ethically aligned responses.
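The card does not spell out the exact margin formulation, but a common approach augments the standard DPO logistic loss with a fixed target margin between chosen and rejected completions. The sketch below illustrates that idea in PyTorch; the function name and the `beta` and `margin` values are illustrative assumptions, not the card's actual settings.

```python
import torch.nn.functional as F

def margin_dpo_loss(policy_chosen_logps, policy_rejected_logps,
                    ref_chosen_logps, ref_rejected_logps,
                    beta=0.1, margin=1.0):
    """Sketch of a margin-augmented DPO loss (assumed formulation).

    Standard DPO maximizes the log-sigmoid of the scaled difference
    between the policy's and the reference model's preference
    log-ratios; the margin variant additionally requires the chosen
    completion to beat the rejected one by a fixed offset.
    """
    pi_logratio = policy_chosen_logps - policy_rejected_logps
    ref_logratio = ref_chosen_logps - ref_rejected_logps
    logits = beta * (pi_logratio - ref_logratio)
    # Subtracting the margin inside the log-sigmoid penalizes pairs
    # whose implicit reward gap is smaller than `margin`.
    return -F.logsigmoid(logits - margin).mean()
```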
Training Details
The model was trained for 1 epoch with a learning rate of 5e-07 and an effective batch size of 64 across 4 GPUs (H200s, per the model name). Training metrics such as `margin_dpo/loss_margin_mean` and `logps/chosen` track how well the model fits the margin DPO objective. The final validation loss was 0.5822.
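The card does not name the training framework. If the run used trl's DPOTrainer, the reported hyperparameters might map onto its configuration as sketched below; the per-device batch / gradient-accumulation split and the `beta` value are assumptions, since only the effective batch size (64), learning rate, and epoch count are stated.

```python
from trl import DPOConfig

# Hypothetical mapping of the reported hyperparameters; only
# learning_rate, num_train_epochs, and the effective batch size of 64
# come from the card.
config = DPOConfig(
    output_dir="mistral-7b-base-margin-dpo-hh-harmless",
    learning_rate=5e-7,
    num_train_epochs=1,
    per_device_train_batch_size=8,   # 8 per device x 4 GPUs x 2 accumulation steps = 64
    gradient_accumulation_steps=2,
    beta=0.1,                        # DPO temperature; assumed, not reported
)
```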
Intended Use Cases
This model is well suited to applications where generating safe, non-toxic, and harmless content is paramount, including chatbots, content moderation systems, and other interactive AI where ethical behavior and user safety are critical design requirements.
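For inference, the model should load with the standard transformers API. The snippet below assumes the repository id from the title and an hh-rlhf-style "Human:/Assistant:" prompt framing, since the card does not document a chat template.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "W-61/mistral-7b-base-margin-dpo-hh-harmless-4xh200-batch-64"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

# hh-rlhf-style prompt framing (assumed; no chat template is documented).
prompt = "\n\nHuman: How should I respond to an offensive comment online?\n\nAssistant:"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=256, do_sample=True, temperature=0.7)
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
```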