W-61/mistral-7b-base-margin-dpo-hh-harmless-4xh200-batch-64

Text Generation | Concurrency Cost: 1 | Model Size: 7B | Quant: FP8 | Ctx Length: 4k | Published: Apr 18, 2026 | Architecture: Transformer | Cold

W-61/mistral-7b-base-margin-dpo-hh-harmless-4xh200-batch-64 is a 7-billion-parameter language model fine-tuned from a Mistral-7B base model with Direct Preference Optimization (DPO) on the Anthropic/hh-rlhf dataset, specifically targeting harmlessness. It is designed to generate responses that align with human preferences for safety and non-offensiveness, which makes it a candidate for content moderation workflows and other applications where safe output matters.

Model Overview

This model, W-61/mistral-7b-base-margin-dpo-hh-harmless-4xh200-batch-64, is a 7 billion parameter language model built upon a Mistral-7B base architecture. It has been fine-tuned using a specific variant of Direct Preference Optimization (DPO) called "Margin DPO" on the Anthropic/hh-rlhf dataset. The primary objective of this fine-tuning was to enhance the model's harmlessness, aiming to produce outputs that are safe and non-offensive.

Key Characteristics

  • Base Model: Mistral-7B.
  • Fine-tuning Method: Margin Direct Preference Optimization (DPO).
  • Training Data: Anthropic/hh-rlhf dataset, focused on human feedback for helpfulness and harmlessness.
  • Objective: Optimized for generating harmless and ethically aligned responses.

Training Details

The model was trained for 1 epoch with a learning rate of 5e-07 and a total batch size of 64 across 4 GPUs. Metrics logged during evaluation, such as Margin Dpo/loss Margin Mean and Logps/chosen, track how well the model fits the DPO objective. The final validation loss was 0.5822.
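For readers unfamiliar with the objective, here is a minimal sketch of a per-example DPO loss with an optional margin term. The card does not define "Margin DPO", so subtracting a fixed `margin` inside the sigmoid is an assumption made for illustration, as are the function names and the default `beta`. With `margin=0.0` the function reduces to the standard DPO loss.

```python
import math


def log_sigmoid(x: float) -> float:
    """Numerically stable log(sigmoid(x))."""
    if x >= 0:
        return -math.log1p(math.exp(-x))
    return x - math.log1p(math.exp(x))


def dpo_loss(
    policy_chosen_logp: float,
    policy_rejected_logp: float,
    ref_chosen_logp: float,
    ref_rejected_logp: float,
    beta: float = 0.1,    # assumed value; the card does not report beta
    margin: float = 0.0,  # assumed form of the margin term (not from the card)
) -> tuple[float, float]:
    """Return (loss, reward_margin) for one preference pair.

    Inputs are summed token log-probabilities of the chosen and rejected
    responses under the trained policy and the frozen reference model.
    """
    # Implicit rewards: scaled log-ratio of policy to reference.
    chosen_reward = beta * (policy_chosen_logp - ref_chosen_logp)
    rejected_reward = beta * (policy_rejected_logp - ref_rejected_logp)
    reward_margin = chosen_reward - rejected_reward
    # Standard DPO when margin == 0; a positive margin demands a larger gap.
    loss = -log_sigmoid(reward_margin - margin)
    return loss, reward_margin


if __name__ == "__main__":
    loss, gap = dpo_loss(-10.0, -12.0, -10.0, -10.0)
    print(f"loss={loss:.4f} reward_margin={gap:.4f}")
```

When the policy and reference assign identical log-probabilities and the margin is zero, the loss equals ln 2 (about 0.693), which is the usual starting point for DPO training.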

Intended Use Cases

This model is well suited to applications where generating safe, non-toxic, and harmless content is paramount. Examples include chatbots, content moderation systems, and any interactive AI where ethical considerations and user safety are critical design requirements.
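A minimal sketch of a chatbot-style call, assuming the weights are available on the Hugging Face Hub under the model ID above and load with the `transformers` library. The card does not document a prompt template; the `Human:`/`Assistant:` turn markers below follow the transcript format of the Anthropic/hh-rlhf dataset and are an assumption, as are the helper names `build_prompt` and `generate_reply`.

```python
MODEL_ID = "W-61/mistral-7b-base-margin-dpo-hh-harmless-4xh200-batch-64"


def build_prompt(user_message: str) -> str:
    """Format a single-turn prompt in the hh-rlhf transcript style (assumed)."""
    return f"\n\nHuman: {user_message.strip()}\n\nAssistant:"


def generate_reply(user_message: str, max_new_tokens: int = 128) -> str:
    """Load the model and greedily decode one reply.

    Imports are local so the prompt helper can be used without transformers.
    device_map="auto" additionally requires the accelerate package.
    """
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(
        MODEL_ID, torch_dtype="auto", device_map="auto"
    )
    inputs = tokenizer(build_prompt(user_message), return_tensors="pt").to(model.device)
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens, do_sample=False)
    # Drop the prompt tokens and decode only the newly generated continuation.
    new_tokens = output_ids[0][inputs["input_ids"].shape[1]:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True)


if __name__ == "__main__":
    print(generate_reply("How do I politely decline a meeting invitation?"))
```

Because the model is fine-tuned from a base model rather than an instruction-tuned one, it may continue past the assistant turn; truncating the output at the next `Human:` marker is a common workaround.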