jackf857/qwen3-8b-base-beta-dpo-hh-harmless-4xh200-batch-64-20260424-025105

Text generation · Model size: 8B · Quant: FP8 · Context length: 32k · Published: Apr 24, 2026 · Architecture: Transformer

This is an 8 billion parameter Qwen3-based language model developed by jackf857, fine-tuned using DPO on the Anthropic/hh-rlhf dataset. It is specifically optimized for harmlessness and safety, building upon a supervised fine-tuned base model. The model has a context length of 32768 tokens and is designed for applications requiring robust, safety-aligned text generation.


Model Overview

This model, jackf857/qwen3-8b-base-beta-dpo-hh-harmless-4xh200-batch-64-20260424-025105, is an 8 billion parameter Qwen3-based language model. It has been fine-tuned using Direct Preference Optimization (DPO) on the Anthropic/hh-rlhf dataset, specifically targeting harmlessness and safety. This DPO fine-tuning builds upon a previously supervised fine-tuned (SFT) base model, enhancing its ability to generate safe and non-toxic responses.
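For reference, DPO trains directly on the (chosen, rejected) preference pairs in hh-rlhf rather than on a separately learned reward model. For a prompt $x$ with preferred response $y_w$ and rejected response $y_l$, the standard DPO objective is

$$
\mathcal{L}_{\text{DPO}}(\theta) = -\,\mathbb{E}_{(x,\,y_w,\,y_l)}\!\left[\log \sigma\!\left(\beta \log\frac{\pi_\theta(y_w \mid x)}{\pi_{\text{ref}}(y_w \mid x)} - \beta \log\frac{\pi_\theta(y_l \mid x)}{\pi_{\text{ref}}(y_l \mid x)}\right)\right],
$$

where $\pi_{\text{ref}}$ is the frozen SFT model and $\beta$ controls how far the policy may move away from it; the exact $\beta$ used for this run is not stated on this card.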

Key Characteristics

  • Base Model: Qwen3-8B architecture.
  • Fine-tuning Method: Direct Preference Optimization (DPO).
  • Dataset: Anthropic/hh-rlhf, focused on harmlessness.
  • Context Length: Supports a substantial context window of 32768 tokens.
  • Performance Metrics: On the evaluation set the model reached a validation loss of 0.7256 and a mean DPO reward gap (`beta_dpo/gap_mean`) of 9.9202, i.e. chosen (harmless) responses receive a substantially higher implicit reward than rejected ones (see the sketch after this list).
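As a rough illustration of what that gap metric measures, here is a hypothetical helper that computes the implicit DPO reward margin for a single preference pair. The `beta` default and the exact definition logged during this run are assumptions; the card does not state them.

```python
def dpo_reward_gap(policy_logp_chosen: float, ref_logp_chosen: float,
                   policy_logp_rejected: float, ref_logp_rejected: float,
                   beta: float = 0.1) -> float:
    """Implicit DPO reward margin for one (chosen, rejected) pair.

    Each argument is the summed token log-probability of the full response
    under the policy or the frozen reference (SFT) model. `beta` is an
    assumed default; the value used for this run is not published.
    """
    chosen_reward = beta * (policy_logp_chosen - ref_logp_chosen)
    rejected_reward = beta * (policy_logp_rejected - ref_logp_rejected)
    return chosen_reward - rejected_reward
```

A reported `beta_dpo/gap_mean` of ~9.9 would then be this margin averaged over the evaluation pairs.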

Training Details

The model was trained with a learning rate of 5e-07, a total batch size of 64 (across 4 H200 GPUs), a cosine learning rate scheduler with a 0.1 warmup ratio, and a single epoch of training. Training ran for 600 optimizer steps, over which the DPO metrics improved steadily.
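The card does not say which training framework was used. Purely as a hypothetical reproduction sketch, the hyperparameters above map onto TRL's `DPOTrainer` roughly as follows; the SFT checkpoint name is a placeholder, the per-device batch split is assumed, and argument names can differ between TRL versions.

```python
from datasets import load_dataset
from transformers import AutoModelForCausalLM, AutoTokenizer
from trl import DPOConfig, DPOTrainer

# Placeholder: the actual SFT checkpoint used as the starting point is not published on this card.
sft_model_id = "Qwen/Qwen3-8B-Base"

model = AutoModelForCausalLM.from_pretrained(sft_model_id)
tokenizer = AutoTokenizer.from_pretrained(sft_model_id)

# hh-rlhf stores full Human/Assistant conversations in "chosen"/"rejected";
# the shared prompt must be split out into its own column before DPO training.
raw = load_dataset("Anthropic/hh-rlhf", data_dir="harmless-base", split="train")

config = DPOConfig(
    output_dir="qwen3-8b-base-dpo-hh-harmless",
    learning_rate=5e-7,
    per_device_train_batch_size=4,   # assumed split: 4 GPUs x 4 samples x 4 accumulation steps = 64 total
    gradient_accumulation_steps=4,
    num_train_epochs=1,
    lr_scheduler_type="cosine",
    warmup_ratio=0.1,
    bf16=True,
)

trainer = DPOTrainer(
    model=model,                 # without an explicit ref_model, TRL uses a frozen copy of `model`
    args=config,
    train_dataset=raw,           # after converting to prompt/chosen/rejected columns
    processing_class=tokenizer,  # older TRL versions take `tokenizer=` instead
)
trainer.train()
```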

Intended Use Cases

This model is particularly well-suited for applications where generating harmless, safe, and ethically aligned text is critical. It can be used in scenarios requiring content moderation, safe AI assistants, or any application where mitigating harmful outputs is a priority.
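A minimal inference sketch with the Hugging Face `transformers` library, assuming the checkpoint loads like a standard Qwen3 causal LM. The Human/Assistant prompt format below mirrors the hh-rlhf dataset and is an assumption about how the SFT/DPO stages formatted conversations.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "jackf857/qwen3-8b-base-beta-dpo-hh-harmless-4xh200-batch-64-20260424-025105"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # or "auto" to match the stored precision
    device_map="auto",
)

# hh-rlhf-style turn markers; swap in tokenizer.apply_chat_template(...) if the
# checkpoint ships a chat template.
prompt = "\n\nHuman: How can I politely decline an invitation?\n\nAssistant:"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=256, do_sample=True, temperature=0.7)
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
```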