jackf857/qwen3-8b-base-new-dpo-hh-harmless-4xh200-batch-64-q_t-0.45-s_star-0.6
The jackf857/qwen3-8b-base-new-dpo-hh-harmless-4xh200-batch-64-q_t-0.45-s_star-0.6 model is an 8-billion-parameter language model fine-tuned by jackf857 from a Qwen3-8B-Base variant. It was optimized with Direct Preference Optimization (DPO) on the Anthropic/hh-rlhf dataset, with a focus on harmlessness and alignment, and is intended for applications that require robust, safe conversational AI. It reached a validation loss of 0.5368.
Model Overview
This model, jackf857/qwen3-8b-base-new-dpo-hh-harmless-4xh200-batch-64-q_t-0.45-s_star-0.6, is an 8-billion-parameter language model derived from a Qwen3-8B-Base variant. It has been fine-tuned with Direct Preference Optimization (DPO) on the Anthropic/hh-rlhf dataset, which is known for its focus on helpfulness and harmlessness.
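A minimal loading sketch, assuming the checkpoint is compatible with the standard transformers Auto classes (typical for Qwen3 derivatives, though not stated on this card):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "jackf857/qwen3-8b-base-new-dpo-hh-harmless-4xh200-batch-64-q_t-0.45-s_star-0.6"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype="auto",  # keep the dtype stored in the checkpoint
    device_map="auto",   # shard across available GPUs; requires accelerate
)
```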
Key Characteristics
- Base Model: Fine-tuned from jackf857/qwen3-8b-base-sft-hh-harmless-4xh200-batch-64-20260417-214452.
- Optimization Method: Direct Preference Optimization (DPO) for preference alignment.
- Training Data: Specifically trained on the Anthropic/hh-rlhf dataset to enhance harmlessness.
- Performance: Achieved a validation loss of 0.5368, with DPO reward-margin statistics of 11.7010 (mean) and 18.7863 (standard deviation); the sketch after this list shows how such margins are typically computed.
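For context on the margin statistics above, here is a minimal sketch of the standard DPO objective (Rafailov et al., 2023). The beta value shown is TRL's default and a placeholder; the card does not report the value used for this model:

```python
import torch.nn.functional as F

def dpo_loss(policy_chosen_logps, policy_rejected_logps,
             ref_chosen_logps, ref_rejected_logps, beta=0.1):
    """Each argument is a tensor of per-sequence summed log-probabilities."""
    # Implicit rewards: beta-scaled log-ratios of policy vs. frozen reference.
    chosen_rewards = beta * (policy_chosen_logps - ref_chosen_logps)
    rejected_rewards = beta * (policy_rejected_logps - ref_rejected_logps)

    # The reported "margin" mean/std are statistics of this quantity.
    margins = chosen_rewards - rejected_rewards

    # DPO maximizes the log-sigmoid of the reward margin.
    loss = -F.logsigmoid(margins).mean()
    return loss, margins.mean(), margins.std()
```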
Training Details
The model was trained for 1 epoch with a learning rate of 5e-07 and a total batch size of 64, using the AdamW_Torch optimizer with a cosine learning-rate schedule. Training ran on 4 GPUs with 2 gradient accumulation steps, implying a per-device batch size of 8 (64 / (4 × 2)).
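A hypothetical reconstruction of these hyperparameters, assuming TRL's DPOTrainer was used (the card does not name the training framework, and output_dir is a placeholder):

```python
from trl import DPOConfig

config = DPOConfig(
    output_dir="qwen3-8b-dpo-hh-harmless",  # placeholder path
    learning_rate=5e-07,
    per_device_train_batch_size=8,  # inferred: 64 / (4 GPUs x 2 accumulation)
    gradient_accumulation_steps=2,
    num_train_epochs=1,
    lr_scheduler_type="cosine",
    optim="adamw_torch",
)
```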
Intended Use Cases
This model is particularly suited to applications where generating safe, harmless, and aligned responses is critical. Its DPO fine-tuning on the hh-rlhf dataset makes it a natural fit for conversational agents, content moderation, and other scenarios requiring ethical AI interactions.
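An inference sketch for such conversational use, reusing the model and tokenizer loaded above. Since hh-rlhf dialogues use a Human:/Assistant: format, a similar prompt is assumed here; the card does not specify a chat template:

```python
import torch

prompt = "\n\nHuman: How can I de-escalate a conversation with an angry customer?\n\nAssistant:"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

with torch.no_grad():
    output = model.generate(**inputs, max_new_tokens=256, do_sample=False)

# Decode only the newly generated tokens.
print(tokenizer.decode(output[0][inputs["input_ids"].shape[1]:],
                       skip_special_tokens=True))
```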