W-61/qwen3-8b-base-sft-hh-harmless-8xh200

Text Generation · Concurrency Cost: 1 · Model Size: 8B · Quant: FP8 · Context Length: 32k · Published: Apr 15, 2026 · License: apache-2.0 · Architecture: Transformer (open weights)

W-61/qwen3-8b-base-sft-hh-harmless-8xh200 is an 8-billion-parameter language model fine-tuned from Qwen/Qwen3-8B-Base on the Anthropic/hh-rlhf dataset to improve harmlessness and alignment. It is intended for applications that need a Qwen3 base model with stronger safety characteristics, i.e. general text generation tasks where reducing harmful outputs is critical.


Model Overview

This model, W-61/qwen3-8b-base-sft-hh-harmless-8xh200, is an 8-billion-parameter language model derived from Qwen/Qwen3-8B-Base. It has undergone supervised fine-tuning (SFT) on the Anthropic/hh-rlhf dataset, a human-preference dataset focused on helpfulness and harmlessness.

Key Characteristics

  • Base Model: Qwen3-8B-Base, a robust foundation model.
  • Fine-tuning Objective: Enhanced harmlessness and alignment through training on the Anthropic/hh-rlhf dataset.
  • Parameter Count: 8 billion parameters, offering a balance between performance and computational efficiency.
  • Context Length: Supports a context length of 32,768 tokens (see the loading sketch below).
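
The snippet below is a minimal loading and generation sketch using the standard `transformers` API. It assumes the checkpoint is available on the Hugging Face Hub under this repo id and loads with the stock `AutoModelForCausalLM`/`AutoTokenizer` classes; the prompt is purely illustrative.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "W-61/qwen3-8b-base-sft-hh-harmless-8xh200"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype="auto",   # pick up the checkpoint's stored dtype
    device_map="auto",    # place/shard across available GPUs (requires accelerate)
)

prompt = "Explain in two sentences why seatbelts are important."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

# The model supports up to 32,768 tokens of context; a short generation
# is enough for a smoke test.
outputs = model.generate(**inputs, max_new_tokens=128, do_sample=False)
print(tokenizer.decode(
    outputs[0][inputs["input_ids"].shape[-1]:],
    skip_special_tokens=True,
))
```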

Training Details

The model was trained for 1 epoch with a learning rate of 2e-05, using a cosine learning rate scheduler with a warmup ratio of 0.1 and a total batch size of 128 across 8 GPUs. The final validation loss was 1.6259.
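
For reference, a hypothetical `TrainingArguments` configuration mirroring the reported recipe might look like the following. The actual training script is not published; the per-device batch size, gradient accumulation split, and BF16 setting are assumptions chosen so that 8 GPUs × 8 per device × 2 accumulation steps reproduce the stated global batch size of 128.

```python
from transformers import TrainingArguments

# Illustrative only: mirrors the reported hyperparameters, not the authors' script.
args = TrainingArguments(
    output_dir="qwen3-8b-base-sft-hh-harmless",
    learning_rate=2e-5,
    num_train_epochs=1,
    lr_scheduler_type="cosine",
    warmup_ratio=0.1,
    per_device_train_batch_size=8,   # 8 GPUs x 8 x 2 accumulation = 128 global
    gradient_accumulation_steps=2,
    bf16=True,                       # assumption; typical for H200 training
)
```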

Intended Use Cases

This model is particularly suited for applications where generating safe, harmless, and aligned text is a priority. It can be used for general-purpose text generation, conversational AI, and content creation, with an emphasis on mitigating potentially harmful outputs.
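
For conversational use, a hedged sketch is shown below, reusing the `tokenizer` and `model` from the loading example above. It assumes the checkpoint ships a chat template in its tokenizer config (plausible for a model SFT'd on hh-rlhf dialogues, but worth verifying before relying on it).

```python
# Assumes the tokenizer defines a chat template; check tokenizer_config.json.
messages = [
    {"role": "user", "content": "My neighbor's music is very loud at night. What should I do?"},
]
text = tokenizer.apply_chat_template(
    messages, tokenize=False, add_generation_prompt=True
)
inputs = tokenizer(text, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=256, do_sample=True, temperature=0.7)
print(tokenizer.decode(
    outputs[0][inputs["input_ids"].shape[-1]:],
    skip_special_tokens=True,
))
```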