W-61/llama3-hh-harmless-qt045-b0p3-20260429-085449

Text Generation · Concurrency Cost: 1 · Model Size: 8B · Quant: FP8 · Ctx Length: 8k · Published: Apr 29, 2026 · Architecture: Transformer

W-61/llama3-hh-harmless-qt045-b0p3-20260429-085449 is an 8-billion-parameter language model developed by W-61 and fine-tuned from a Llama 3 base on the Anthropic/hh-rlhf dataset. It specializes in generating harmless responses and is optimized for safety-critical applications where mitigating harmful outputs is a primary concern.


Model Overview

This model, llama3-hh-harmless-qt045-b0p3-20260429-085449, is an 8-billion-parameter language model developed by W-61. It is a fine-tuned variant of the W-61/llama-3-8b-base-sft-hh-harmless-4xh200 base model, specifically optimized for harmlessness.

Key Capabilities

  • Harmless Response Generation: The model has undergone fine-tuning on the Anthropic/hh-rlhf dataset, which focuses on human feedback for helpfulness and harmlessness. This training aims to reduce the generation of undesirable or harmful content.
  • Llama 3 Architecture: Built upon the Llama 3 8B architecture, it inherits the foundational capabilities of this model family.
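Because the model was fine-tuned on Anthropic/hh-rlhf, prompts are most naturally formatted with that dataset's turn markers (`\n\nHuman:` and `\n\nAssistant:`). A minimal sketch of assembling such a prompt (the helper name and exact spacing are illustrative, not part of this model's documented API):

```python
def format_hh_prompt(turns: list[tuple[str, str]], user_msg: str) -> str:
    """Format a conversation in the Anthropic hh-rlhf convention.

    `turns` holds prior (human, assistant) exchanges; `user_msg` is the
    new human message the model should respond to. The returned string
    ends with "Assistant:" so generation continues the assistant turn.
    """
    prompt = ""
    for human, assistant in turns:
        prompt += f"\n\nHuman: {human}\n\nAssistant: {assistant}"
    prompt += f"\n\nHuman: {user_msg}\n\nAssistant:"
    return prompt
```

The resulting string can be passed to any text-generation pipeline loading this checkpoint; stopping on a subsequent `\n\nHuman:` marker is a common way to end the turn.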

Training Details

The model was trained using the following hyperparameters:

  • Learning Rate: 5e-07
  • Batch Size: A total training batch size of 64 (8 per device across 4 GPUs with 2 gradient accumulation steps).
  • Optimizer: AdamW (`adamw_torch`) with default betas and epsilon.
  • Scheduler: Cosine learning rate scheduler with a 0.1 warmup ratio.
  • Epochs: Trained for 1 epoch.
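The batch-size arithmetic and the warmup-plus-cosine schedule above can be sketched directly; this is a hand-rolled illustration of the stated hyperparameters, not the training code itself:

```python
import math

# Hyperparameters taken from the model card above.
LEARNING_RATE = 5e-7
PER_DEVICE_BATCH = 8
NUM_GPUS = 4
GRAD_ACCUM_STEPS = 2
WARMUP_RATIO = 0.1

# Total training batch size: 8 per device x 4 GPUs x 2 accumulation steps.
effective_batch = PER_DEVICE_BATCH * NUM_GPUS * GRAD_ACCUM_STEPS  # 64

def cosine_lr(step: int, total_steps: int,
              base_lr: float = LEARNING_RATE,
              warmup_ratio: float = WARMUP_RATIO) -> float:
    """Linear warmup for the first 10% of steps, then cosine decay to zero."""
    warmup_steps = int(total_steps * warmup_ratio)
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return base_lr * 0.5 * (1.0 + math.cos(math.pi * progress))
```

The learning rate peaks at 5e-07 exactly when warmup ends (10% of total steps) and decays to zero by the final step of the single epoch.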

Intended Use Cases

This model is particularly suited for applications where the primary goal is to ensure generated text is safe and non-harmful, such as:

  • Content moderation systems.
  • Chatbots requiring strict safety guidelines.
  • Applications where mitigating biased or toxic outputs is critical.