W-61/llama-3-8b-base-hh-harmless-sft-4xh100

Text generation · Model size: 8B · Quantization: FP8 · Context length: 8k · Published: Apr 1, 2026 · License: llama3 · Architecture: Transformer · Concurrency cost: 1

W-61/llama-3-8b-base-hh-harmless-sft-4xh100 is an 8 billion parameter language model fine-tuned from Meta-Llama-3-8B. It has been specifically trained on the Anthropic/hh-rlhf dataset to enhance harmlessness and align with human preferences. This model is optimized for applications requiring a robust, safety-focused conversational AI with an 8192 token context length.


Model Overview

This model, W-61/llama-3-8b-base-hh-harmless-sft-4xh100, is an 8 billion parameter language model derived from the meta-llama/Meta-Llama-3-8B base model. Its primary distinction lies in its fine-tuning process, which utilized the Anthropic/hh-rlhf dataset. This training approach is typically employed to improve model harmlessness and reduce undesirable outputs, aligning the model's responses more closely with human safety guidelines.

Key Training Details

  • Base Model: meta-llama/Meta-Llama-3-8B
  • Fine-tuning Dataset: Anthropic/hh-rlhf
  • Training Hyperparameters:
    • Learning Rate: 2e-05
    • Batch Size (train/eval): 8
    • Gradient Accumulation Steps: 4
    • Total Train Batch Size: 128 (8 per device × 4 gradient accumulation steps × 4 GPUs, per the 4xH100 setup in the model name)
    • Optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
    • Epochs: 1
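The hyperparameters above can be collected into a single configuration, here written as a plain dict in the style of Hugging Face `TrainingArguments`. The actual training script is not published, so the field names are an assumption; the values come directly from the list above, and the arithmetic shows how the total train batch size of 128 follows from the per-device settings.

```python
# Hyperparameters from the model card, in Hugging Face TrainingArguments
# style (field names are an assumption; the training script is unpublished).
training_config = {
    "learning_rate": 2e-5,
    "per_device_train_batch_size": 8,
    "per_device_eval_batch_size": 8,
    "gradient_accumulation_steps": 4,
    "num_train_epochs": 1,
    "adam_beta1": 0.9,
    "adam_beta2": 0.999,
    "adam_epsilon": 1e-8,
}

# Effective batch size: 8 per device x 4 accumulation steps x 4 GPUs
# (H100s, per the model name) = 128, matching the reported total.
n_gpus = 4
total_batch = (
    training_config["per_device_train_batch_size"]
    * training_config["gradient_accumulation_steps"]
    * n_gpus
)
print(total_batch)  # 128
```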

Intended Use Cases

Given its fine-tuning on a harmlessness dataset, this model is particularly suited for applications where safety and reduced toxicity are critical. It can be considered for conversational agents, content moderation, or any scenario requiring a language model that adheres to strict safety protocols.
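For such conversational use, a minimal inference sketch with the `transformers` library is below. The `\n\nHuman: ... \n\nAssistant:` prompt template mirrors the format of the Anthropic/hh-rlhf dataset; whether this fine-tuning run used exactly that template is an assumption, so treat this as a starting point rather than the canonical usage.

```python
# Minimal inference sketch (assumes the hh-rlhf Human/Assistant prompt
# template; the exact template used in fine-tuning is not documented).

MODEL_ID = "W-61/llama-3-8b-base-hh-harmless-sft-4xh100"


def format_hh_prompt(user_message: str) -> str:
    """Wrap a single user turn in the hh-rlhf Human/Assistant format."""
    return f"\n\nHuman: {user_message}\n\nAssistant:"


def generate_reply(user_message: str, max_new_tokens: int = 128) -> str:
    """Load the model lazily and greedily decode a reply."""
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(
        MODEL_ID, device_map="auto", torch_dtype="auto"
    )
    inputs = tokenizer(
        format_hh_prompt(user_message), return_tensors="pt"
    ).to(model.device)
    output = model.generate(
        **inputs, max_new_tokens=max_new_tokens, do_sample=False
    )
    # Return only the newly generated tokens, not the echoed prompt.
    new_tokens = output[0][inputs["input_ids"].shape[1]:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True)
```

Greedy decoding (`do_sample=False`) is used here for reproducibility; sampling parameters can be tuned for more varied conversational output.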