W-61/llama3-hh-harmless-qt045-b0p3-20260429-085449
W-61/llama3-hh-harmless-qt045-b0p3-20260429-085449 is an 8-billion-parameter language model developed by W-61, fine-tuned from a Llama 3 base. It specializes in generating harmless responses, having been fine-tuned on the Anthropic/hh-rlhf dataset, and is optimized for safety-critical applications where mitigating harmful outputs is a primary concern.
Model Overview
This model, llama3-hh-harmless-qt045-b0p3-20260429-085449, is an 8 billion parameter language model developed by W-61. It is a fine-tuned variant of the W-61/llama-3-8b-base-sft-hh-harmless-4xh200 base model, specifically optimized for harmlessness.
Key Capabilities
- Harmless Response Generation: The model has undergone fine-tuning on the Anthropic/hh-rlhf dataset, which focuses on human feedback for helpfulness and harmlessness. This training aims to reduce the generation of undesirable or harmful content.
- Llama 3 Architecture: Built upon the Llama 3 8B architecture, it inherits the foundational capabilities of this model family.
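Since the model follows the standard Llama 3 architecture, it can be loaded with the Hugging Face `transformers` library. The sketch below is illustrative only (the chat template and generation settings are assumptions, not confirmed by this card; verify the model's expected prompt format before use):

```python
# Minimal usage sketch, assuming the model ships a standard Llama 3 chat template.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "W-61/llama3-hh-harmless-qt045-b0p3-20260429-085449"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype="auto", device_map="auto")

messages = [{"role": "user", "content": "How do I safely dispose of old batteries?"}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

# Greedy decoding for reproducible, conservative outputs.
out = model.generate(inputs, max_new_tokens=256, do_sample=False)
print(tokenizer.decode(out[0][inputs.shape[-1]:], skip_special_tokens=True))
```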
Training Details
The model was trained using the following hyperparameters:
- Learning Rate: 5e-07
- Batch Size: A total training batch size of 64 (8 per device across 4 GPUs with 2 gradient accumulation steps).
- Optimizer: AdamW (`adamw_torch`) with default betas and epsilon.
- Scheduler: Cosine learning rate scheduler with a 0.1 warmup ratio.
- Epochs: Trained for 1 epoch.
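The batch-size arithmetic and schedule above can be sketched in plain Python. This mirrors the default behavior of `transformers`' `get_cosine_schedule_with_warmup` (linear warmup, then cosine decay to zero), which is an assumption; the card does not state the exact scheduler implementation:

```python
import math

# Hyperparameters from the training details above.
peak_lr = 5e-7
warmup_ratio = 0.1
per_device_batch = 8
num_gpus = 4
grad_accum = 2

# Effective (total) training batch size: 8 per device x 4 GPUs x 2 accumulation steps.
total_batch = per_device_batch * num_gpus * grad_accum  # 64

def lr_at(step, total_steps):
    """Learning rate at a given step: linear warmup over the first
    warmup_ratio fraction of steps, then cosine decay to zero."""
    warmup_steps = int(total_steps * warmup_ratio)
    if step < warmup_steps:
        return peak_lr * step / max(1, warmup_steps)
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return peak_lr * 0.5 * (1.0 + math.cos(math.pi * progress))
```

For example, with 1000 total steps the warmup covers the first 100 steps, the learning rate peaks at 5e-7 at step 100, and decays back to zero by step 1000.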
Intended Use Cases
This model is particularly suited for applications where the primary goal is to ensure generated text is safe and non-harmful, such as:
- Content moderation systems.
- Chatbots requiring strict safety guidelines.
- Applications where mitigating biased or toxic outputs is critical.