W-61/llama3-hh-harmless-qt045-b0p01-20260429-085449
W-61/llama3-hh-harmless-qt045-b0p01-20260429-085449 is an 8 billion parameter language model developed by W-61, fine-tuned from llama-3-8b-base-sft-hh-harmless-4xh200. This model specializes in generating harmless responses, having been fine-tuned on the Anthropic/hh-rlhf dataset. With an 8192-token context length, its primary use case is applications requiring safe and non-toxic text generation.
Overview
This model, W-61/llama3-hh-harmless-qt045-b0p01-20260429-085449, is an 8 billion parameter language model developed by W-61. It is a fine-tuned variant of the existing W-61/llama-3-8b-base-sft-hh-harmless-4xh200 model, specifically optimized for harmlessness. The model has a context length of 8192 tokens.
Key Capabilities
- Harmless Text Generation: Fine-tuned on the Anthropic/hh-rlhf dataset to produce safe and non-toxic outputs.
- Base Model Enhancement: Builds upon the `llama-3-8b-base-sft-hh-harmless-4xh200` architecture, inheriting its foundational language understanding.
Good For
- Applications requiring strict content moderation.
- Chatbots or conversational AI where safety and non-harmful responses are paramount.
- Use cases where mitigating biased or toxic language is a primary concern.
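For the use cases above, the model can be loaded like any other causal language model. A minimal sketch, assuming the standard Hugging Face Transformers API (the prompt and `max_new_tokens` value are illustrative, not from the card):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_ID = "W-61/llama3-hh-harmless-qt045-b0p01-20260429-085449"

def generate_reply(prompt: str, max_new_tokens: int = 256) -> str:
    """Generate a completion from the model.

    Note: this downloads the full 8B-parameter weights on first call.
    """
    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(MODEL_ID, device_map="auto")
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    output = model.generate(**inputs, max_new_tokens=max_new_tokens)
    # Decode only the newly generated tokens, skipping the echoed prompt.
    return tokenizer.decode(
        output[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True
    )

if __name__ == "__main__":
    print(generate_reply("How can I politely decline an invitation?"))
```

Keep prompts within the 8192-token context window; longer inputs must be truncated before generation.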
Training Details
The model was trained for 1 epoch with a peak learning rate of 5e-07 and a total batch size of 64, using a cosine learning rate scheduler with a warmup ratio of 0.1. Training ran on 4 GPUs with gradient accumulation over 2 steps, implying a per-device batch size of 8.
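The hyperparameters above can be sanity-checked with a short sketch. This assumes the scheduler behaves like Transformers' `get_cosine_schedule_with_warmup` (linear warmup, then a cosine decay to zero); the function below is an illustration, not the training code:

```python
import math

# Hyperparameters stated in the training details.
PEAK_LR = 5e-7
WARMUP_RATIO = 0.1
GPUS = 4
GRAD_ACCUM = 2
TOTAL_BATCH = 64

# The per-device batch size is implied by the totals:
# total = gpus * grad_accum * per_device  ->  64 = 4 * 2 * 8
per_device_batch = TOTAL_BATCH // (GPUS * GRAD_ACCUM)

def lr_at(step: int, total_steps: int) -> float:
    """Learning rate at a given step: linear warmup, then cosine decay."""
    warmup_steps = int(WARMUP_RATIO * total_steps)
    if step < warmup_steps:
        return PEAK_LR * step / max(1, warmup_steps)
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return PEAK_LR * 0.5 * (1.0 + math.cos(math.pi * progress))
```

With a 0.1 warmup ratio, the rate climbs linearly to 5e-07 over the first 10% of steps and then decays along a cosine curve to approximately zero by the end of the epoch.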