Overview
This model, hh-harmless-base-llama3-8b-sft, is an 8-billion-parameter language model derived from meta-llama/Meta-Llama-3-8B. It has undergone supervised fine-tuning (SFT) with the TRL library to improve its harmlessness characteristics.
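As with any Transformers checkpoint, the model can be loaded with the `AutoModelForCausalLM` API. A minimal sketch follows; note that the hub path is assumed from the model name and the example prompt is illustrative:

```python
# Sketch of loading and prompting the model with Hugging Face Transformers.
# The hub path below is an assumption based on the model name; adjust it to
# the actual repository before use.
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_ID = "hh-harmless-base-llama3-8b-sft"  # assumed hub path

def load_model(model_id: str = MODEL_ID):
    """Load the tokenizer and weights (roughly 16 GB in half precision)."""
    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(
        model_id, torch_dtype="auto", device_map="auto"
    )
    return tokenizer, model

def generate(prompt: str, max_new_tokens: int = 128) -> str:
    """Generate a completion for a single prompt."""
    tokenizer, model = load_model()
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    outputs = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(outputs[0], skip_special_tokens=True)

# Example call (downloads the full checkpoint, so it is left commented out):
# print(generate("How can I give constructive feedback kindly?"))
```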
Key Capabilities
- Harmless Text Generation: Fine-tuned to produce benign and safe responses.
- Base Llama 3 Architecture: Leverages the robust architecture of Meta's Llama 3 8B model.
- TRL Framework: Utilizes the Transformer Reinforcement Learning (TRL) library for its training procedure.
Training Details
The model was trained with SFT using the following framework versions: TRL (0.29.0), Transformers (5.2.0), PyTorch (2.10.0), Datasets (4.6.1), and Tokenizers (0.22.2). The training run can be visualized via Weights & Biases.
Good For
- Applications requiring a base language model with an emphasis on generating harmless content.
- Developers looking for a Llama 3 variant with enhanced safety features through SFT.