Model Overview
activeDap/Llama-3.2-3B_hh_harmful is a 3-billion-parameter language model derived from the meta-llama/Llama-3.2-3B base model (part of the Llama 3.2 family). It has undergone Supervised Fine-Tuning (SFT) on the activeDap/sft-harm-data dataset, which focuses on harmful content, with the aim of changing how the model responds to potentially harmful inputs.
Key Training Details
- Base Model: meta-llama/Llama-3.2-3B
- Dataset: activeDap/sft-harm-data
- Training Method: Supervised Fine-Tuning (SFT) with Assistant-only loss
- Max Sequence Length: 512 tokens
- Total Steps: 35
- Final Training Loss: 2.0121
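The "Assistant-only loss" above means that only the assistant's reply tokens contribute to the training loss; prompt tokens are masked out. A minimal sketch of that masking, using the PyTorch convention that label `-100` is ignored by cross-entropy (the token IDs and `mask_prompt_labels` helper are illustrative, not from the actual training code):

```python
# Assistant-only SFT loss masking: positions belonging to the prompt are
# labeled -100 so cross-entropy ignores them, and only the assistant's
# reply tokens contribute to the gradient.
IGNORE_INDEX = -100  # PyTorch cross_entropy's default ignore_index

def mask_prompt_labels(input_ids, prompt_len):
    """Copy input_ids into labels, masking the first prompt_len positions."""
    labels = list(input_ids)
    for i in range(prompt_len):
        labels[i] = IGNORE_INDEX
    return labels

# Example: a 6-token sequence where the first 4 tokens are the user prompt.
ids = [101, 2054, 2003, 102, 3449, 104]
print(mask_prompt_labels(ids, 4))  # only the last two tokens carry loss
```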
Intended Use Cases
This model is suited to scenarios where a small, efficient language model must handle harmful or sensitive prompts, and to research on model safety and alignment against specific harmful datasets. Developers can integrate it into applications that require a degree of content moderation or safety alignment, particularly in settings where the base Llama-3.2-3B might generate undesirable outputs.
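A sketch of loading the checkpoint for inference with Hugging Face transformers; the repo id comes from this card, while the prompt and generation settings are illustrative defaults, not values from the training run:

```python
# Hedged usage sketch: load the fine-tuned checkpoint and generate a reply.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "activeDap/Llama-3.2-3B_hh_harmful"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

prompt = "How should an assistant respond to a risky request?"
inputs = tokenizer(prompt, return_tensors="pt")
# max_new_tokens kept well under the 512-token training context
outputs = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

Note that prompts longer than the 512-token training context may degrade output quality, since the model was never fine-tuned on longer sequences.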