activeDap/Llama-3.2-3B_hh_harmful

Hosted on Hugging Face · Text generation · Model size: 3.2B · Quantization: BF16 · Context length: 32k · Published: Nov 6, 2025 · License: apache-2.0 · Architecture: Transformer · Open weights

The activeDap/Llama-3.2-3B_hh_harmful model is a 3.2-billion-parameter variant of Llama-3.2-3B, fine-tuned by activeDap on the sft-harm-data dataset. The supervised fine-tuning specifically targets responses to harmful prompts, aiming to align the model's behavior in sensitive contexts. It is intended for applications that need a smaller, specialized language model with stronger safety behavior around harmful content generation.


Model Overview

activeDap/Llama-3.2-3B_hh_harmful is a 3.2 billion parameter language model derived from the meta-llama/Llama-3.2-3B base model. It has undergone Supervised Fine-Tuning (SFT) using the activeDap/sft-harm-data dataset, which focuses on harmful content. This fine-tuning process aims to modify the model's responses to potentially harmful inputs.
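A minimal loading-and-generation sketch with the `transformers` library, assuming standard Hugging Face Hub access. The repo id and BF16 dtype come from this card; the helper function, prompt, and generation settings are illustrative, not part of an official API for this model.

```python
# Hypothetical usage sketch for activeDap/Llama-3.2-3B_hh_harmful.
# Assumes transformers and torch are installed and the weights are
# downloadable from the Hub; settings below are illustrative.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_ID = "activeDap/Llama-3.2-3B_hh_harmful"

def generate_reply(prompt: str, max_new_tokens: int = 256) -> str:
    """Load the model (cached after the first call) and greedily decode a reply."""
    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(
        MODEL_ID,
        torch_dtype=torch.bfloat16,  # card lists BF16 weights
        device_map="auto",
    )
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    output = model.generate(**inputs, max_new_tokens=max_new_tokens, do_sample=False)
    # Strip the prompt tokens so only the model's continuation is returned.
    return tokenizer.decode(
        output[0][inputs["input_ids"].shape[-1]:], skip_special_tokens=True
    )

if __name__ == "__main__":
    print(generate_reply("Explain why sharing someone else's password is harmful."))
```

Greedy decoding (`do_sample=False`) is used here to make the safety behavior easier to inspect reproducibly; sampling parameters can be tuned for production use.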

Key Training Details

  • Base Model: meta-llama/Llama-3.2-3B
  • Dataset: activeDap/sft-harm-data
  • Training Method: Supervised Fine-Tuning (SFT) with Assistant-only loss
  • Max Sequence Length: 512 tokens
  • Total Steps: 35
  • Final Training Loss: 2.0121
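The "assistant-only loss" noted above means the cross-entropy loss is computed only on the assistant's reply tokens: label ids for the prompt portion are set to the ignore index (-100 in PyTorch), so they contribute nothing to the gradient. A minimal sketch of that masking, with illustrative token ids and boundary index:

```python
# Sketch of assistant-only loss masking during SFT. Token ids and the
# assistant_start boundary are toy values; PyTorch's CrossEntropyLoss
# skips positions whose label equals ignore_index (-100 by default).
IGNORE_INDEX = -100

def mask_prompt_labels(input_ids, assistant_start):
    """Copy input_ids as labels, masking every token before the assistant reply."""
    labels = list(input_ids)
    for i in range(min(assistant_start, len(labels))):
        labels[i] = IGNORE_INDEX
    return labels

# Toy sequence: 4 prompt tokens followed by a 3-token assistant reply.
labels = mask_prompt_labels([11, 12, 13, 14, 21, 22, 23], assistant_start=4)
print(labels)  # → [-100, -100, -100, -100, 21, 22, 23]
```

In practice, libraries such as TRL apply this masking automatically when configured for completion-only training; the sketch just shows the underlying idea.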

Intended Use Cases

This model suits scenarios where a small, efficient language model must behave more safely when confronted with harmful or sensitive prompts. Developers can integrate it into applications that require content moderation or safety alignment, particularly where the base Llama-3.2-3B might generate undesirable outputs. It is also well suited to research on model safety and alignment with targeted harmful-content datasets.