kmseong/llama-3.2-3b-instruct-only-rsn-tuned-lr5e-5

Hugging Face
TEXT GENERATIONConcurrency Cost:1Model Size:3.2BQuant:BF16Ctx Length:32kPublished:May 2, 2026License:apache-2.0Architecture:Transformer Open Weights Warm

The kmseong/llama-3.2-3b-instruct-only-rsn-tuned-lr5e-5 is a 3.2 billion parameter Llama-3.2-3B-Instruct model, fine-tuned by kmseong using the Safety Neuron Tuning (SN-Tune) method. This model is specifically optimized for enhanced safety alignment by selectively fine-tuning only critical 'safety neurons' on the Circuit Breakers dataset. It maintains general capabilities while providing improved safety, making it suitable for applications requiring robust content moderation and responsible AI interactions.

Loading preview...

Model Overview

This model, llama-3.2-3b-instruct-only-rsn-tuned-lr5e-5, is a 3.2 billion parameter instruction-tuned variant of the meta-llama/Llama-3.2-3B-Instruct base model. Developed by kmseong, its primary distinction lies in its fine-tuning methodology: Safety Neuron Tuning (SN-Tune).

Key Capabilities & Features

  • Enhanced Safety Alignment: The model has undergone SN-Tune, a selective fine-tuning process that identifies and adjusts only a small subset of 'safety neurons' critical for responsible AI behavior.
  • Parameter-Efficient Fine-tuning: By freezing most parameters and only fine-tuning safety-critical neurons, this method achieves safety improvements with minimal computational overhead.
  • Preservation of General Capabilities: The SN-Tune approach aims to enhance safety without significantly impacting the model's broader instruction-following and general reasoning abilities.
  • Specialized Training Data: Fine-tuned on the "Circuit Breakers" dataset, specifically designed for safety alignment.

When to Use This Model

This model is particularly well-suited for use cases where:

  • Safety and responsible AI are paramount: It offers improved safeguards against generating harmful or undesirable content compared to its base model.
  • Maintaining general instruction-following is important: The SN-Tune method is designed to minimize degradation of non-safety-related performance.
  • Resource efficiency is a concern: The parameter-efficient fine-tuning makes it a practical choice for deploying safety-aligned LLMs.