kmseong/llama3.2_3b_only_sn_tuned_lr1e-5

Text generation · Concurrency cost: 1 · Model size: 3.2B · Quantization: BF16 · Context length: 32k · Published: Apr 6, 2026 · License: apache-2.0 · Architecture: Transformer · Open weights

kmseong/llama3.2_3b_only_sn_tuned_lr1e-5 is a fine-tune of the 3.2-billion-parameter Llama-3.2-3B-Instruct model, trained by kmseong with the Safety Neuron Tuning (SN-Tune) method. SN-Tune selectively fine-tunes only safety-critical neurons, here on the Circuit Breakers dataset, enhancing safety alignment while preserving general capabilities. The model is intended for applications that require improved safety performance with minimal impact on the base model's original instruction-following abilities.


Overview

This model, kmseong/llama3.2_3b_only_sn_tuned_lr1e-5, is a specialized version of the 3.2 billion parameter meta-llama/Llama-3.2-3B-Instruct base model. It has been fine-tuned by kmseong using a technique called Safety Neuron Tuning (SN-Tune).

Key Capabilities & Differentiators

  • Enhanced Safety Alignment: The primary focus of this model is to improve safety. SN-Tune specifically targets and fine-tunes a small set of "safety neurons" within the model.
  • Parameter-Efficient Fine-tuning: By freezing most parameters and adjusting only safety-critical neurons, SN-Tune updates a small fraction of the model's weights, keeping training cost low and limiting drift from the base model.
  • Preservation of General Capabilities: This selective tuning approach aims to enhance safety without significantly degrading the base model's original performance on general tasks.
  • Targeted Training Data: The model was fine-tuned using the "Circuit Breakers" dataset, which is designed for safety alignment.
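The selective-tuning idea behind SN-Tune can be illustrated with a minimal PyTorch sketch: freeze every parameter, then re-enable gradients only for the weight rows feeding a chosen set of hidden neurons. The toy model and the neuron indices below are illustrative assumptions, not the actual procedure used to identify safety neurons in this model.

```python
import torch
import torch.nn as nn

# Toy two-layer MLP standing in for a transformer feed-forward block.
model = nn.Sequential(nn.Linear(8, 16), nn.ReLU(), nn.Linear(16, 4))

# Hypothetical indices of "safety neurons" in the hidden layer
# (SN-Tune would identify these with a scoring procedure).
safety_neurons = [1, 5, 9]

# Freeze every parameter in the model.
for p in model.parameters():
    p.requires_grad_(False)

# Row i of the first weight matrix feeds hidden neuron i, so selective
# tuning means updating only the rows in `safety_neurons`.
w1 = model[0].weight  # shape (16, 8)
w1.requires_grad_(True)

mask = torch.zeros_like(w1)
mask[safety_neurons] = 1.0

# Zero the gradient for all non-safety rows during backprop.
w1.register_hook(lambda g: g * mask)

optimizer = torch.optim.SGD([w1], lr=1e-5)
x, y = torch.randn(2, 8), torch.randn(2, 4)
loss = nn.functional.mse_loss(model(x), y)
loss.backward()
optimizer.step()  # only safety-neuron rows can change
```

After `backward()`, every row of `w1.grad` outside `safety_neurons` is exactly zero, so the optimizer step leaves the rest of the layer untouched.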

When to Use This Model

This model is particularly suitable for use cases where:

  • Improved safety alignment is a critical requirement.
  • Maintaining the general instruction-following capabilities of the Llama-3.2-3B-Instruct base model is important.
  • Developers need a model with enhanced safety features without a complete retraining or extensive fine-tuning process.

It offers a balance between safety and general performance, making it a strong candidate for applications where responsible AI behavior is paramount.
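For developers evaluating the model, a minimal loading sketch with Hugging Face transformers is shown below. The model id comes from this card; the prompt, dtype, and generation parameters are illustrative assumptions, and downloading the weights requires network access.

```python
# Sketch: chat-style generation with this model via transformers.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "kmseong/llama3.2_3b_only_sn_tuned_lr1e-5"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype="bfloat16")

messages = [{"role": "user", "content": "Explain why seatbelts matter."}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
)
outputs = model.generate(inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```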