kmseong/llama3.2_3b_instruct_MATH-FT-after-safety-FT-lr1e-6
Text Generation · Concurrency Cost: 1 · Model Size: 3.2B · Quant: BF16 · Ctx Length: 32k · Published: Apr 11, 2026 · License: apache-2.0 · Architecture: Transformer · Open Weights

kmseong/llama3.2_3b_instruct_MATH-FT-after-safety-FT-lr1e-6 is a 3.2-billion-parameter model based on Llama-3.2-3B-Instruct, fine-tuned by kmseong using the Safety Neuron Tuning (SN-Tune) method. The model is specifically enhanced for safety alignment by selectively fine-tuning only critical safety neurons on the Circuit Breakers dataset, and it retains its general capabilities while providing improved safety over the base model.


Overview

This model, kmseong/llama3.2_3b_instruct_MATH-FT-after-safety-FT-lr1e-6, is a 3.2-billion-parameter instruction-tuned variant of the Llama-3.2-3B-Instruct base model. It has undergone a specialized fine-tuning process called Safety Neuron Tuning (SN-Tune), developed by kmseong, to enhance its safety alignment.

Key Capabilities

  • Enhanced Safety Alignment: The primary focus of this model is improved safety, achieved through the SN-Tune method.
  • Parameter-Efficient Fine-tuning: SN-Tune selectively fine-tunes only a small set of "safety neurons" on dedicated safety data (Circuit Breakers dataset), while freezing other parameters.
  • Preservation of General Capabilities: This selective fine-tuning approach aims to minimize impact on the model's original general performance and capabilities.
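The selective-tuning idea above can be sketched as follows. This is a conceptual illustration only, not the actual SN-Tune implementation: the parameter names and the selection rule are invented for the example, and real SN-Tune identifies safety neurons from the model's internals rather than from a hand-written list.

```python
# Conceptual sketch of selective "safety neuron" fine-tuning.
# NOT the actual SN-Tune code: parameter names and the selection set
# are illustrative. In a real PyTorch setup, the same effect comes
# from toggling param.requires_grad per parameter.

def select_trainable(params, safety_names):
    """Mark only the listed safety parameters as trainable; freeze the rest."""
    return {name: name in safety_names for name in params}

# Toy "model": parameter name -> parameter (values are placeholders).
params = {
    "layers.0.mlp.weight": 0.0,
    "layers.0.safety_neurons.weight": 0.0,  # hypothetical safety neurons
    "layers.1.mlp.weight": 0.0,
    "lm_head.weight": 0.0,
}

trainable = select_trainable(params, {"layers.0.safety_neurons.weight"})

# Only the selected safety neurons would receive gradient updates;
# everything else stays frozen, preserving general capabilities.
for name, is_trainable in trainable.items():
    print(name, "trainable" if is_trainable else "frozen")
```

Because the vast majority of parameters stay frozen, the fine-tuning touches only the dedicated safety data (here, the Circuit Breakers dataset) through a small slice of the network.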

Good For

  • Applications requiring robust safety: Ideal for use cases where mitigating harmful outputs and ensuring safe interactions are critical.
  • Developers seeking a safety-aligned Llama-3.2-3B-Instruct variant: Offers a pre-tuned option with a focus on safety without significantly altering the base model's core functionalities.
  • Research into safety alignment techniques: Demonstrates the application of SN-Tune for practical safety enhancements in LLMs.
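As a Llama-3.2-3B-Instruct variant, the model should load through the standard Hugging Face `transformers` chat workflow. The sketch below assumes `transformers` and `torch` are installed; the generation settings are illustrative, and running it downloads the full checkpoint.

```python
def build_messages(user_prompt):
    """Standard chat-format message list for an instruct-tuned model."""
    return [{"role": "user", "content": user_prompt}]

def generate(prompt, max_new_tokens=256):
    # Imports are local so build_messages stays dependency-free.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    model_id = "kmseong/llama3.2_3b_instruct_MATH-FT-after-safety-FT-lr1e-6"
    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(
        model_id, torch_dtype=torch.bfloat16, device_map="auto"
    )

    # Apply the Llama chat template and generate a completion.
    inputs = tokenizer.apply_chat_template(
        build_messages(prompt), add_generation_prompt=True, return_tensors="pt"
    ).to(model.device)
    output = model.generate(inputs, max_new_tokens=max_new_tokens)
    # Decode only the newly generated tokens.
    return tokenizer.decode(output[0][inputs.shape[-1]:], skip_special_tokens=True)

# Example usage (requires downloading the ~3B-parameter checkpoint):
# print(generate("Solve for x: 2x + 3 = 11"))
```

BF16 weights and a 32k context are reflected in the `torch_dtype=torch.bfloat16` choice above; `device_map="auto"` places the model on GPU when one is available.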