kmseong/llama3.2_3b_gsm8k_ft_5e-5_after_rsn_tuned_lr3e-5_fz

TEXT GENERATIONConcurrency Cost:1Model Size:3.2BQuant:BF16Ctx Length:32kPublished:Apr 8, 2026License:apache-2.0Architecture:Transformer Open Weights Cold

The kmseong/llama3.2_3b_gsm8k_ft_5e-5_after_rsn_tuned_lr3e-5_fz is a 3.2 billion parameter Llama-3.2-3B-Instruct model that has been fine-tuned using the Safety Neuron Tuning (SN-Tune) method. This approach selectively fine-tunes only safety-critical neurons on safety alignment data, enhancing safety without significantly impacting general capabilities. It is designed for applications requiring improved safety alignment in a parameter-efficient manner.

Loading preview...

Overview

This model, kmseong/llama3.2_3b_gsm8k_ft_5e-5_after_rsn_tuned_lr3e-5_fz, is a specialized version of the Meta Llama-3.2-3B-Instruct base model. It has undergone Safety Neuron Tuning (SN-Tune), a unique fine-tuning methodology developed by kmseong.

Key Capabilities & Features

  • Enhanced Safety Alignment: The primary focus of this model is to provide improved safety compared to its base model.
  • Parameter-Efficient Fine-tuning: SN-Tune works by identifying and selectively fine-tuning only a small subset of "safety neurons" within the model architecture. All other parameters remain frozen.
  • Minimal Impact on General Capabilities: By targeting only safety-critical neurons, the method aims to enhance safety without degrading the model's broader performance or general knowledge.
  • Training Data: Fine-tuned on the "Circuit Breakers" dataset, which is specifically designed for safety alignment.

What is SN-Tune?

SN-Tune is a selective fine-tuning approach that involves:

  1. Detecting specific neurons deemed critical for safety responses.
  2. Freezing all non-safety-related parameters.
  3. Fine-tuning only these identified safety neurons using dedicated safety datasets.

Good For

  • Applications where safety and responsible AI behavior are paramount.
  • Scenarios requiring efficient fine-tuning to adapt a base model for safety without extensive computational resources.
  • Developers looking for a 3.2 billion parameter model with explicit safety alignment built-in.