kmseong/llama3_2_3b_instruct_rsn_tuned_math_ft_lr5e-5

Text Generation · Concurrency Cost: 1 · Model Size: 3.2B · Quant: BF16 · Ctx Length: 32k · Published: Apr 28, 2026 · License: apache-2.0 · Architecture: Transformer · Open Weights

kmseong/llama3_2_3b_instruct_rsn_tuned_math_ft_lr5e-5 is a 3.2-billion-parameter fine-tune of Llama-3.2-3B-Instruct, produced by kmseong using the Safety Neuron Tuning (SN-Tune) method. SN-Tune strengthens safety alignment by selectively fine-tuning only the neurons most critical to safety, trained here on the Circuit Breakers dataset. The goal is improved safety without a significant loss of general capability, making this a parameter-efficient approach to safety alignment.


Model Overview

This model, kmseong/llama3_2_3b_instruct_rsn_tuned_math_ft_lr5e-5, is a 3.2-billion-parameter instruction-tuned variant of the Llama-3.2-3B-Instruct base model. Developed by kmseong, its primary distinction is its fine-tuning methodology: Safety Neuron Tuning (SN-Tune).

Key Capabilities & Features

  • Enhanced Safety Alignment: The model has undergone SN-Tune, a selective fine-tuning process that identifies and trains only specific "safety neurons" crucial for alignment.
  • Parameter-Efficient Fine-tuning: By freezing non-safety parameters and focusing only on safety neurons, this method achieves safety improvements with minimal computational overhead.
  • Minimal Impact on General Capabilities: The SN-Tune approach aims to enhance safety without degrading the base model's broader performance.
  • Training Data: Fine-tuned on the Circuit Breakers dataset, which is designed specifically for safety alignment.
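The selective update at the heart of SN-Tune can be sketched in a few lines: every parameter is frozen except a small designated set. The neuron indices and the plain gradient step below are illustrative only; the actual SN-Tune procedure for identifying safety neurons is not shown.

```python
# Toy sketch of selective fine-tuning in the style of SN-Tune:
# parameters outside the "safety neuron" set receive no update.
# Indices and learning rate here are illustrative, not from the paper.

def sn_tune_step(params, grads, safety_neurons, lr=5e-5):
    """Apply one gradient step only to the selected safety neurons."""
    return [
        p - lr * g if i in safety_neurons else p  # frozen elsewhere
        for i, (p, g) in enumerate(zip(params, grads))
    ]

params = [1.0, 2.0, 3.0, 4.0]
grads = [0.5, 0.5, 0.5, 0.5]
safety = {1, 3}  # hypothetical safety-neuron indices

updated = sn_tune_step(params, grads, safety, lr=0.1)
# indices 0 and 2 are untouched; 1 and 3 move by -lr * grad
```

In a real training loop the same effect is usually achieved by setting `requires_grad = False` on frozen tensors so the optimizer never sees them, which is what makes the method cheap in both compute and memory.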

When to Use This Model

  • Safety-Critical Applications: Ideal for use cases where robust safety alignment is a primary concern, offering improved safeguards compared to its base model.
  • Resource-Constrained Environments: At 3.2B parameters the model is small enough for deployments with limited compute, and the selective fine-tuning keeps training costs low as well.
  • Building Safer AI Systems: A good fit for developers who want a model with a targeted, fine-tuning-based approach to mitigating harmful outputs.
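For the use cases above, the model can be loaded like any other Hugging Face causal LM. This is a generic `transformers` sketch, not an official snippet from the author; the prompt and generation settings are illustrative.

```python
# Minimal inference sketch using the Hugging Face `transformers`
# text-generation pipeline. Generation settings are illustrative.

MODEL_ID = "kmseong/llama3_2_3b_instruct_rsn_tuned_math_ft_lr5e-5"

def generate(prompt: str, max_new_tokens: int = 256) -> str:
    """Run one chat turn through the model in BF16."""
    import torch
    from transformers import pipeline

    pipe = pipeline(
        "text-generation",
        model=MODEL_ID,
        torch_dtype=torch.bfloat16,  # matches the published BF16 weights
        device_map="auto",
    )
    messages = [{"role": "user", "content": prompt}]
    out = pipe(messages, max_new_tokens=max_new_tokens)
    # The pipeline returns the full chat transcript; take the last
    # (assistant) message.
    return out[0]["generated_text"][-1]["content"]

if __name__ == "__main__":
    print(generate("Explain why you refuse unsafe requests."))
```

Because the chat template is inherited from Llama-3.2-3B-Instruct, passing a `messages` list (rather than raw text) lets the pipeline apply that template automatically.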