kmseong/llama2_7b-chat-gsm8k_safelnstr_10p_lr5e-5

Text Generation | Concurrency Cost: 1 | Model Size: 7B | Quant: FP8 | Ctx Length: 4k | Published: Apr 27, 2026 | License: apache-2.0 | Architecture: Transformer | Open Weights | Warm

kmseong/llama2_7b-chat-gsm8k_safelnstr_10p_lr5e-5 is a 7-billion-parameter model fine-tuned by kmseong with the Safety Neuron Tuning (SN-Tune) method. The card names Llama-3.2-3B-Instruct as the base model, which conflicts with the listed 7B size and the llama2_7b-chat prefix in the model name. SN-Tune fine-tunes only safety-critical neurons on safety alignment data, aiming to improve safety without significantly affecting general capabilities. The model targets applications that need stronger safety alignment while retaining the base model's performance.


Model Overview

The card describes kmseong/llama2_7b-chat-gsm8k_safelnstr_10p_lr5e-5 as a 7-billion-parameter variant of the meta-llama/Llama-3.2-3B-Instruct base model. It was fine-tuned by kmseong using Safety Neuron Tuning (SN-Tune), a technique that improves the model's safety alignment while updating only a small fraction of its parameters.

Key Capabilities & Features

  • Enhanced Safety Alignment: The primary goal is improved safety relative to the base model, achieved through targeted fine-tuning.
  • SN-Tune Methodology: The fine-tuning procedure has three steps:
    • Identify "safety neurons", a small subset of neurons that are crucial for safety responses.
    • Freeze all other (non-safety) parameters to preserve general capabilities.
    • Fine-tune only the safety neurons on dedicated safety alignment datasets, such as the Circuit Breakers dataset.
  • Parameter Efficiency: Because only a small portion of the model's parameters is updated, SN-Tune reduces the compute and memory needed for safety fine-tuning.
  • Minimal Impact on General Capabilities: SN-Tune is designed to improve safety without degrading the base model's performance on general tasks.
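The freeze-and-update step described above can be illustrated with a toy sketch. This is not the SN-Tune implementation or the author's training code: the weight matrix, the gradients, the neuron indices, and the helper names `masked_sgd_step` and `trainable_fraction` are all hypothetical, and plain SGD stands in for whatever optimizer was actually used. It only shows the general idea of applying gradient updates to a chosen subset of neurons while every other row of weights stays frozen.

```python
from typing import List, Set

Matrix = List[List[float]]  # one row of weights per neuron (toy representation)


def masked_sgd_step(weights: Matrix, grads: Matrix,
                    safety_neurons: Set[int], lr: float) -> Matrix:
    """Return updated weights; only rows listed in safety_neurons change."""
    updated = []
    for i, (w_row, g_row) in enumerate(zip(weights, grads)):
        if i in safety_neurons:
            # Trainable neuron: ordinary SGD update.
            updated.append([w - lr * g for w, g in zip(w_row, g_row)])
        else:
            # Frozen neuron: weights are copied unchanged.
            updated.append(list(w_row))
    return updated


def trainable_fraction(weights: Matrix, safety_neurons: Set[int]) -> float:
    """Share of all weights that belong to the trainable (safety) neurons."""
    total = sum(len(row) for row in weights)
    trainable = sum(len(weights[i]) for i in safety_neurons)
    return trainable / total


# Hypothetical data: 4 neurons with 2 weights each; neuron 1 is the only
# "safety neuron". Real identification of safety neurons is not shown here.
weights = [[0.5, -0.2], [0.1, 0.3], [-0.4, 0.8], [0.0, 1.0]]
grads = [[1.0, 1.0], [2.0, -2.0], [1.0, 1.0], [0.5, 0.5]]
safety_neurons = {1}

new_weights = masked_sgd_step(weights, grads, safety_neurons, lr=0.1)
fraction = trainable_fraction(weights, safety_neurons)
```

In this toy setup only row 1 of `new_weights` differs from `weights`, and `fraction` reports how small the trainable share is, which is the source of the parameter efficiency claimed above.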

Ideal Use Cases

This model is particularly well-suited for applications where:

  • Safety and responsible AI are paramount: The model is intended for deploying language models in sensitive environments.
  • Maintaining base model performance is crucial: Users get improved safety without a significant trade-off in the model's original capabilities.
  • Efficient safety alignment is desired: SN-Tune adds safety improvements to an existing model while training only a small subset of its parameters.