kmseong/llama2_7b_chat_gsm8k_ft_freeze_sn_lr5e-5_revised

Hugging Face
TEXT GENERATIONConcurrency Cost:1Model Size:7BQuant:FP8Ctx Length:4kPublished:May 1, 2026License:apache-2.0Architecture:Transformer Open Weights Warm

The kmseong/llama2_7b_chat_gsm8k_ft_freeze_sn_lr5e-5_revised model is a 7 billion parameter Llama 2-based language model, fine-tuned by kmseong using a Safety Neuron Tuning (SN-Tune) method. This approach selectively fine-tunes only safety-critical neurons on safety alignment data, enhancing safety while preserving general capabilities. It is designed for applications requiring improved safety alignment with minimal impact on the base model's performance.

Loading preview...

Overview

This model, kmseong/llama2_7b_chat_gsm8k_ft_freeze_sn_lr5e-5_revised, is a 7 billion parameter Llama 2-based language model that has undergone a specialized fine-tuning process called SN-Tune (Safety Neuron Tuning). Developed by kmseong, this method focuses on enhancing the model's safety alignment without compromising its broader capabilities.

Key Capabilities & Features

  • Safety Neuron Tuning (SN-Tune): A unique fine-tuning approach that identifies and selectively trains only a small set of "safety neurons" on dedicated safety alignment data (Circuit Breakers dataset).
  • Parameter-Efficient Fine-tuning: By freezing non-safety parameters, SN-Tune minimizes the computational cost and potential degradation of general knowledge during safety alignment.
  • Enhanced Safety Alignment: Specifically designed to improve the model's adherence to safety guidelines and reduce the generation of harmful content compared to its base model.
  • Base Model Preservation: Aims to maintain the general performance and capabilities of the original Llama 2 7B Instruct model while integrating safety improvements.

Ideal Use Cases

This model is particularly well-suited for applications where:

  • Safety is paramount: When deploying LLMs in sensitive environments or for public-facing applications where content safety is a critical concern.
  • Efficiency is desired: For developers looking for a safety-aligned model without the need for extensive re-training or significant performance trade-offs.
  • Llama 2 7B Instruct base is preferred: Users who are already familiar with or require the characteristics of the Llama 2 7B Instruct architecture but need an added layer of safety.