kmseong/llama2_7b_chat_resta_lr5e-5_y0.3

Hugging Face
TEXT GENERATIONConcurrency Cost:1Model Size:7BQuant:FP8Ctx Length:4kPublished:Apr 23, 2026License:apache-2.0Architecture:Transformer Open Weights Warm

The kmseong/llama2_7b_chat_resta_lr5e-5_y0.3 is a 7 billion parameter Llama 2-based conversational language model, fine-tuned using the Safety Neuron Tuning (SN-Tune) method. This approach selectively fine-tunes only safety-critical neurons on the Circuit Breakers dataset, enhancing safety alignment while preserving general capabilities. It is designed to provide improved safety compared to its base model, meta-llama/Llama-3.2-3B-Instruct, with a context length of 4096 tokens.

Loading preview...

Overview

This model, kmseong/llama2_7b_chat_resta_lr5e-5_y0.3, is a 7 billion parameter conversational language model derived from meta-llama/Llama-3.2-3B-Instruct. Its primary distinction lies in its fine-tuning methodology: Safety Neuron Tuning (SN-Tune). This technique focuses on enhancing safety alignment without compromising the model's broader capabilities.

Key Capabilities & Features

  • Safety Neuron Tuning (SN-Tune): A selective fine-tuning method that identifies and adjusts only a small set of "safety neurons" critical for alignment.
  • Parameter-Efficient Fine-tuning: By freezing most parameters and only fine-tuning safety neurons, this approach is highly efficient.
  • Enhanced Safety Alignment: Specifically trained on the Circuit Breakers dataset to improve safety responses compared to its base model.
  • Minimal Impact on General Capabilities: Designed to maintain the original model's performance across general tasks while boosting safety.

When to Use This Model

  • Safety-Critical Applications: Ideal for use cases where robust safety alignment is a primary concern.
  • Conversational AI: Suitable for chat applications requiring a balance of general language understanding and reduced harmful outputs.
  • Resource-Constrained Environments: The parameter-efficient SN-Tune method makes it a good choice for scenarios where full model fine-tuning is impractical, but safety improvements are needed.