kmseong/llama3_8b_instruct-MATH_FT_lr5e-5
The kmseong/llama3_8b_instruct-MATH_FT_lr5e-5 is an 8 billion parameter Llama-3.2-3B-Instruct model, fine-tuned by kmseong using the Safety Neuron Tuning (SN-Tune) method. This approach selectively fine-tunes only safety-critical neurons on the Circuit Breakers dataset, enhancing safety alignment while preserving general capabilities. It is optimized for improved safety performance compared to its base model, making it suitable for applications requiring robust safety features.
Loading preview...
Model Overview
This model, kmseong/llama3_8b_instruct-MATH_FT_lr5e-5, is an 8 billion parameter instruction-tuned variant of the meta-llama/Llama-3.2-3B-Instruct base model. It has been specifically fine-tuned by kmseong using a novel technique called Safety Neuron Tuning (SN-Tune).
Key Features & SN-Tune Method
SN-Tune is a parameter-efficient fine-tuning approach designed to enhance model safety without significantly impacting its general performance. The core principles of SN-Tune include:
- Safety Neuron Detection: Identifying a small subset of neurons within the model that are critical for safety-related responses.
- Selective Fine-tuning: Freezing all non-safety parameters and exclusively fine-tuning these identified safety neurons.
- Training Data: The fine-tuning process utilizes the Circuit Breakers dataset, which is specifically designed for safety alignment.
This method results in:
- Enhanced safety alignment compared to the original base model.
- Minimal degradation of the model's broader capabilities.
- A highly efficient fine-tuning process.
Intended Use
This model is particularly well-suited for applications where improved safety alignment is a primary concern. Developers can leverage this model for tasks requiring a robust and safer instruction-following LLM, building upon the capabilities of the Llama-3.2-3B-Instruct base model while mitigating potential safety risks through its specialized fine-tuning.