kmseong/Llama-3.2-3B-only-sn-tuned_10
kmseong/Llama-3.2-3B-only-sn-tuned_10 is a 3-billion-parameter model, fine-tuned by kmseong from Llama-3.2-3B-Instruct using the Safety Neuron Tuning (SN-Tune) method. This approach fine-tunes only safety-critical neurons, trained on the Circuit Breakers dataset, enhancing safety alignment while preserving general capabilities. It is designed for applications that require improved safety performance with minimal impact on the base model's original functions.
Overview
kmseong/Llama-3.2-3B-only-sn-tuned_10 is a 3-billion-parameter language model based on meta-llama/Llama-3.2-3B-Instruct. It has undergone a specialized fine-tuning process called SN-Tune (Safety Neuron Tuning) to enhance its safety alignment. The SN-Tune method identifies a small set of "safety neurons" and selectively fine-tunes only those neurons on dedicated safety data, specifically the Circuit Breakers dataset.
Key Capabilities
- Enhanced Safety Alignment: Significantly improves the model's safety profile compared to its base model through targeted fine-tuning.
- Preservation of General Capabilities: By freezing most parameters and only adjusting safety neurons, the model aims to maintain its original performance on general tasks.
- Parameter-Efficient Fine-tuning: The SN-Tune approach is highly efficient, requiring adjustments to only a small subset of the model's parameters.
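The core idea behind this kind of selective tuning can be sketched with plain NumPy: freeze every parameter except the rows of a weight matrix that correspond to the selected neurons, by masking the gradient before each update. This is a toy illustration of neuron-level gradient masking, not the actual SN-Tune implementation; the layer sizes, learning rate, and choice of "safety neurons" below are all hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
W = rng.normal(size=(8, 4))      # toy weight matrix: 8 output neurons, 4 inputs
safety_neurons = [2, 5]          # hypothetical indices of neurons selected for tuning

mask = np.zeros_like(W)
mask[safety_neurons, :] = 1.0    # gradients flow only into the selected rows

x = rng.normal(size=(4,))        # toy input
target = np.zeros(8)             # toy training target

W_before = W.copy()
for _ in range(10):
    y = W @ x                            # forward pass
    grad = np.outer(y - target, x)       # grad of 0.5 * ||Wx - target||^2 w.r.t. W
    W -= 0.1 * (grad * mask)             # masked update: frozen neurons untouched

frozen = [i for i in range(8) if i not in safety_neurons]
assert np.allclose(W[frozen], W_before[frozen])  # non-safety neurons unchanged
```

Because only two of the eight neuron rows ever receive updates, the rest of the network is bitwise identical to the base model after training, which is what lets the method preserve general capabilities.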
Good For
- Applications where robust safety alignment is a primary concern.
- Scenarios requiring a smaller, efficient model with improved safety features.
- Developers looking for a Llama-3.2-3B variant with specific safety enhancements without sacrificing broad utility.
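The model can be loaded like any other Llama-3.2-Instruct checkpoint with the Hugging Face transformers library. The snippet below is a minimal sketch: it assumes transformers and PyTorch are installed, uses the chat template inherited from the base model, and the prompt is purely illustrative.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Model ID from this card; the chat-template usage follows the
# standard Llama-3.2-Instruct convention of the base model.
model_id = "kmseong/Llama-3.2-3B-only-sn-tuned_10"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

messages = [{"role": "user", "content": "In one sentence, what is safety alignment?"}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output = model.generate(input_ids, max_new_tokens=128)
reply = tokenizer.decode(output[0][input_ids.shape[-1]:], skip_special_tokens=True)
print(reply)
```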