kmseong/llama2_7b_only_sn_tuned_lr3e-5
kmseong/llama2_7b_only_sn_tuned_lr3e-5 is a 7-billion-parameter, Llama 2-based causal language model developed by kmseong. It was fine-tuned with the Safety Neuron Tuning (SN-Tune) method on the Circuit Breakers dataset to improve safety alignment. SN-Tune fine-tunes only the neurons identified as critical for safety, which is intended to preserve general capabilities while improving safety. The model targets applications that need stronger safety behavior with minimal impact on core model performance.
Overview
This model, kmseong/llama2_7b_only_sn_tuned_lr3e-5, is a 7-billion-parameter variant of the Llama 2 architecture, fine-tuned by kmseong. Its core differentiator is Safety Neuron Tuning (SN-Tune), a method designed to improve safety alignment without significantly compromising the model's general capabilities.
Key Capabilities & Features
- Safety Neuron Tuning (SN-Tune): A selective fine-tuning approach that identifies and tunes only a small subset of neurons critical for safety.
- Parameter-Efficient Fine-tuning: Non-safety parameters are frozen and only the safety neurons are updated, so only a small fraction of the weights is trained.
- Enhanced Safety Alignment: Trained on the Circuit Breakers dataset, it aims to provide improved safety compared to its base model.
- Base Model: Built upon meta-llama/Llama-3.2-3B-Instruct (note: the README states 3.2-3B, but the model name implies 7B).
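The selective tuning idea described in the list above (freeze everything, update only the safety neurons) can be sketched roughly as follows. This is a toy illustration in PyTorch, not the author's training code: the tiny network, the `safety_neurons` indices, and the learning rate are all hypothetical, and the step that identifies safety neurons is not shown.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Toy stand-in for one MLP block; the real model is far larger.
model = nn.Sequential(nn.Linear(8, 16), nn.ReLU(), nn.Linear(16, 8))

# Hypothetical safety-neuron indices (rows of the first layer's weight).
# In practice these would come from a neuron-identification step.
safety_neurons = torch.tensor([2, 5, 11])

# Freeze every parameter by default.
for p in model.parameters():
    p.requires_grad_(False)

# Re-enable gradients on the layer that holds the safety neurons,
# then zero the gradient of every row that is not a safety neuron.
layer = model[0]
layer.weight.requires_grad_(True)
row_mask = torch.zeros(layer.weight.shape[0], 1)
row_mask[safety_neurons] = 1.0
layer.weight.register_hook(lambda grad: grad * row_mask)

# Plain SGD without momentum or weight decay, so a zero gradient
# leaves a weight exactly unchanged.
optimizer = torch.optim.SGD([layer.weight], lr=0.1)

before = layer.weight.detach().clone()
x, target = torch.randn(32, 8), torch.randn(32, 8)
loss = nn.functional.mse_loss(model(x), target)
optimizer.zero_grad()
loss.backward()
optimizer.step()

# Rows whose weights changed after one step.
changed_rows = (layer.weight.detach() != before).any(dim=1)
trainable = int(row_mask.sum().item()) * layer.weight.shape[1]
total = sum(p.numel() for p in model.parameters())
print("updated rows:", changed_rows.nonzero().flatten().tolist())
print(f"effectively trainable: {trainable}/{total} parameters")
```

Only rows listed in `safety_neurons` can change, because the hook multiplies every other row's gradient by zero before the optimizer sees it. Masking gradients rather than splitting the weight tensor keeps the model's structure untouched, which is one simple way to tune a subset of neurons inside an otherwise frozen layer.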
Why Use This Model?
This model is particularly suitable for use cases where:
- Safety is paramount: It offers an explicitly safety-aligned foundation.
- General capabilities need to be preserved: The SN-Tune method minimizes impact on the model's original performance.
- Efficient fine-tuning is desired: The selective tuning approach is resource-efficient.
It provides a balance between maintaining the broad utility of a Llama 2 model and integrating specific safety enhancements through a targeted fine-tuning process.
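For readers who want to try the model, a minimal loading sketch follows. It assumes the checkpoint is published on the Hugging Face Hub under the ID in the title and loads with the standard `transformers` auto classes; neither assumption is confirmed by this page. The helper name `generate_reply` is hypothetical.

```python
MODEL_ID = "kmseong/llama2_7b_only_sn_tuned_lr3e-5"


def generate_reply(prompt: str, max_new_tokens: int = 64) -> str:
    """Load the checkpoint and return only the newly generated text."""
    # Imported lazily because transformers is a heavy dependency.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    # Assumption: the repository ships both tokenizer and model weights.
    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(MODEL_ID)
    model.eval()

    inputs = tokenizer(prompt, return_tensors="pt")
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)

    # Drop the prompt tokens so only the continuation is decoded.
    new_tokens = output_ids[0, inputs["input_ids"].shape[1]:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True)
```

Calling `generate_reply("How do I reset my router?")` would download the weights on first use, so expect a multi-gigabyte download and matching memory requirements. The function reloads the model on every call for brevity; a real application would load it once and reuse it.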