kmseong/llama2_7b_chat_resta_lr5e-5_y0.3
The kmseong/llama2_7b_chat_resta_lr5e-5_y0.3 is a 7 billion parameter Llama 2-based conversational language model, fine-tuned using the Safety Neuron Tuning (SN-Tune) method. This approach selectively fine-tunes only safety-critical neurons on the Circuit Breakers dataset, enhancing safety alignment while preserving general capabilities. It is designed to provide improved safety compared to its base model, meta-llama/Llama-3.2-3B-Instruct, with a context length of 4096 tokens.
Loading preview...
Overview
This model, kmseong/llama2_7b_chat_resta_lr5e-5_y0.3, is a 7 billion parameter conversational language model derived from meta-llama/Llama-3.2-3B-Instruct. Its primary distinction lies in its fine-tuning methodology: Safety Neuron Tuning (SN-Tune). This technique focuses on enhancing safety alignment without compromising the model's broader capabilities.
Key Capabilities & Features
- Safety Neuron Tuning (SN-Tune): A selective fine-tuning method that identifies and adjusts only a small set of "safety neurons" critical for alignment.
- Parameter-Efficient Fine-tuning: By freezing most parameters and only fine-tuning safety neurons, this approach is highly efficient.
- Enhanced Safety Alignment: Specifically trained on the Circuit Breakers dataset to improve safety responses compared to its base model.
- Minimal Impact on General Capabilities: Designed to maintain the original model's performance across general tasks while boosting safety.
When to Use This Model
- Safety-Critical Applications: Ideal for use cases where robust safety alignment is a primary concern.
- Conversational AI: Suitable for chat applications requiring a balance of general language understanding and reduced harmful outputs.
- Resource-Constrained Environments: The parameter-efficient SN-Tune method makes it a good choice for scenarios where full model fine-tuning is impractical, but safety improvements are needed.