kmseong/llama-2-7b-chat-hf-only-sn-tuned-lr5e-5
The kmseong/llama-2-7b-chat-hf-only-sn-tuned-lr5e-5 is a 7 billion parameter Llama-2-based model, specifically a Safety Neuron-Tuned (SN-Tune) version of Llama-3.2-3B-Instruct. This model has been fine-tuned using the SN-Tune method on the Circuit Breakers dataset to enhance safety alignment. It focuses on improving safety while minimizing impact on general capabilities through parameter-efficient fine-tuning of only critical safety neurons. The model is designed for applications requiring improved safety alignment in conversational AI.
Loading preview...
Overview
This model, kmseong/llama-2-7b-chat-hf-only-sn-tuned-lr5e-5, is a 7 billion parameter variant of the Llama-2 architecture, specifically a Safety Neuron-Tuned (SN-Tune) version of meta-llama/Llama-3.2-3B-Instruct. It was fine-tuned by kmseong using a unique method to enhance safety alignment.
Key Capabilities & Features
- Safety Neuron Tuning (SN-Tune): Employs a selective fine-tuning approach that identifies and tunes only a small set of "safety neurons" critical for safety, while freezing other parameters.
- Enhanced Safety Alignment: Fine-tuned on the Circuit Breakers dataset, specifically designed to improve the model's safety characteristics.
- Parameter-Efficient Fine-tuning: The SN-Tune method allows for efficient fine-tuning by only adjusting safety-critical parameters, preserving general capabilities.
- Base Model: Built upon the
meta-llama/Llama-3.2-3B-Instructmodel, inheriting its foundational capabilities.
Good For
- Applications where enhanced safety alignment is a primary concern.
- Developers looking for a Llama-2 based model with improved resistance to generating unsafe content.
- Use cases requiring a balance between general conversational abilities and robust safety features, achieved through a targeted fine-tuning approach.