kmseong/llama2_7b_chat_only_rsn_tuned_lr5e-5_revised
The kmseong/llama2_7b_chat_only_rsn_tuned_lr5e-5_revised model is a 7 billion parameter model fine-tuned by kmseong with the Safety Neuron Tuning (SN-Tune) method. The card lists Llama-3.2-3B-Instruct as the base model, although the 7B parameter count and the llama2_7b_chat repository name point to a Llama 2 7B chat model instead. SN-Tune fine-tunes only a small set of 'safety neurons' on the Circuit Breakers dataset, aiming to strengthen safety alignment while preserving general capabilities. The model is intended for applications that need safer, more ethical responses, and it offers a parameter-efficient route to safety alignment.
Overview
The kmseong/llama2_7b_chat_only_rsn_tuned_lr5e-5_revised model is a safety-focused fine-tune developed by kmseong. The card describes it as derived from Llama-3.2-3B-Instruct, though its 7 billion parameters and 4096-token context length match Llama 2 7B chat rather than a 3B Llama 3.2 model. It was fine-tuned with Safety Neuron Tuning (SN-Tune), a method designed to enhance safety alignment without significantly degrading the model's general capabilities. The model supports a context length of 4096 tokens.
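As a rough illustration of how the model might be loaded and how the 4096-token context limits generation, here is a minimal Python sketch. It assumes the repository is published on the Hugging Face Hub under the ID above and loads with the standard `transformers` auto classes, which the card does not confirm. `load_model` and `max_new_tokens_allowed` are hypothetical helper names introduced only for this example.

```python
MODEL_ID = "kmseong/llama2_7b_chat_only_rsn_tuned_lr5e-5_revised"
CONTEXT_LENGTH = 4096  # context length stated in the card


def max_new_tokens_allowed(prompt_tokens: int,
                           context_length: int = CONTEXT_LENGTH) -> int:
    """Return how many tokens can still be generated inside the context window."""
    if prompt_tokens < 0:
        raise ValueError("prompt_tokens must be non-negative")
    return max(context_length - prompt_tokens, 0)


def load_model(model_id: str = MODEL_ID):
    """Load tokenizer and model.

    Assumption: `transformers` (and a backend such as `torch`) is installed
    and the repository is reachable on the Hub.
    """
    # Imported lazily so the budget helper above works without these packages.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(model_id)
    return tokenizer, model
```

A caller would typically tokenize the prompt, pass its length to `max_new_tokens_allowed`, and use the result as an upper bound for `max_new_tokens` when generating.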
Key Capabilities
- Enhanced Safety Alignment: Utilizes SN-Tune to selectively fine-tune critical 'safety neurons' on the Circuit Breakers dataset.
- Parameter-Efficient Fine-tuning: Achieves safety improvements by modifying only a small subset of neurons, freezing most other parameters.
- Preservation of General Capabilities: The SN-Tune method aims to maintain the base model's original performance across various tasks while improving safety.
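The idea of updating only a small subset of neurons while freezing everything else can be sketched in a few lines of plain Python. This is a toy illustration, not the SN-Tune implementation: how safety neurons are identified is not covered by this card, each list entry stands in for one neuron's parameters, and the names `masked_sgd_step`, `trainable_fraction`, and `safety_neurons` are hypothetical. The default learning rate of 5e-5 is borrowed from the `lr5e-5` suffix in the repository name, which is an assumption about what that suffix means.

```python
from typing import Dict, List, Set


def masked_sgd_step(
    weights: Dict[str, List[float]],
    grads: Dict[str, List[float]],
    safety_neurons: Dict[str, Set[int]],
    lr: float = 5e-5,  # assumed from the "lr5e-5" suffix in the repo name
) -> Dict[str, List[float]]:
    """Apply one SGD step to the selected indices only; all others stay frozen."""
    updated = {}
    for name, values in weights.items():
        selected = safety_neurons.get(name, set())
        updated[name] = [
            w - lr * g if i in selected else w
            for i, (w, g) in enumerate(zip(values, grads[name]))
        ]
    return updated


def trainable_fraction(
    weights: Dict[str, List[float]],
    safety_neurons: Dict[str, Set[int]],
) -> float:
    """Fraction of entries that the masked update is allowed to change."""
    total = sum(len(v) for v in weights.values())
    trainable = sum(len(safety_neurons.get(name, set())) for name in weights)
    return trainable / total


# Toy data: only index 2 of "layer0" is treated as a safety neuron.
weights = {"layer0": [1.0, 2.0, 3.0, 4.0], "layer1": [0.5, -0.5]}
grads = {"layer0": [1.0, 1.0, 1.0, 1.0], "layer1": [2.0, 2.0]}
safety_neurons = {"layer0": {2}}

new_weights = masked_sgd_step(weights, grads, safety_neurons, lr=0.1)
fraction = trainable_fraction(weights, safety_neurons)
```

In a real training loop the same effect is usually obtained by masking gradients before the optimizer step, so that frozen parameters receive a zero update.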
Good for
- Applications where improved safety and ethical response generation are critical.
- Developers looking for a parameter-efficient way to enhance model safety without extensive retraining.
- Use cases requiring a 7B-class model with a focus on responsible AI outputs.