The kmseong/llama3.2_3b_gsm8k_ft_5e-5_after_sn_tuned_lr3e-5_fz model is a variant of Llama-3.2-3B-Instruct (roughly 3.2 billion parameters), fine-tuned by kmseong using the Safety Neuron Tuning (SN-Tune) method. The model is optimized for stronger safety alignment by selectively fine-tuning a small set of safety-critical neurons on the Circuit Breakers dataset, with the aim of preserving general capabilities. This makes it suitable for applications that require robust safety behavior.
Overview
This model, developed by kmseong, is a variant of the Llama-3.2-3B-Instruct base model (roughly 3.2 billion parameters). It has been fine-tuned with Safety Neuron Tuning (SN-Tune), a selective fine-tuning approach whose primary goal is to strengthen the model's safety alignment while preserving its general capabilities.
Key Capabilities & Features
- Safety Neuron Tuning (SN-Tune): A selective fine-tuning method that identifies and tunes only a small set of "safety neurons" critical for alignment.
- Parameter-Efficient Fine-tuning: By freezing most parameters and only adjusting safety neurons, the method is highly efficient.
- Enhanced Safety Alignment: Specifically trained on the Circuit Breakers dataset to improve safety responses.
- Minimal Impact on General Capabilities: Designed to improve safety without degrading the model's broader performance.
- Base Model: Built upon the meta-llama/Llama-3.2-3B-Instruct architecture.
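SN-Tune's actual neuron-identification procedure is defined in its paper and is not reproduced here. As a rough, hypothetical PyTorch sketch of the underlying freeze-then-tune idea only, the snippet below freezes a layer, then uses a gradient mask so that a training step updates nothing but an arbitrary stand-in set of "safety neuron" rows (the layer shape, indices, and masking trick are illustrative assumptions, not SN-Tune's implementation):

```python
import torch
import torch.nn as nn

# Toy layer standing in for one block of the model.
layer = nn.Linear(16, 8)
for p in layer.parameters():
    p.requires_grad = False  # freeze everything by default

# Hypothetical indices of safety-critical output neurons; SN-Tune
# identifies the real ones with its own scoring procedure.
safety_neurons = [1, 5]

# requires_grad is per-tensor in PyTorch, so selective row updates are
# done here by re-enabling the weight and masking its gradient.
mask = torch.zeros_like(layer.weight)
mask[safety_neurons] = 1.0
layer.weight.requires_grad = True
layer.weight.register_hook(lambda grad: grad * mask)

# One dummy training step: only the selected rows can change.
opt = torch.optim.SGD([layer.weight], lr=0.1)
x, y = torch.randn(4, 16), torch.randn(4, 8)
before = layer.weight.detach().clone()
loss = nn.functional.mse_loss(layer(x), y)
loss.backward()
opt.step()

changed = (layer.weight.detach() != before).any(dim=1)
print(changed.tolist())  # True only at the safety-neuron rows
```

Because the masked gradient is exactly zero outside the selected rows, the optimizer step leaves all other weights bit-for-bit unchanged, which is what makes this style of tuning parameter-efficient.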
When to Use This Model
This model is particularly well-suited for use cases where:
- Safety is a paramount concern: Applications requiring strong safety alignment and reduced generation of harmful content.
- Resource efficiency is important: The SN-Tune method offers a parameter-efficient way to achieve safety improvements.
- Maintaining base model capabilities is desired: It aims to enhance safety without compromising the original Llama-3.2-3B-Instruct's general performance.
This model is tagged with the Apache 2.0 License; note, however, that its base model, meta-llama/Llama-3.2-3B-Instruct, is distributed under the Llama 3.2 Community License, whose terms also apply to derivative models.