kmseong/llama3.1_8b_instruct-MATH_FT_lr1e-5
The kmseong/llama3.1_8b_instruct-MATH_FT_lr1e-5 is an 8 billion parameter Llama-3.2-3B-Instruct model, fine-tuned by kmseong using the SN-Tune (Safety Neuron Tuning) method. This model is specifically optimized for enhanced safety alignment by selectively fine-tuning only safety-critical neurons on the Circuit Breakers dataset. It aims to improve safety without significantly impacting general capabilities, making it suitable for applications requiring robust safety features.
Loading preview...
Overview
This model, kmseong/llama3.1_8b_instruct-MATH_FT_lr1e-5, is an 8 billion parameter instruction-tuned variant of the meta-llama/Llama-3.2-3B-Instruct base model. It has been fine-tuned by kmseong using a specialized technique called SN-Tune (Safety Neuron Tuning).
Key Capabilities & Features
- Enhanced Safety Alignment: The primary focus of this model is to provide improved safety compared to its base model.
- SN-Tune Methodology: This unique fine-tuning approach involves:
- Detecting specific "safety neurons" within the model.
- Freezing all other parameters.
- Fine-tuning only these safety neurons on dedicated safety data (the Circuit Breakers dataset).
- Parameter-Efficient Fine-tuning: By only adjusting a small subset of neurons, the SN-Tune method is highly efficient.
- Minimal Impact on General Capabilities: The selective tuning aims to enhance safety without degrading the model's broader performance.
Use Cases
This model is particularly well-suited for applications where:
- Safety is a critical concern: It offers a version of Llama-3.2-3B-Instruct with explicit safety enhancements.
- Maintaining base model capabilities is important: The SN-Tune method is designed to preserve general performance while boosting safety.
- Developers need a robust, safety-aligned instruction-following model for various tasks.