wvnvwn/qwen2.5-7b-instruct-gsm8k-sn-tuned-lr5e-5
wvnvwn/qwen2.5-7b-instruct-gsm8k-sn-tuned-lr5e-5 is a 7.6-billion-parameter instruction-tuned causal language model based on Qwen2.5-7B-Instruct, with a 32K-token context length. It has been fine-tuned with the Safety Neuron Tuning (SN-Tune) method on safety alignment data, and is designed to enhance safety alignment while preserving general capabilities through parameter-efficient fine-tuning.
Overview
This model, wvnvwn/qwen2.5-7b-instruct-gsm8k-sn-tuned-lr5e-5, is a 7.6-billion-parameter instruction-tuned variant of the Qwen2.5-7B-Instruct base model. Its primary distinction lies in its fine-tuning approach: Safety Neuron Tuning (SN-Tune). This method fine-tunes only a small set of "safety neurons" on dedicated safety alignment data, specifically the Circuit Breakers dataset, while keeping all other parameters frozen.
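The checkpoint can be loaded like any other causal LM on the Hub. A minimal loading sketch, assuming a recent transformers release with Qwen2.5 support and accelerate installed for `device_map`:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "wvnvwn/qwen2.5-7b-instruct-gsm8k-sn-tuned-lr5e-5"

# Load tokenizer and weights; bfloat16 keeps the 7.6B parameters manageable
# on a single modern GPU, and device_map="auto" requires the accelerate package.
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)
```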
Key Capabilities & Features
- Enhanced Safety Alignment: Significantly improves the model's safety profile compared to its base model.
- Parameter-Efficient Fine-tuning: Achieves safety improvements with minimal impact on general capabilities, since only a small fraction of parameters is updated (see the sketch after this list).
- Qwen2.5-7B-Instruct Base: Inherits the foundational capabilities of the Qwen2.5-7B-Instruct model.
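The selective-freezing idea behind SN-Tune can be sketched as follows. This is a conceptual illustration, not the authors' implementation: the `safety_neurons` mapping is a hypothetical placeholder for whatever identification step located the safety neurons, and the code assumes a Qwen2-style module layout.

```python
import torch

def apply_sn_tune_mask(model, safety_neurons):
    """Freeze everything, then let gradients flow only through the MLP weight
    entries belonging to selected "safety neurons".
    `safety_neurons` maps layer index -> list of neuron indices (hypothetical)."""
    for p in model.parameters():
        p.requires_grad = False

    for layer_idx, neuron_ids in safety_neurons.items():
        # In Qwen2-style blocks, neuron i corresponds to row i of
        # gate_proj/up_proj and column i of down_proj.
        mlp = model.model.layers[layer_idx].mlp
        for proj, dim in ((mlp.gate_proj, 0), (mlp.up_proj, 0), (mlp.down_proj, 1)):
            weight = proj.weight
            weight.requires_grad = True
            mask = torch.zeros_like(weight)
            if dim == 0:
                mask[neuron_ids, :] = 1.0
            else:
                mask[:, neuron_ids] = 1.0
            # Zero out gradients everywhere except the selected neurons.
            weight.register_hook(lambda g, m=mask: g * m)
```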
When to Use This Model
This model is particularly well-suited for applications where:
- Safety and ethical considerations are paramount: Its SN-Tune fine-tuning makes it a strong candidate for sensitive use cases.
- Maintaining general performance is crucial: The SN-Tune method aims to enhance safety without degrading broader language understanding and generation abilities.
- Resource-efficient customization matters: Because SN-Tune updates only a small fraction of parameters, further fine-tuning is comparatively cheap, while the deployed artifact remains a standard 7.6B checkpoint.
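For a quick end-to-end check, the snippet below runs a chat-style generation, continuing from the loading sketch above. The prompt and decoding settings are illustrative only:

```python
# Build a chat prompt with the model's built-in chat template.
messages = [
    {"role": "user", "content": "Explain why it is unsafe to mix bleach and ammonia."},
]
inputs = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    return_tensors="pt",
).to(model.device)

output = model.generate(inputs, max_new_tokens=256, do_sample=False)
# Strip the prompt tokens and print only the generated continuation.
print(tokenizer.decode(output[0][inputs.shape[-1]:], skip_special_tokens=True))
```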