Model Overview

wvnwwn/gemma-2-9b-it-lr5e-5-safeinstr-0.1 is a 9 billion parameter instruction-tuned model derived from the meta-llama/Llama-3.2-3B-Instruct base. Its primary differentiator is the application of Safety Neuron Tuning (SN-Tune), a specialized fine-tuning approach aimed at enhancing safety alignment.

Key Capabilities & Features

Enhanced Safety Alignment: Fine-tuned using the SN-Tune method on the Circuit Breakers dataset to improve safety characteristics.
Parameter-Efficient Fine-tuning: SN-Tune selectively identifies and fine-tunes only a small set of "safety neurons," freezing other parameters. This minimizes the computational cost and prevents degradation of general capabilities.
Minimal Impact on General Performance: The selective tuning process ensures that the model's overall language understanding and generation abilities are largely preserved while safety is improved.

When to Use This Model

This model is particularly well-suited for use cases where:

Safety is a critical concern: Applications requiring a higher degree of safety alignment in their language model outputs.
Efficiency is important: The SN-Tune method offers a parameter-efficient way to integrate safety features without extensive retraining.
Maintaining base model capabilities is desired: Users who appreciate the performance of the Llama-3.2-3B-Instruct base model but need an added layer of safety.

Limitations

While designed for improved safety, users should always implement their own safety measures and conduct thorough testing for their specific applications. The model is licensed under Apache 2.0.

Overview

Model Overview

Key Capabilities & Features

When to Use This Model

Limitations

Full Model Card (README)