kmseong/llama2_7b_chat_medaq_resta_gamma0.3
The kmseong/llama2_7b_chat_medaq_resta_gamma0.3 is a 7 billion parameter Llama 2-based model, specifically a Safety Neuron-Tuned (SN-Tune) version of Llama-3.2-3B-Instruct, with a 4096-token context length. It is fine-tuned using the SN-Tune method on the Circuit Breakers dataset to enhance safety alignment. This model is optimized for improved safety performance while minimizing impact on general capabilities through parameter-efficient fine-tuning.
Loading preview...
Model Overview
The kmseong/llama2_7b_chat_medaq_resta_gamma0.3 is a 7 billion parameter language model derived from the meta-llama/Llama-3.2-3B-Instruct base model. It has been specifically fine-tuned using a novel approach called Safety Neuron Tuning (SN-Tune) to enhance its safety alignment.
Key Capabilities & Features
- Safety Neuron Tuning (SN-Tune): This unique fine-tuning method involves:
- Detecting a small set of "safety neurons" critical for safety.
- Freezing all non-safety parameters.
- Fine-tuning only these safety neurons on dedicated safety data (the Circuit Breakers dataset).
- Enhanced Safety Alignment: The primary goal of this model is to provide improved safety performance compared to its base model.
- Parameter-Efficient Fine-tuning: By only tuning a subset of neurons, the SN-Tune method aims to achieve safety improvements with minimal computational overhead and reduced impact on the model's general capabilities.
- Llama 2 Architecture: Built upon the Llama 2 family, offering a robust foundation for language understanding and generation.
When to Use This Model
This model is particularly suitable for applications where:
- Safety is a critical concern: Its SN-Tune methodology makes it a strong candidate for use cases requiring robust safety alignment.
- Resource efficiency is important: The parameter-efficient fine-tuning allows for enhanced safety without requiring extensive retraining of the entire model.
- Maintaining general capabilities is desired: The method aims to preserve the base model's general performance while improving safety aspects.
It is licensed under the Apache 2.0 License, inheriting terms from its base model.