kmseong/llama3.2_3b_base_WaRP_utility_basis_safety_FT_lr3e-5_freeze_0.03
The kmseong/llama3.2_3b_base_WaRP_utility_basis_safety_FT_lr3e-5_freeze_0.03 model is a safety-fine-tuned variant of the Llama 3.2 3B base model (3 billion parameters) with a 32,768-token context length, developed by kmseong. The fine-tune modifies the attention projections (q, k, v) and the MLP up and down projections on a per-layer basis, and applies a Weight space Rotation Process (WaRP) for safety alignment. It is intended for applications that require robust safety behavior in model responses.
Model Overview
The kmseong/llama3.2_3b_base_WaRP_utility_basis_safety_FT_lr3e-5_freeze_0.03 model is based on the 3-billion-parameter Llama 3.2 base model and supports a context length of 32,768 tokens. Developed by kmseong, it has undergone targeted weight-space modifications and a fine-tuning process focused on safety.
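As a standard Hugging Face checkpoint, the model can presumably be loaded with the usual `transformers` API. A minimal, untested sketch (the prompt and generation settings are illustrative assumptions, not recommendations from this card):

```python
# Illustrative loading/generation sketch for this checkpoint.
# max_new_tokens and the example prompt are assumptions, not card values.
MODEL_ID = "kmseong/llama3.2_3b_base_WaRP_utility_basis_safety_FT_lr3e-5_freeze_0.03"

def generate(prompt: str, max_new_tokens: int = 128) -> str:
    # Heavy imports are kept inside the function so the module can be
    # imported without transformers/torch installed.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(MODEL_ID, device_map="auto")

    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)

if __name__ == "__main__":
    print(generate("Explain why seatbelts matter."))
```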
Key Architectural & Training Details
- Attention Mechanism: The model applies modifications to the query (q), key (k), and value (v) components within its attention layers.
- MLP Layers: Adjustments are made to the up and down projections in the Multi-Layer Perceptron (MLP) blocks.
- Per-Layer Application: The modifications are applied on a per-layer basis, i.e., each transformer layer's targeted projections are processed individually rather than through a single global transformation.
- Safety Fine-Tuning: The model is fine-tuned for safety alignment using a method referred to as the "Weight space Rotation Process" (WaRP). After the initial weight-space rotation, the remaining (non-frozen) weights are further trained; the model name's `freeze_0.03` suffix appears to denote the freeze ratio used, and `lr3e-5` the fine-tuning learning rate.
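One reading of the `freeze_0.03` suffix is that roughly 3% of the targeted parameters or basis directions are frozen during safety fine-tuning; that interpretation is an assumption, as the card does not spell it out. A stdlib-only sketch of selecting the named projection modules (q, k, v, up, down) in a Llama-style model and splitting them into frozen vs. trainable sets:

```python
import re

# Matches only the modules the card says are modified: attention q/k/v
# and MLP up/down projections (o_proj and gate_proj are excluded).
TARGET_PATTERN = re.compile(r"\.(q_proj|k_proj|v_proj|up_proj|down_proj)\.weight$")

def llama_style_param_names(num_layers: int = 28) -> list[str]:
    """Hypothetical parameter names mimicking a Llama-style decoder;
    a real model exposes similar names via model.named_parameters()."""
    names = []
    for i in range(num_layers):
        for mod in ("q_proj", "k_proj", "v_proj", "o_proj"):
            names.append(f"model.layers.{i}.self_attn.{mod}.weight")
        for mod in ("gate_proj", "up_proj", "down_proj"):
            names.append(f"model.layers.{i}.mlp.{mod}.weight")
    return names

def split_freeze(names: list[str], freeze_ratio: float = 0.03):
    """Select the targeted modules, then freeze a `freeze_ratio` fraction.

    Which concrete parameters or directions WaRP actually freezes is not
    specified in the card; freezing a leading fraction is illustrative only.
    """
    targets = [n for n in names if TARGET_PATTERN.search(n)]
    n_frozen = round(len(targets) * freeze_ratio)
    return targets[:n_frozen], targets[n_frozen:]

frozen, trainable = split_freeze(llama_style_param_names())
```

In a real training loop the `frozen` set would have `requires_grad` set to `False` before the optimizer is built.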
Intended Use Cases
This model is particularly suited to applications where safety and alignment are critical. Its WaRP-based fine-tuning emphasizes generating responses that adhere to safety guidelines while preserving general utility (reflected in the "utility_basis" component of its name), making it a candidate for sensitive content generation or moderation tasks.