kmseong/safety-warp-Llama-3.2-3b-phase3-whole-layer-non-freeze
The kmseong/safety-warp-Llama-3.2-3b-phase3-whole-layer-non-freeze model is a Llama-family language model with roughly 3.2 billion parameters (the 3B variant of Llama 3.2) and a 32768-token context length. It uses a "Weight space Rotation Process" (Warp) for safety alignment, applied to the attention query, key, and value projections and to the MLP up and down projections. After the initial per-layer application of Warp, the model was trained with no parameters frozen, suggesting an effort to integrate the safety intervention across the entire model.
Model Overview
This model builds on the Llama architecture, pairing roughly 3.2 billion parameters with an extensive 32768-token context window. Its core innovation lies in applying a "Weight space Rotation Process" (Warp) for safety alignment.
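Assuming the checkpoint is hosted on the Hugging Face Hub under the repo id above and retains the standard Llama architecture (the card does not say otherwise), a minimal loading sketch with transformers might look like this:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

repo_id = "kmseong/safety-warp-Llama-3.2-3b-phase3-whole-layer-non-freeze"

tokenizer = AutoTokenizer.from_pretrained(repo_id)
model = AutoModelForCausalLM.from_pretrained(
    repo_id,
    torch_dtype=torch.bfloat16,  # half precision keeps the ~3.2B weights manageable
    device_map="auto",           # place layers on available devices automatically
)
```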
Key Characteristics
- Warp Safety Alignment: Utilizes a novel "Weight space Rotation Process" (Warp) to enhance model safety.
- Targeted Application: Warp is applied specifically to the attention mechanism's query, key, and value projections and to the MLP's up and down projections (the sketch after this list shows how these modules appear in a Llama checkpoint).
- Training Methodology: After the per-layer application of Warp, the model was trained with no layers frozen (non-freeze), so the safety modification is integrated across all of the model's weights rather than confined to individual layers.
- Context Length: Supports a 32768-token context window, enabling it to process long inputs in a single pass.
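The card names the modules Warp targets but not the Warp math itself. The following toy sketch therefore makes two labeled assumptions: the checkpoint uses the standard LlamaForCausalLM module names (q_proj, k_proj, v_proj, up_proj, down_proj), and a "weight space rotation" means left-multiplying a weight matrix by an orthogonal matrix. The random rotation here is purely illustrative and would degrade a real model; it is not the authors' actual Warp procedure, which presumably learns or constrains the rotation.

```python
import torch

# Module-name suffixes the card says Warp targets in a Llama checkpoint
# (standard LlamaForCausalLM naming is an assumption here).
TARGET_SUFFIXES = ("q_proj", "k_proj", "v_proj", "up_proj", "down_proj")

def rotate_targeted_weights(model: torch.nn.Module) -> None:
    """Toy weight-space rotation: W -> R @ W with a random orthogonal R."""
    for name, module in model.named_modules():
        if name.endswith(TARGET_SUFFIXES) and isinstance(module, torch.nn.Linear):
            out_dim = module.weight.shape[0]
            # Random orthogonal matrix via QR decomposition (illustration only;
            # a real alignment procedure would not use a random rotation).
            q, _ = torch.linalg.qr(torch.randn(out_dim, out_dim))
            q = q.to(module.weight.device, module.weight.dtype)
            with torch.no_grad():
                module.weight.copy_(q @ module.weight)
```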
Good For
- Applications requiring a language model with enhanced safety alignment, particularly where the alignment is baked into the weights rather than applied at inference time.
- Tasks that can leverage a ~3.2B-parameter model with a large context window to process extensive text (see the usage sketch after this list).
- Research into safety alignment techniques and their impact on large language models.
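Building on the loading sketch above, a hedged usage example for long inputs follows; the file path, prompt, and generation settings are placeholders, not values from the card:

```python
# Assumes `model` and `tokenizer` from the loading sketch above.
long_report_text = open("report.txt").read()  # placeholder input document

prompt = "Summarize the following report:\n\n" + long_report_text
inputs = tokenizer(
    prompt,
    return_tensors="pt",
    truncation=True,
    max_length=32768,  # the card's stated context length
).to(model.device)

output_ids = model.generate(**inputs, max_new_tokens=256, do_sample=False)

# Decode only the newly generated tokens, not the echoed prompt.
new_tokens = output_ids[0][inputs["input_ids"].shape[1]:]
print(tokenizer.decode(new_tokens, skip_special_tokens=True))
```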