kmseong/safety-warp-Llama-3.2-3b-phase3-whole-layer-non-freeze
The kmseong/safety-warp-Llama-3.2-3b-phase3-whole-layer-non-freeze model is a Llama-family language model with roughly 3.2 billion parameters (the 3B variant of Llama 3.2) and a 32768-token context length. It uses a "Weight space Rotation Process" (Warp) for safety alignment, applied to the attention query, key, and value projections and to the MLP up and down projections. After the initial per-layer application of Warp, the model was trained with no parameters frozen, suggesting an effort to integrate the safety intervention across the entire model.
Model Overview
This model builds on the Llama architecture, pairing roughly 3.2 billion parameters with an extensive 32768-token context window. Its core innovation lies in applying a "Weight space Rotation Process" (Warp) for safety alignment.
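Assuming the checkpoint is hosted on the Hugging Face Hub under the repo id above and retains the standard Llama architecture (the card does not say otherwise), a minimal loading sketch with transformers might look like this:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

repo_id = "kmseong/safety-warp-Llama-3.2-3b-phase3-whole-layer-non-freeze"

tokenizer = AutoTokenizer.from_pretrained(repo_id)
model = AutoModelForCausalLM.from_pretrained(
    repo_id,
    torch_dtype=torch.bfloat16,  # half precision keeps the ~3.2B weights manageable
    device_map="auto",           # place layers on available devices automatically
)
```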
Key Characteristics
- Warp Safety Alignment: Utilizes a novel "Weight space Rotation Process" (Warp) to enhance model safety.
- Targeted Application: Warp is applied specifically to the attention mechanism's query, key, and value projections and to the MLP's up and down projections (the sketch after this list shows how these modules appear in a Llama checkpoint).
- Training Methodology: After the per-layer application of Warp, the model was trained with no layers frozen (non-freeze), so the safety modification is integrated across all of the model's weights rather than confined to individual layers.
- Context Length: Supports a 32768-token context window, enabling it to process long inputs in a single pass.
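The card names the modules Warp targets but not the Warp math itself. The following toy sketch therefore makes two labeled assumptions: the checkpoint uses the standard LlamaForCausalLM module names (q_proj, k_proj, v_proj, up_proj, down_proj), and a "weight space rotation" means left-multiplying a weight matrix by an orthogonal matrix. The random rotation here is purely illustrative and would degrade a real model; it is not the authors' actual Warp procedure, which presumably learns or constrains the rotation.

```python
import torch

# Module-name suffixes the card says Warp targets in a Llama checkpoint
# (standard LlamaForCausalLM naming is an assumption here).
TARGET_SUFFIXES = ("q_proj", "k_proj", "v_proj", "up_proj", "down_proj")

def rotate_targeted_weights(model: torch.nn.Module) -> None:
    """Toy weight-space rotation: W -> R @ W with a random orthogonal R."""
    for name, module in model.named_modules():
        if name.endswith(TARGET_SUFFIXES) and isinstance(module, torch.nn.Linear):
            out_dim = module.weight.shape[0]
            # Random orthogonal matrix via QR decomposition (illustration only;
            # a real alignment procedure would not use a random rotation).
            q, _ = torch.linalg.qr(torch.randn(out_dim, out_dim))
            q = q.to(module.weight.device, module.weight.dtype)
            with torch.no_grad():
                module.weight.copy_(q @ module.weight)
```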
Good For
- Applications requiring a language model with enhanced safety alignment, particularly where the alignment is baked into the weights rather than applied at inference time.
- Tasks that can leverage a ~3.2B-parameter model with a large context window to process extensive text (see the usage sketch after this list).
- Research into safety alignment techniques and their impact on large language models.
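Building on the loading sketch above, a hedged usage example for long inputs follows; the file path, prompt, and generation settings are placeholders, not values from the card:

```python
# Assumes `model` and `tokenizer` from the loading sketch above.
long_report_text = open("report.txt").read()  # placeholder input document

prompt = "Summarize the following report:\n\n" + long_report_text
inputs = tokenizer(
    prompt,
    return_tensors="pt",
    truncation=True,
    max_length=32768,  # the card's stated context length
).to(model.device)

output_ids = model.generate(**inputs, max_new_tokens=256, do_sample=False)

# Decode only the newly generated tokens, not the echoed prompt.
new_tokens = output_ids[0][inputs["input_ids"].shape[1]:]
print(tokenizer.decode(new_tokens, skip_special_tokens=True))
```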