kmseong/safety-warp-Llama-3.2-3b-phase3-perlayer-non-freeze
The kmseong/safety-warp-Llama-3.2-3b-phase3-perlayer-non-freeze model is a 3-billion-parameter language model based on the Llama 3.2 architecture, developed by kmseong. It incorporates a "Weight space Rotation Process" (WARP) for safety alignment, applying this process per-layer to the attention (q, k, v) and MLP (up, down) projections. The model is further refined through a non-freeze training phase, making it suitable for applications requiring enhanced safety characteristics.
Overview
The kmseong/safety-warp-Llama-3.2-3b-phase3-perlayer-non-freeze model is a 3-billion-parameter model built on the Llama 3.2 architecture. Its primary differentiator is a novel "Weight space Rotation Process" (WARP) for safety alignment, detailed in the forthcoming paper "Safety Alignment via Weight space Rotation Process". The process is applied specifically to the attention (q, k, v) and MLP (up, down) layers on a per-layer basis, followed by a non-freeze training phase that further refines the model's capabilities.
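Because WARP is applied to the weights during training, no special runtime code should be needed: the model can presumably be loaded like any other causal LM on the Hugging Face Hub. The sketch below assumes a standard `transformers` setup; the prompt, dtype, and generation settings are illustrative choices, not part of the model card.

```python
# Minimal sketch: loading the model with Hugging Face transformers.
# Assumes a standard causal-LM checkpoint; dtype/device settings are
# illustrative and may need adjusting for your hardware.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_ID = "kmseong/safety-warp-Llama-3.2-3b-phase3-perlayer-non-freeze"

def generate(prompt: str, max_new_tokens: int = 128) -> str:
    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(
        MODEL_ID,
        torch_dtype=torch.bfloat16,  # halves memory vs. fp32 on supported GPUs
        device_map="auto",           # place layers across available devices
    )
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    outputs = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(outputs[0], skip_special_tokens=True)

if __name__ == "__main__":
    print(generate("Explain why seatbelts matter."))
```

Downloading a ~3B-parameter checkpoint requires several gigabytes of disk and GPU (or CPU) memory; quantized loading (e.g. via `bitsandbytes`) is an option on smaller hardware.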
Key Capabilities
- Enhanced Safety Alignment: Uses the Weight space Rotation Process (WARP) to align the model in weight space rather than solely through fine-tuning data.
- Targeted Layer Modification: Applies safety alignment techniques specifically to attention and MLP layers.
- Refined Training: Undergoes a non-freeze training phase for improved performance post-alignment.
Good for
- Applications requiring models with explicit safety alignment mechanisms.
- Research into novel safety alignment techniques, particularly those involving weight space manipulation.
- Use cases where a compact 3-billion-parameter model with a focus on safety is beneficial.