kmseong/safety-warp-Llama-3.2-3b-phase3-wikipedia-base-start-perlayer
The kmseong/safety-warp-Llama-3.2-3b-phase3-wikipedia-base-start-perlayer model is a 3-billion-parameter Llama 3.2 language model with a 32,768-token context length. It applies per-layer modifications to the attention (q, k, v) and MLP (up, down) projections, followed by non-freeze training. The model targets safety alignment through a weight-space rotation process, making it suitable for applications that require robust safety behavior.
Model Overview
The kmseong/safety-warp-Llama-3.2-3b-phase3-wikipedia-base-start-perlayer model is a 3-billion-parameter language model built on the Llama 3.2 architecture, with a context length of 32,768 tokens. A core aspect of its design is a set of per-layer modifications to the attention mechanism's query, key, and value projections, as well as the MLP's up and down projections. Following these structural changes, the model undergoes non-freeze training, presumably meaning the modified weights are trained rather than kept frozen.
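For orientation, here is a minimal loading and generation sketch using the Hugging Face transformers library. It assumes the checkpoint follows the standard Llama format; the prompt and generation settings are illustrative only, not taken from the model card.

```python
# Minimal sketch: load the model and run a short generation.
# Assumes transformers and accelerate are installed and the checkpoint
# uses the standard Llama layout.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "kmseong/safety-warp-Llama-3.2-3b-phase3-wikipedia-base-start-perlayer"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # half precision keeps the 3B model small in memory
    device_map="auto",
)

prompt = "Explain why safety alignment matters for language models."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```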
Key Capabilities
- Safety Alignment: The model is developed with a focus on "Safety Alignment via Weight space Rotation Process" (Warp), indicating an emphasis on generating safe and responsible outputs.
- Architectural Modifications: Incorporates per-layer adjustments to the attention and MLP projections, suggesting a specialized approach to model training and behavior (see the inspection sketch after this list).
- Extended Context: Supports a context window of 32,768 tokens, enabling the model to process longer inputs and maintain coherence over extended conversations or documents.
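The per-layer projections named above can be inspected once the model is loaded. The sketch below assumes the standard transformers Llama module layout (`model.model.layers[i].self_attn.q_proj` and so on); the warp-specific weight changes are not visible from module names alone, so this only confirms which projections exist and their shapes.

```python
# Hedged inspection sketch: enumerate the per-layer projections the card
# says were modified (attention q/k/v and MLP up/down). Module paths
# assume the standard transformers Llama implementation.
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained(
    "kmseong/safety-warp-Llama-3.2-3b-phase3-wikipedia-base-start-perlayer"
)

for i, layer in enumerate(model.model.layers):
    attn, mlp = layer.self_attn, layer.mlp
    print(
        f"layer {i:2d}: "
        f"q={tuple(attn.q_proj.weight.shape)} "
        f"k={tuple(attn.k_proj.weight.shape)} "
        f"v={tuple(attn.v_proj.weight.shape)} "
        f"up={tuple(mlp.up_proj.weight.shape)} "
        f"down={tuple(mlp.down_proj.weight.shape)}"
    )

# Context length as reported by the model's own config.
print("context length:", model.config.max_position_embeddings)
```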
Good For
- Applications where safety and responsible AI behavior are paramount.
- Research into novel architectural modifications for language models.
- Tasks requiring processing of long documents or complex conversational histories due to its large context window.