kmseong/llama3.2-3b-WaRP-utility-basis-safety-FT-non-freeze-lr1e-5

Text Generation · Model Size: 3.2B · Quant: BF16 · Context Length: 32k · Published: Apr 6, 2026 · License: llama3.2 · Architecture: Transformer · Concurrency Cost: 1

The kmseong/llama3.2-3b-WaRP-utility-basis-safety-FT-non-freeze-lr1e-5 model is a 3.2-billion-parameter language model based on the Llama 3.2 architecture, with a 32768-token context length. Its fine-tuning is applied per layer to the attention projections (query, key, value) and the MLP projections (up, down). The model was trained with a non-freeze approach at a learning rate of 1e-5, targeting safety alignment through a Weight space Rotation Process (WaRP).
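
As a rough sketch of what that setup looks like in code (assuming the checkpoint loads with the Hugging Face transformers library, that the per-layer projections map to the standard q_proj/k_proj/v_proj/up_proj/down_proj module names of the Llama implementation, and with placeholder output and batch settings rather than the authors' actual configuration):

```python
import torch
from transformers import AutoModelForCausalLM, TrainingArguments

model_id = "kmseong/llama3.2-3b-WaRP-utility-basis-safety-FT-non-freeze-lr1e-5"
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.bfloat16)

# The per-layer targets the card refers to: attention q/k/v and MLP up/down
# projections (standard module names in the transformers Llama implementation).
target_suffixes = ("q_proj", "k_proj", "v_proj", "up_proj", "down_proj")
targets = [name for name, _ in model.named_modules() if name.endswith(target_suffixes)]
print(f"{len(targets)} projection modules across {model.config.num_hidden_layers} layers")

# "Non-freeze": every parameter stays trainable during safety fine-tuning.
for param in model.parameters():
    param.requires_grad = True

# Learning rate matching the lr1e-5 suffix in the model name; the remaining
# values are placeholders.
training_args = TrainingArguments(
    output_dir="warp-safety-ft",
    learning_rate=1e-5,
    bf16=True,
    per_device_train_batch_size=1,
    num_train_epochs=1,
)
```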


Model Overview

The kmseong/llama3.2-3b-WaRP-utility-basis-safety-FT-non-freeze-lr1e-5 is a 3.2-billion-parameter language model built upon the Llama 3.2 architecture, supporting a 32768-token context window. It is distinguished by its fine-tuning methodology, which focuses on safety alignment.

Key Technical Details

  • Architecture: Llama 3.2 base with 3.2 billion parameters.
  • Context Length: Supports inputs up to 32768 tokens.
  • Attention and MLP: Fine-tuning is applied per layer to the attention projections (query, key, value) and the MLP projections (up, down).
  • Training Method: Uses a non-freeze fine-tuning approach at a learning rate of 1e-5, meaning all parameters were updated during training.
  • Safety Alignment: The core differentiator is its "Weight space Rotation Process" (WaRP), a technique aimed at enhancing safety alignment; a generic illustration of the rotation idea follows this list.
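
The card does not spell out the WaRP procedure itself. As a loose, generic illustration of the weight-space rotation idea only (not the authors' actual algorithm), one can rotate a layer's weight into an orthonormal basis derived from representative activations; individual rotated parameters then align with specific "utility" directions, which a freeze variant would lock and which this non-freeze variant leaves trainable:

```python
import torch

# Toy linear layer y = W @ x and a batch of representative ("utility") activations.
d_in, d_out, n_samples = 64, 32, 256
W = torch.randn(d_out, d_in)
X = torch.randn(n_samples, d_in)

# One way to build an orthonormal basis from activation statistics: SVD of X.
# (Illustrative only; the actual utility-basis construction may differ.)
_, _, Vt = torch.linalg.svd(X, full_matrices=False)
basis = Vt.T                       # (d_in, d_in), orthonormal columns

# Rotate the weight into that basis.  Because the basis is orthonormal,
# W_rot @ (basis.T @ x) == W @ x, so the layer's function is unchanged.
W_rot = W @ basis
x = torch.randn(d_in)
assert torch.allclose(W @ x, W_rot @ (basis.T @ x), atol=1e-4)

# Each column of W_rot now corresponds to one basis direction.  A "freeze"
# variant would block gradients on the high-utility columns during safety
# fine-tuning; the "non-freeze" variant keeps every rotated parameter trainable.
W_rot.requires_grad_(True)
```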

Potential Use Cases

This model is particularly suited to applications where safety and controlled output generation are paramount. Its safety-alignment fine-tuning via WaRP suggests utility in:

  • Content Moderation: Filtering or identifying unsafe content.
  • Responsible AI Development: Building applications that require robust safety guardrails.
  • Research into Safety Alignment: Exploring the effectiveness of the WaRP method for mitigating harmful outputs.
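
A minimal inference sketch, assuming the checkpoint and its tokenizer are usable with the Hugging Face transformers library (the prompt is only an example):

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "kmseong/llama3.2-3b-WaRP-utility-basis-safety-FT-non-freeze-lr1e-5"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto"
)

prompt = "Explain how an assistant should respond to a request for harmful instructions."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

with torch.no_grad():
    output = model.generate(**inputs, max_new_tokens=256, do_sample=False)

# Decode only the newly generated tokens.
print(tokenizer.decode(output[0][inputs["input_ids"].shape[-1]:], skip_special_tokens=True))
```

If the tokenizer ships a chat template, tokenizer.apply_chat_template would be the more natural entry point for conversational use.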