kmseong/llama3.2-3b-WaRP-utility-basis-safety-FT-non-freeze-lr5e-5

TEXT GENERATION · Concurrency Cost: 1 · Model Size: 3.2B · Quant: BF16 · Ctx Length: 32k · Published: Apr 6, 2026 · License: llama3.2 · Architecture: Transformer

kmseong/llama3.2-3b-WaRP-utility-basis-safety-FT-non-freeze-lr5e-5 is a 3.2-billion-parameter Llama 3.2 model fine-tuned with a 32,768-token context length. It applies a Weight space Rotation Process (WaRP) for safety alignment, with per-layer adjustments to the attention projections (query, key, value) and the MLP projections (up, down). After these modifications, the model underwent non-freeze fine-tuning, with all parameters trainable, to balance utility and safety.


Model Overview

kmseong/llama3.2-3b-WaRP-utility-basis-safety-FT-non-freeze-lr5e-5 is a 3.2-billion-parameter language model based on the Llama 3.2 architecture, with an extended context length of 32,768 tokens. It has undergone a specialized fine-tuning process built around a Weight space Rotation Process (WaRP), aimed primarily at enhancing safety alignment.

Key Capabilities and Modifications

  • WaRP Integration: The model incorporates the Weight space Rotation Process, a published technique cited by the model authors, for safety alignment.
  • Per-Layer Adjustments: Specific modifications have been applied to the attention mechanism (query, key, value components) and the MLP layers (up and down projections) on a per-layer basis.
  • Non-Freeze Training: Following these architectural and safety-focused adjustments, the model was subjected to non-freeze training, allowing all parameters to be updated.
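The core idea behind a weight-space rotation can be sketched in a few lines of NumPy. This is an illustrative toy, not the model's actual WaRP procedure (which is not reproduced here): it shows how an orthogonal change of basis can be folded into a projection weight, such as a query/key/value or MLP matrix, without changing what the layer computes. All shapes and the random basis construction are assumptions for demonstration.

```python
import numpy as np

rng = np.random.default_rng(0)

def random_orthogonal(n: int) -> np.ndarray:
    """Draw a random orthogonal basis via QR decomposition."""
    q, _ = np.linalg.qr(rng.normal(size=(n, n)))
    return q

d_in, d_out = 8, 4
W = rng.normal(size=(d_out, d_in))   # toy projection weight (stand-in for q/k/v or MLP up/down)
x = rng.normal(size=d_in)            # toy input activation

Q = random_orthogonal(d_in)          # rotation of the input basis

# Express the weight in the rotated basis. Because Q is orthogonal,
# pairing W @ Q.T with rotated inputs Q @ x leaves the layer's
# function unchanged: (W Q^T)(Q x) = W x.
W_rot = W @ Q.T
y_original = W @ x
y_rotated = W_rot @ (Q @ x)

assert np.allclose(y_original, y_rotated)
```

The point of working in such a rotated basis is that directions relevant to safety behavior can be isolated and adjusted per layer, after which all parameters are released for further training.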

Good For

  • Safety-Oriented Applications: Its primary differentiator is the WaRP-based safety alignment, making it suitable for use cases where robust safety features are critical.
  • Research into Safety Alignment: Developers and researchers interested in exploring the effects of weight space rotation for safety in LLMs may find this model particularly relevant.
  • Utility-focused Tasks: The model is designed with a general utility basis, suggesting applicability across a range of common language generation and understanding tasks, while prioritizing safety.
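For readers reproducing a setup like this, the "non-freeze" stage simply means that no parameters are frozen during fine-tuning: every weight is marked trainable and handed to the optimizer. A minimal PyTorch sketch follows; the toy model, data, and training step are invented for illustration, and the learning rate 5e-5 is taken from the model's name.

```python
import torch
from torch import nn

# Toy network standing in for the fine-tuned model.
model = nn.Sequential(nn.Linear(16, 32), nn.ReLU(), nn.Linear(32, 4))

# Non-freeze: explicitly mark every parameter as trainable.
for p in model.parameters():
    p.requires_grad_(True)

# All parameters go to the optimizer (lr 5e-5, as in the model name).
optimizer = torch.optim.AdamW(model.parameters(), lr=5e-5)

# One dummy training step on random data.
x = torch.randn(8, 16)
target = torch.randn(8, 4)
loss = nn.functional.mse_loss(model(x), target)
loss.backward()
optimizer.step()
optimizer.zero_grad()

trainable = sum(p.numel() for p in model.parameters() if p.requires_grad)
total = sum(p.numel() for p in model.parameters())
```

A fully frozen or partially frozen run would instead set `requires_grad_(False)` on the protected parameters; non-freeze training trades that protection for greater plasticity after the safety-oriented rotation.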