kmseong/llama3.2_3b_base-WaRP-utility-basis-safety-FT-original-space

TEXT GENERATION · Concurrency Cost: 1 · Model Size: 3.2B · Quant: BF16 · Context Length: 32k · Published: Apr 11, 2026 · License: llama3.2 · Architecture: Transformer

The kmseong/llama3.2_3b_base-WaRP-utility-basis-safety-FT-original-space model is a 3.2 billion parameter language model, likely based on the Llama architecture, developed by kmseong. It modifies the attention projections (query, key, value) and the MLP up- and down-projections, applied per layer, and is fine-tuned for safety alignment using the Weight space Rotation Process (WaRP). The model is designed for applications requiring robust safety behavior in language generation.
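In standard Hugging Face Llama checkpoints, the attention (q, k, v) and MLP (up, down) projections mentioned above correspond to module names like `q_proj`, `k_proj`, `v_proj`, `up_proj`, and `down_proj`. A minimal sketch of enumerating those per-layer targets (the module-name layout and the 28-layer depth of Llama-3.2-3B are assumptions based on the usual Llama structure, not details confirmed by this card):

```python
# Sketch: enumerate the per-layer projection modules that a WaRP-style
# fine-tune would target, assuming the standard Llama module layout.
TARGET_SUFFIXES = ("q_proj", "k_proj", "v_proj", "up_proj", "down_proj")

def target_modules(num_layers: int) -> list[str]:
    """Return fully qualified module names for every targeted projection."""
    names = []
    for layer in range(num_layers):
        for suffix in TARGET_SUFFIXES:
            # q/k/v live under the attention block, up/down under the MLP block.
            block = "self_attn" if suffix in ("q_proj", "k_proj", "v_proj") else "mlp"
            names.append(f"model.layers.{layer}.{block}.{suffix}")
    return names

# Llama-3.2-3B has 28 decoder layers.
modules = target_modules(28)
print(len(modules))  # 5 projections per layer x 28 layers = 140
```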


Model Overview

The kmseong/llama3.2_3b_base-WaRP-utility-basis-safety-FT-original-space is a 3.2 billion parameter language model, developed by kmseong, that has undergone specific fine-tuning for safety alignment. It builds upon a Llama-based architecture, integrating several key modifications to enhance its performance and safety characteristics.

Key Technical Details

  • Architecture Enhancements: The model's modifications target the attention projections (query, key, value) and the Multi-Layer Perceptron (MLP) up- and down-projections, applied on a per-layer basis during training.
  • Safety Alignment: The model's primary focus is safety alignment, achieved through a technique referred to as the Weight space Rotation Process (WaRP), applied during fine-tuning.
  • Training Methodology: Training began with an initial non-freeze phase (all weights updated), after which the WaRP process was applied for safety alignment.
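As a conceptual illustration only (the card does not specify the actual WaRP procedure), weight-space rotation can be pictured as expressing a weight matrix in a rotated orthonormal basis, freezing the high-utility directions, and permitting updates only in the remaining subspace. A NumPy sketch under those assumptions, using an SVD basis as the rotation:

```python
import numpy as np

rng = np.random.default_rng(0)

# A toy weight matrix and a rotated orthonormal basis from its SVD.
W = rng.standard_normal((8, 8))
U, S, Vt = np.linalg.svd(W)

k = 3                # number of "high-utility" directions to freeze
U_keep = U[:, :k]    # frozen (important) basis directions
U_free = U[:, k:]    # directions left open for fine-tuning

# A candidate fine-tuning update, projected so it only moves the
# weights within the free (low-utility) subspace.
raw_update = rng.standard_normal(W.shape)
safe_update = U_free @ U_free.T @ raw_update

W_new = W + safe_update

# The frozen directions are untouched: projecting the applied change
# onto the kept basis is numerically zero.
leak = np.linalg.norm(U_keep.T @ (W_new - W))
print(f"leak into frozen subspace: {leak:.2e}")
```

Because `U_keep` and `U_free` are orthogonal blocks of the same orthonormal basis, the projected update cannot alter the frozen directions, which is the intuition behind fine-tuning "in the original space" while preserving a utility basis.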

Intended Use Cases

This model is particularly well-suited for applications where safety and controlled output generation are critical. Its fine-tuning for safety alignment suggests its utility in scenarios requiring reduced harmful or biased outputs, making it a candidate for:

  • Content moderation systems.
  • Applications requiring robust ethical AI considerations.
  • Generative tasks where safety is paramount.