kmseong/llama3.2_3b_instruct-WaRP-safety-basis-MATH-FT-lr5e-6

Text Generation · Concurrency Cost: 1 · Model Size: 3.2B · Quant: BF16 · Ctx Length: 32k · Published: Apr 11, 2026 · License: llama3.2 · Architecture: Transformer

kmseong/llama3.2_3b_instruct-WaRP-safety-basis-MATH-FT-lr5e-6 is a 3.2-billion-parameter instruction-tuned model with a 32768-token context length. It incorporates the Weight space Rotation Process (WaRP) for safety alignment and is fine-tuned for mathematical tasks: per-layer adjustments are applied to the attention (q, k, v) and MLP (up, down) components, followed by non-freeze training. This makes it suitable for applications that require robust mathematical reasoning alongside safety considerations.


Model Overview

kmseong/llama3.2_3b_instruct-WaRP-safety-basis-MATH-FT-lr5e-6 is a 3.2-billion-parameter instruction-tuned language model with a substantial 32768-token context length. Its core differentiator is the integration of the Weight space Rotation Process (WaRP), a technique designed for safety alignment. The model has been fine-tuned specifically for mathematical tasks.

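Assuming the checkpoint follows the standard Llama 3.2 Instruct layout on the Hugging Face Hub, it can be loaded with the transformers library as sketched below. Only the repository ID and BF16 dtype come from this card; the prompt and generation settings are illustrative assumptions.

```python
# Minimal sketch (not from the model card): load the checkpoint and run a math
# prompt with Hugging Face transformers.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "kmseong/llama3.2_3b_instruct-WaRP-safety-basis-MATH-FT-lr5e-6"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # matches the BF16 quantization listed above
    device_map="auto",
)

# Instruction-tuned Llama checkpoints ship a chat template; use it to format the request.
messages = [
    {"role": "user",
     "content": "Solve step by step: what is the sum of the first 50 positive integers?"},
]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output_ids = model.generate(input_ids, max_new_tokens=256, do_sample=False)
print(tokenizer.decode(output_ids[0][input_ids.shape[-1]:], skip_special_tokens=True))
```
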
Key Technical Details

  • Architecture: Based on a Llama 3.2 variant, instruction-tuned.
  • Parameter Count: 3.2 billion parameters.
  • Context Length: Supports up to 32768 tokens.
  • Safety Alignment: Utilizes the Weight space Rotation Process (WaRP) for enhanced safety.
  • Fine-tuning Focus: Fine-tuned for mathematical reasoning and problem-solving at a learning rate of 5e-6, per the repository name.
  • Training Methodology: Incorporates per-layer adjustments to the attention (q, k, v) and MLP (up, down) components, followed by a non-freeze training phase (see the sketch after this list).

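The card does not spell out the WaRP procedure itself, but the per-layer targets it names (attention q, k, v and MLP up, down) correspond to concrete projection modules in the standard transformers Llama implementation. The sketch below only shows how those weight groups could be collected per decoder layer and how a non-freeze pass leaves everything trainable; the rotation/basis step itself is not reproduced, and the module names are assumptions about the checkpoint's layout.

```python
# Hypothetical sketch: collect the per-layer weight groups named in the card
# (attention q/k/v and MLP up/down projections) from a Llama-style model, e.g.
# before applying a basis adjustment and then fine-tuning with nothing frozen.
from collections import defaultdict

from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained(
    "kmseong/llama3.2_3b_instruct-WaRP-safety-basis-MATH-FT-lr5e-6"
)

# Projection modules targeted per layer, per the card's description.
TARGET_SUFFIXES = ("q_proj", "k_proj", "v_proj", "up_proj", "down_proj")

# Group the targeted projection weights by decoder-layer index.
per_layer = defaultdict(dict)
for name, param in model.named_parameters():
    parts = name.split(".")
    if "layers" in parts and any(name.endswith(f"{s}.weight") for s in TARGET_SUFFIXES):
        layer_idx = int(parts[parts.index("layers") + 1])
        per_layer[layer_idx][parts[-2]] = param

print(f"collected {len(per_layer)} layers, e.g. layer 0 -> {sorted(per_layer[0])}")

# "Non-freeze" training: leave every parameter trainable for the subsequent
# MATH fine-tuning pass (learning rate 5e-6, per the repository name).
for param in model.parameters():
    param.requires_grad = True
```
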
Ideal Use Cases

  • Mathematical Applications: Excellent for tasks requiring numerical reasoning, calculations, and mathematical problem-solving.
  • Safety-Critical Environments: Suitable for applications where safety alignment is a primary concern, leveraging the WaRP methodology.
  • Instruction Following: Designed to follow instructions effectively due to its instruction-tuned nature.
  • Long Context Processing: Benefits from its large 32768 token context window for handling extensive inputs or conversations.
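
The 32768-token window is roomy for a 3B-class model; in practice, long inputs should be truncated against that limit rather than the tokenizer's default. A minimal sketch, assuming the standard transformers tokenizer for this repository:

```python
# Minimal sketch: truncate a long input against the 32768-token context window.
# The document text is a placeholder; the context length comes from this card.
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained(
    "kmseong/llama3.2_3b_instruct-WaRP-safety-basis-MATH-FT-lr5e-6"
)

long_document = "..."  # e.g. a lengthy problem set or proof to be analyzed

encoded = tokenizer(
    long_document,
    truncation=True,
    max_length=32768,  # matches the model's advertised context length
    return_tensors="pt",
)
print(encoded["input_ids"].shape)
```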