kmseong/llama3.2_3b_instruct-WaRP-safety-basis-MATH-FT-lr5e-6
kmseong/llama3.2_3b_instruct-WaRP-safety-basis-MATH-FT-lr5e-6 is a 3B-parameter, instruction-tuned model from the Llama 3.2 family with a 32,768-token context length. It incorporates the Weight space Rotation Process (WaRP) for safety alignment and has been fine-tuned for mathematical tasks: per-layer adjustments are applied to the attention (q, k, v) and MLP (up, down) projections, followed by non-freeze training. This makes it suited to applications that require both robust mathematical reasoning and safety-aware behavior.
Model Overview
kmseong/llama3.2_3b_instruct-WaRP-safety-basis-MATH-FT-lr5e-6 is a 3B-parameter, instruction-tuned language model from the Llama 3.2 family with a 32,768-token context length. Its distinguishing feature is the Weight space Rotation Process (WaRP), a weight-space technique for safety alignment, and, as the -MATH-FT-lr5e-6 suffix suggests, it has been further fine-tuned on mathematical tasks at a learning rate of 5e-6.
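The checkpoint can be loaded with the standard Hugging Face transformers API. The snippet below is a minimal sketch: the repository id comes from this card, while the dtype, device placement, generation settings, and example prompt are illustrative assumptions rather than published defaults.

```python
# Minimal loading and generation sketch; dtype, device_map, and the prompt
# are assumptions for illustration, not values published with this checkpoint.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "kmseong/llama3.2_3b_instruct-WaRP-safety-basis-MATH-FT-lr5e-6"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # assumed precision; use float32 on CPU
    device_map="auto",
)

messages = [
    {"role": "user", "content": "Solve for x: 3x + 7 = 22. Show your steps."},
]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=256)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```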
Key Technical Details
- Architecture: Based on a Llama 3.2 variant, instruction-tuned.
- Parameter Count: Approximately 3.2 billion (the 3B variant of Llama 3.2).
- Context Length: Supports up to 32,768 tokens.
- Safety Alignment: Utilizes the Weight space Rotation Process (WaRP) for enhanced safety.
- Fine-tuning Focus: Specifically optimized and fine-tuned for mathematical reasoning and problem-solving.
- Training Methodology: Applies per-layer adjustments to the attention (q, k, v) and MLP (up, down) projections, followed by a non-freeze training phase in which all weights remain trainable (see the sketch after this list).
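This card does not document the WaRP procedure itself, so the following is only an illustrative sketch of how the targeted modules and the non-freeze phase might be scoped in code. The module names (q_proj, k_proj, v_proj, up_proj, down_proj) follow the standard Llama implementation in transformers, the base checkpoint id is an assumption, and the rotation is an identity placeholder rather than the actual safety-basis computation.

```python
# Illustrative sketch only: scoping per-layer adjustments to the attention
# (q, k, v) and MLP (up, down) projections of a Llama-style model before a
# non-freeze fine-tuning pass. The rotation below is a placeholder; the real
# WaRP safety-basis computation is not described in this card.
import torch
from transformers import AutoModelForCausalLM

# Assumed base checkpoint for illustration.
model = AutoModelForCausalLM.from_pretrained("meta-llama/Llama-3.2-3B-Instruct")

TARGET_SUFFIXES = ("q_proj", "k_proj", "v_proj", "up_proj", "down_proj")

@torch.no_grad()
def apply_placeholder_rotation(linear: torch.nn.Linear) -> None:
    # Placeholder: WaRP would rotate the weight into a safety-aligned basis here.
    # An identity "rotation" keeps the sketch runnable without changing weights.
    rotation = torch.eye(linear.weight.shape[1], dtype=linear.weight.dtype)
    linear.weight.copy_(linear.weight @ rotation)

for name, module in model.named_modules():
    if name.endswith(TARGET_SUFFIXES) and isinstance(module, torch.nn.Linear):
        apply_placeholder_rotation(module)

# "Non-freeze" training: every parameter stays trainable for the MATH fine-tune.
for param in model.parameters():
    param.requires_grad = True
```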
Ideal Use Cases
- Mathematical Applications: Excellent for tasks requiring numerical reasoning, calculations, and mathematical problem-solving.
- Safety-Critical Environments: Suitable for applications where safety alignment is a primary concern, leveraging the WaRP methodology.
- Instruction Following: Designed to follow instructions effectively due to its instruction-tuned nature.
- Long Context Processing: The 32,768-token context window accommodates extensive inputs and long conversations (a simple length check is sketched below).
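When feeding long documents or multi-turn transcripts, it can help to check the tokenized length against the context limit before generation. The limit below comes from this card; the placeholder prompt is made up for illustration.

```python
# Guard against exceeding the 32,768-token context window stated above.
from transformers import AutoTokenizer

MAX_CONTEXT = 32768
tokenizer = AutoTokenizer.from_pretrained(
    "kmseong/llama3.2_3b_instruct-WaRP-safety-basis-MATH-FT-lr5e-6"
)

long_prompt = "Work through every problem in the attached exam.\n" * 2000  # dummy input
token_count = len(tokenizer(long_prompt)["input_ids"])
if token_count > MAX_CONTEXT:
    print(f"{token_count} tokens exceeds the {MAX_CONTEXT}-token window; split or truncate the input.")
```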