kmseong/llama3.2_3b_new_SSFT_lr3e-5_nowramupratio

TEXT GENERATIONConcurrency Cost:1Model Size:3.2BQuant:BF16Ctx Length:32kPublished:Apr 4, 2026License:llama3.2Architecture:Transformer Cold

The kmseong/llama3.2_3b_new_SSFT_lr3e-5_nowramupratio model is a 3.2 billion parameter Llama 3.2-based instruction-tuned language model developed by kmseong. It has undergone Phase 0 of Safety-WaRP (Weight space Rotation Process) using the Circuit Breakers dataset, focusing on base safety training. This model is specifically designed to provide safe responses by mitigating harmful outputs, serving as a foundational safety layer for further development. It is optimized for establishing safety mechanisms, though its general utility may be reduced at this stage.

Loading preview...

Overview

This model, kmseong/llama3.2_3b_new_SSFT_lr3e-5_nowramupratio, is a 3.2 billion parameter Llama 3.2-based instruction-tuned model developed by kmseong. It represents Phase 0 of the Safety-WaRP (Weight space Rotation Process) pipeline, specifically focusing on base safety training.

Key Capabilities

  • Enhanced Safety Responses: The model has been fine-tuned using the Circuit Breakers dataset to establish fundamental safety mechanisms, enabling it to refuse harmful or inappropriate prompts.
  • Foundation for Advanced Safety: It serves as the initial safety-trained base model for subsequent phases of the WaRP pipeline, which aim to restore utility while maintaining safety.
  • Memory-Efficient Training: Training utilized an 8-bit optimizer and gradient accumulation, making the process more memory-efficient.

Training Details

Phase 0 involved fine-tuning the meta-llama/Llama-3.2-3B-Instruct base model with 1000 samples from the Circuit Breakers safety dataset over 3 epochs. A cosine scheduler was used for the learning rate (1e-5 to 0).

Important Considerations

  • Utility Reduction: As a Phase 0 model, its primary focus is safety. Consequently, its general utility, particularly in areas like mathematics or reasoning, may be reduced compared to the original base model. Users seeking a balance of safety and utility are advised to consider models that have completed Phase 3 of the WaRP pipeline.

Next Steps in WaRP Pipeline

This model is part of a multi-phase safety training process:

  • Phase 1: Basis Construction (extracting basis vectors using SVD)
  • Phase 2: Importance Scoring (identifying important parameters)
  • Phase 3: Incremental Learning (restoring utility using datasets like GSM8K)