kmseong/llama3.2_3b_new_SSFT_lr5e-5

TEXT GENERATIONConcurrency Cost:1Model Size:3.2BQuant:BF16Ctx Length:32kPublished:Apr 6, 2026License:llama3.2Architecture:Transformer Cold

The kmseong/llama3.2_3b_new_SSFT_lr5e-5 model is a 3.2 billion parameter Llama 3.2-based language model developed by kmseong. It has undergone Phase 0 of Safety-WaRP (Weight space Rotation Process) training, specifically fine-tuned on the Circuit Breakers dataset to enhance safety responses. This model is designed to establish foundational safety mechanisms, making it suitable as a base for further safety and utility enhancements in the WaRP pipeline.

Loading preview...

Model Overview

This model, kmseong/llama3.2_3b_new_SSFT_lr5e-5, is a 3.2 billion parameter variant of the Llama 3.2-3B-Instruct base model. It represents Phase 0 of the Safety-WaRP (Weight space Rotation Process) pipeline, focusing on base safety training.

Key Capabilities & Training

  • Safety-Focused: The model has been fine-tuned using the Circuit Breakers dataset to build initial safety mechanisms and generate refusal responses to harmful prompts.
  • Training Method: Utilizes the Safety-WaRP methodology, specifically its initial phase for safety alignment.
  • Training Details: Trained for 3 epochs on 1000 samples from the Circuit Breakers dataset, employing gradient accumulation and an 8-bit optimizer.
  • Architecture: Based on Llama 3.2 architecture with 3.2B parameters and bfloat16 precision.

Intended Use & Limitations

  • Primary Use Case: Serves as a foundational model with enhanced safety responses, intended as a base for subsequent phases of the WaRP pipeline.
  • Current State: As a Phase 0 model, it has completed safety training. However, its utility in areas like mathematics or reasoning might be reduced. For a balanced model with both safety and utility, users are advised to consider models that have completed Phase 3 of the WaRP pipeline.
  • Next Steps: This model is a precursor to Phase 1 (Basis Construction), Phase 2 (Importance Scoring), and Phase 3 (Incremental Learning for utility restoration).