Overview
This model, kmseong/llama3.2_3b_new_SSFT_lr3e-5_nowramupratio, is a 3-billion-parameter instruction-tuned model based on Llama 3.2, developed by kmseong. It represents Phase 0 of the Safety-WaRP (Weight space Rotation Process) pipeline, which focuses on base safety training.
Key Capabilities
- Enhanced Safety Responses: The model has been fine-tuned using the Circuit Breakers dataset to establish fundamental safety mechanisms, enabling it to refuse harmful or inappropriate prompts.
- Foundation for Advanced Safety: It serves as the initial safety-trained base model for subsequent phases of the WaRP pipeline, which aim to restore utility while maintaining safety.
- Memory-Efficient Training: Training utilized an 8-bit optimizer and gradient accumulation, making the process more memory-efficient.
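The memory saving from gradient accumulation comes from processing a large batch as several smaller micro-batches and summing their gradients before each optimizer step. A minimal NumPy sketch with a toy linear model (illustrative only, not the actual training code) shows that the accumulated gradient matches the full-batch gradient:

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(32, 4))  # full batch of 32 examples
y = rng.normal(size=32)
w = np.zeros(4)

def grad(Xb, yb, w):
    # Gradient of mean squared error for a linear model y_hat = Xb @ w.
    return 2 * Xb.T @ (Xb @ w - yb) / len(yb)

# Full-batch gradient computed in one pass.
g_full = grad(X, y, w)

# The same gradient accumulated over 4 micro-batches of 8 examples:
# only one micro-batch needs to be held in memory at a time.
accum = np.zeros_like(w)
for i in range(0, 32, 8):
    accum += grad(X[i:i+8], y[i:i+8], w) / 4  # scale by micro-batch count

assert np.allclose(g_full, accum)
```

The same equivalence is what lets frameworks trade activation memory for extra forward/backward passes without changing the effective batch size.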
Training Details
Phase 0 fine-tuned the meta-llama/Llama-3.2-3B-Instruct base model on 1,000 samples from the Circuit Breakers safety dataset for 3 epochs. The learning rate followed a cosine schedule, decaying from 1e-5 to 0.
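The cosine decay described above can be sketched with its underlying formula (a minimal sketch assuming zero warmup steps; the actual training presumably used a library scheduler such as Hugging Face's `get_cosine_schedule_with_warmup`):

```python
import math

def cosine_lr(step, total_steps, lr_max=1e-5, lr_min=0.0):
    """Cosine decay from lr_max at step 0 to lr_min at the final step."""
    progress = step / total_steps
    return lr_min + 0.5 * (lr_max - lr_min) * (1 + math.cos(math.pi * progress))

print(cosine_lr(0, 100))    # 1e-05 (initial learning rate)
print(cosine_lr(100, 100))  # 0.0   (fully decayed)
```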
Important Considerations
- Utility Reduction: As a Phase 0 model, its primary focus is safety. Consequently, its general utility, particularly on tasks like mathematics or reasoning, may be reduced relative to the original base model. Users who need a balance of safety and utility should prefer models that have completed Phase 3 of the WaRP pipeline.
Next Steps in WaRP Pipeline
This model is part of a multi-phase safety training process:
- Phase 1: Basis Construction (extracting basis vectors using SVD)
- Phase 2: Importance Scoring (identifying important parameters)
- Phase 3: Incremental Learning (restoring utility using datasets like GSM8K)
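Phase 1's basis construction can be illustrated with a toy SVD. This is a hypothetical sketch on a random stand-in matrix; the actual WaRP procedure operates on weights or activations of the full model, and the choice of matrix and rank here is purely illustrative:

```python
import numpy as np

rng = np.random.default_rng(0)
# Stand-in for a weight (or activation) matrix from one layer.
W = rng.normal(size=(64, 16))

# SVD factors W into U @ diag(S) @ Vt; the columns of U form an
# orthonormal basis for W's column space, ordered by singular value.
U, S, Vt = np.linalg.svd(W, full_matrices=False)

k = 8                 # keep only the top-k directions
basis = U[:, :k]      # extracted basis vectors for later phases

# Sanity check: the retained basis is orthonormal.
assert np.allclose(basis.T @ basis, np.eye(k))
```

Later phases would then score parameters along these directions (Phase 2) and fine-tune on utility data such as GSM8K while protecting the important ones (Phase 3).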