Model Overview
This model, kmseong/llama3.2_3b_SSFT_epoch5_adam, is a 3-billion-parameter language model based on Llama 3.2, developed by Min-Seong Kim. It represents Phase 0 of the Safety-WaRP (Weight space Rotation Process) pipeline, which focuses on establishing core safety mechanisms.
Key Characteristics
- Base Model: Built upon meta-llama/Llama-3.2-3B-Instruct.
- Safety Training: Underwent "Base Safety Training" using the Circuit Breakers dataset.
- Methodology: Utilizes the Safety-WaRP technique to build safety directly into the model's weight space.
- Training Details: Fine-tuned with 1000 samples over 3 epochs, employing gradient accumulation and an 8-bit optimizer.
- Context Length: Supports a context length of 32768 tokens.
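A checkpoint like this can be loaded and queried with the Hugging Face transformers library. The sketch below is illustrative, not part of the model card's official instructions: the repo id is taken from the card, while the helper names and generation settings are assumptions, and `transformers`/`torch` must be installed.

```python
# Hedged sketch: loading the Phase 0 checkpoint with Hugging Face transformers.
# MODEL_ID comes from the model card; everything else here is illustrative.
MODEL_ID = "kmseong/llama3.2_3b_SSFT_epoch5_adam"


def build_chat(prompt: str) -> list:
    """Wrap a user prompt in the message format used by Llama chat templates."""
    return [{"role": "user", "content": prompt}]


def generate(prompt: str, max_new_tokens: int = 256) -> str:
    """Run greedy generation on a single prompt (assumed settings)."""
    # Imported lazily so the lightweight helpers above work without the
    # heavy dependencies installed.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(MODEL_ID, device_map="auto")
    inputs = tokenizer.apply_chat_template(
        build_chat(prompt), add_generation_prompt=True, return_tensors="pt"
    ).to(model.device)
    output = model.generate(inputs, max_new_tokens=max_new_tokens, do_sample=False)
    # Decode only the newly generated tokens, not the prompt.
    return tokenizer.decode(output[0][inputs.shape[-1]:], skip_special_tokens=True)
```

Greedy decoding (`do_sample=False`) is chosen here only to make refusal behaviour reproducible across runs; sampling parameters can be swapped in as needed.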
Purpose and Usage
This Phase 0 model is primarily designed to provide a safe foundational model by integrating refusal capabilities for harmful prompts. While its safety responses are enhanced, its utility (e.g., mathematical or reasoning abilities) may be reduced at this stage. It serves as a prerequisite for the subsequent phases (Phase 1: Basis Construction, Phase 2: Importance Scoring, Phase 3: Incremental Learning), which aim to restore utility and balance it with safety.
When to Use This Model
- As a starting point for further safety and utility fine-tuning within the WaRP pipeline.
- For applications where basic safety and refusal of harmful content are paramount, and advanced reasoning is not the primary requirement.
- For developers looking to experiment with or understand the initial safety training phase of the Safety-WaRP methodology.
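Since refusal of harmful prompts is the defining property of this phase, a rough spot-check of model outputs can be sketched as follows. The marker phrases are a heuristic assumption on my part, not part of the Safety-WaRP methodology; a proper evaluation would use a safety benchmark rather than string matching.

```python
# Hedged sketch: a crude heuristic for flagging refusal-style responses.
# The phrase list below is an assumption, not derived from the model card.
REFUSAL_MARKERS = (
    "i can't", "i cannot", "i won't",
    "i'm sorry", "i am sorry", "i'm unable",
)


def looks_like_refusal(response: str) -> bool:
    """Heuristically flag a response whose opening contains a refusal phrase."""
    head = response.strip().lower()[:80]  # only inspect the start of the reply
    return any(marker in head for marker in REFUSAL_MARKERS)
```

In practice you would run a batch of known-harmful and known-benign prompts through the model and check that the refusal rate is high for the former and low for the latter.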