failspy/Phi-3-mini-128k-instruct-abliterated-v3

Text Generation · Concurrency Cost: 1 · Model Size: 4B · Quant: BF16 · Ctx Length: 4k · Published: May 26, 2024 · License: MIT · Architecture: Transformer

failspy/Phi-3-mini-128k-instruct-abliterated-v3 is a 4-billion-parameter instruction-tuned causal language model based on Microsoft's Phi-3-mini-128k-instruct. The model has undergone an "abliteration" process, in which its bfloat16 safetensors weights are orthogonalized to specifically inhibit refusal behaviors. It retains the original model's capabilities while aiming for an uncensored response style, making it suitable for applications that require direct answers without ethical lecturing.


Phi-3-mini-128k-instruct-abliterated-v3 Overview

This model, developed by failspy, is a modified version of Microsoft's Phi-3-mini-128k-instruct. It features 4 billion parameters and has been processed using a refined "abliteration" methodology. This technique involves orthogonalizing specific bfloat16 safetensor weights to inhibit the model's tendency to express refusal, based on research into refusal directions in LLMs.

Key Characteristics & Methodology

  • "Abliterated" for Uncensored Responses: The core differentiator is the manipulation of weights to reduce refusal behaviors, aiming for a more direct and uncensored interaction style without altering other core functionalities.
  • Orthogonalization: This surgical technique modifies specific features (like refusal) with significantly less data than traditional fine-tuning, preserving the original model's knowledge and training.
  • Stability: Despite the modifications, the model is generally as stable as the original Phi-3-mini-128k-instruct, though it may exhibit a slightly higher propensity for hallucination.
  • Experimental Nature: As the methodology is new, users are encouraged to report any "quirks" or unexpected behaviors to help refine the process.
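To make the orthogonalization idea concrete, here is a minimal toy sketch (not the author's actual pipeline; the matrix, direction, and dimensions are made-up stand-ins). Given a unit "refusal direction" in a model's residual stream, a weight matrix that writes into that stream can be modified so its outputs have no component along that direction:

```python
import numpy as np

rng = np.random.default_rng(0)
d = 8                              # toy hidden size (real models use thousands)
W = rng.standard_normal((d, d))    # stand-in for an output-projection weight
r = rng.standard_normal(d)
r /= np.linalg.norm(r)             # hypothetical unit "refusal direction"

# Project the refusal direction out of everything W writes:
# W' = (I - r r^T) W
W_abliterated = W - np.outer(r, r) @ W

# Any input now yields an output orthogonal to the refusal direction.
x = rng.standard_normal(d)
print(abs(r @ (W_abliterated @ x)))  # ~0
```

Because only the component along one direction is removed, the rest of the weight matrix (and hence the model's broader knowledge) is left untouched, which is why the technique needs far less data than fine-tuning.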

When to Consider This Model

  • Direct, Unfiltered Responses: Ideal for use cases where the primary goal is to receive direct answers without the model lecturing on ethics or safety.
  • Exploration of Ablation Techniques: Suited to developers interested in experimenting with, or building upon, novel weight-manipulation methods that target specific behavioral changes.
  • Research into LLM Behavior Modification: Useful for studying the effects of orthogonalization on model outputs and exploring its potential for targeted feature removal or augmentation.
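For researchers studying this kind of behavior modification, the refusal-direction work referenced above typically estimates the direction as the normalized difference between mean activations on contrasting prompt sets. A toy numpy sketch of that estimation step, with synthetic activations standing in for real residual-stream captures (the data, dimensions, and shift are illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(1)
d = 8  # toy hidden size

# Synthetic stand-ins for activations collected at one layer; a real
# pipeline would run harmful vs. harmless prompts through the model.
# Here the "refusing" activations are artificially shifted along axis 0.
refusing_acts = rng.standard_normal((32, d)) + 2.0 * np.eye(d)[0]
complying_acts = rng.standard_normal((32, d))

# Candidate refusal direction: difference of means, normalized.
r = refusing_acts.mean(axis=0) - complying_acts.mean(axis=0)
r /= np.linalg.norm(r)
```

The resulting unit vector is what a weight-orthogonalization pass would then project out of the model's matrices.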