Name: Undi95/Phi4-abliterated API
Brand: Featherless.ai
Price: 10.00 USD
Availability: InStock
Author: Undi95

Undi95/Phi4-abliterated: A Neutral Foundation Model

Undi95/Phi4-abliterated is a 14.7-billion-parameter model derived from the Phi4 architecture and developed by Undi95. It uses a novel "abliteration" methodology whose goal is a neutral model: one that does not refuse neutral prompts, as opposed to a fully uncensored one. The model is intended as a starting point for further fine-tuning toward a desired balance between reduced censorship and usability.

Key Differentiators & Methodology

Unlike previous abliteration attempts that applied a uniform refusal direction across all layers, this model introduces a refined approach:

Layer-Specific Refusal Directions: Each layer computes and applies its own refusal direction, preventing the loss of usability and intelligence observed in earlier methods.
Targeted Tensor Modification: The refusal direction is specifically applied to four key tensors within each layer (o_proj.weight, down_proj.weight, post_attention_layernorm.weight, input_layernorm.weight).

This targeted application lets the model retain more specificity and functionality, avoiding the over-generalization that degraded performance in earlier attempts. There is a trade-off: applying the refusal direction too strongly increases neutrality but can reduce intelligence, which is why subsequent fine-tuning is needed.
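To make the per-layer idea concrete, here is a minimal, dependency-free sketch. It is not the author's code. It assumes the refusal direction for a layer is the normalized difference between the mean activation on refused prompts and the mean activation on neutral prompts (a common choice in abliteration work, not confirmed by this page), and that the direction is projected out of each tensor. How the direction is applied to the 1-D layernorm weights is likewise an assumption, and the function names and the `strength` parameter are hypothetical.

```python
import math

# The four per-layer tensors named in the description above.
MATRIX_TENSORS = ("o_proj.weight", "down_proj.weight")
VECTOR_TENSORS = ("post_attention_layernorm.weight", "input_layernorm.weight")


def mean_vector(rows):
    """Column-wise mean of a list of equal-length activation vectors."""
    return [sum(col) / len(rows) for col in zip(*rows)]


def refusal_direction(refused_acts, neutral_acts):
    """ASSUMPTION: unit vector from the neutral mean to the refused mean."""
    diff = [r - n for r, n in zip(mean_vector(refused_acts), mean_vector(neutral_acts))]
    norm = math.sqrt(sum(x * x for x in diff))
    if norm == 0.0:
        raise ValueError("refused and neutral activations have the same mean")
    return [x / norm for x in diff]


def ablate_matrix(weight, direction, strength=1.0):
    """W' = W - strength * d (d^T W), for W stored as rows of length d_in.

    Rows index the output (hidden) dimension, so this removes the part of
    the layer's output that lies along `direction`.
    """
    n_rows, n_cols = len(weight), len(weight[0])
    proj = [sum(direction[i] * weight[i][j] for i in range(n_rows)) for j in range(n_cols)]
    return [
        [weight[i][j] - strength * direction[i] * proj[j] for j in range(n_cols)]
        for i in range(n_rows)
    ]


def ablate_vector(weight, direction, strength=1.0):
    """ASSUMPTION: for 1-D weights, w' = w - strength * (w . d) d."""
    dot = sum(w * d for w, d in zip(weight, direction))
    return [w - strength * dot * d for w, d in zip(weight, direction)]


def abliterate_layers(layers, activations, strength=1.0):
    """Apply a separate refusal direction to each layer's four tensors.

    `layers` is a list of dicts keyed by tensor name; `activations` is a
    parallel list of (refused_acts, neutral_acts) pairs, one per layer.
    """
    result = []
    for tensors, (refused, neutral) in zip(layers, activations):
        d = refusal_direction(refused, neutral)  # computed per layer, not shared
        edited = dict(tensors)
        for name in MATRIX_TENSORS:
            edited[name] = ablate_matrix(tensors[name], d, strength)
        for name in VECTOR_TENSORS:
            edited[name] = ablate_vector(tensors[name], d, strength)
        result.append(edited)
    return result
```

In this sketch, `strength` stands in for the trade-off described above: values near 1.0 remove the direction entirely, while smaller values remove less of it and leave more of the original weights intact.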

Use Cases & Next Steps

This abliterated model is primarily designed as a neutral starting point for developers. Fine-tuning is crucial to:

Adjust the model to reduce over-censoring.
Maintain a balance between neutrality and overall usability and intelligence.

It provides a flexible base for creating models with customized censorship profiles.
