Undi95/Phi4-abliterated

Text Generation | Concurrency Cost: 1 | Model Size: 14.7B | Quant: FP8 | Ctx Length: 32k | Published: Jan 9, 2025 | Architecture: Transformer

Undi95/Phi4-abliterated is a 14.7-billion-parameter language model based on the Phi4 architecture, developed by Undi95. It has been modified with a novel "abliteration" methodology to give a more neutral response profile: the aim is to stop the model refusing neutral prompts, not to make it fully uncensored. It has a 32,768-token context length and is intended as a base for fine-tuning that balances reduced censorship with usability and intelligence.

Undi95/Phi4-abliterated: A Neutral Foundation Model

Undi95/Phi4-abliterated is a 14.7-billion-parameter model derived from the Phi4 architecture, developed by Undi95. Its novel "abliteration" methodology targets neutrality: the model should not refuse neutral prompts, but it is not meant to be fully uncensored. It is intended as a starting point for further fine-tuning toward the desired balance between reduced censorship and usability.

Key Differentiators & Methodology

Unlike previous abliteration attempts that applied a uniform refusal direction across all layers, this model introduces a refined approach:

  • Layer-Specific Refusal Directions: A separate refusal direction is computed for each layer and applied only to that layer, which avoids the loss of usability and intelligence observed with earlier methods.
  • Targeted Tensor Modification: The refusal direction is applied to four tensors within each layer: o_proj.weight, down_proj.weight, post_attention_layernorm.weight, and input_layernorm.weight.

This targeted application lets the model retain more specificity and functionality, avoiding the over-generalization that degraded performance in earlier attempts. There is still a trade-off: applying the refusal direction more strongly increases neutrality but can reduce intelligence, which is why subsequent fine-tuning is needed.
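The per-layer idea above can be illustrated with a small, dependency-free sketch. This is not the author's code. It assumes the common formulation of abliteration, in which a layer's refusal direction is the normalized difference between the mean activations on refused and accepted prompts, and that direction is then projected out of a matrix that writes into the residual stream (such as an o_proj- or down_proj-style weight). The function names, the `scale` parameter, and the random toy data are all hypothetical. How the direction is applied to the one-dimensional layer-norm weights is not described here, so the sketch leaves them out.

```python
import math
import random


def mean_vector(rows):
    """Column-wise mean of a list of equal-length vectors."""
    return [sum(col) / len(rows) for col in zip(*rows)]


def refusal_direction(refused_acts, accepted_acts):
    """Unit vector along the difference of mean activations at one layer."""
    diff = [a - b for a, b in zip(mean_vector(refused_acts),
                                  mean_vector(accepted_acts))]
    norm = math.sqrt(sum(x * x for x in diff))
    return [x / norm for x in diff]


def component_along(weight, direction):
    """Per-column component of the weight's output along `direction`."""
    return [sum(d * row[j] for d, row in zip(direction, weight))
            for j in range(len(weight[0]))]


def ablate(weight, direction, scale=1.0):
    """Remove (a fraction of) the output component along `direction`.

    `weight` is a d_model x d_in matrix stored as a list of rows.
    `scale` is a hypothetical strength knob: 1.0 removes the component
    entirely, smaller values remove proportionally less.
    """
    proj = component_along(weight, direction)
    return [[w - scale * d * p for w, p in zip(row, proj)]
            for d, row in zip(direction, weight)]


rng = random.Random(0)
n_layers, d_model, d_in, n_prompts = 3, 6, 4, 20


def random_matrix(n_rows, n_cols):
    return [[rng.gauss(0, 1) for _ in range(n_cols)] for _ in range(n_rows)]


directions, originals, ablated = [], [], []
for layer in range(n_layers):
    # Toy stand-ins for activations captured at this layer (not model data).
    shift = [rng.gauss(0, 1) for _ in range(d_model)]
    refused = [[x + s for x, s in zip(row, shift)]
               for row in random_matrix(n_prompts, d_model)]
    accepted = random_matrix(n_prompts, d_model)

    r = refusal_direction(refused, accepted)  # this layer's own direction
    w = random_matrix(d_model, d_in)          # toy stand-in for a weight
    directions.append(r)
    originals.append(w)
    ablated.append(ablate(w, r))

for layer, (r, w) in enumerate(zip(directions, ablated)):
    leftover = max(abs(c) for c in component_along(w, r))
    print(f"layer {layer}: largest remaining component = {leftover:.1e}")
```

Each layer gets its own direction, unlike the single shared direction of earlier attempts. Passing a `scale` below 1.0 removes only part of the component, which is one way to picture the neutrality versus intelligence trade-off described above.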

Use Cases & Next Steps

This abliterated model is primarily designed as a neutral starting point for developers. Fine-tuning is crucial to:

  • Adjust the model to reduce over-censoring.
  • Maintain a balance between neutrality and overall usability and intelligence.

It provides a flexible base for creating models with customized censorship profiles.