noahoksuz/Holo-3.1-4B-uncensored-heretic

VISION | Concurrency Cost: 1 | Model Size: 4.5B | Quant: BF16 | Ctx Length: 32k | Tool Calling: Supported | Published: Jun 7, 2026 | License: apache-2.0 | Architecture: Transformer | Open Weights | Cold

noahoksuz/Holo-3.1-4B-uncensored-heretic is a 4.5 billion parameter causal language model based on the Holo-3.1-4B architecture, developed by noahoksuz. The model has been decensored with the Heretic tool, which uses directional ablation to cut the refusal rate from 99% to 3% on a 100-prompt test set while keeping the KL divergence from the original model at 0.0963. It is released for security research: studying refusal mechanisms in LLMs and red-teaming alignment strategies.


Overview of Holo-3.1-4B-uncensored-heretic

This model, developed by noahoksuz, is a 4.5 billion parameter variant of the Holo-3.1-4B architecture. Its primary distinction is that its refusal behavior has been removed with the Heretic tool, which applies directional ablation (also called "abliteration"). Heretic automatically searches for intervention parameters that minimize two quantities together: the refusal rate, and the KL divergence from the original model. Keeping the KL divergence low is what is meant to preserve the original model's capabilities.
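
To make the two quantities in that trade-off concrete, here is a minimal sketch in plain Python. It is not Heretic's code. The marker list, the function names, and the direction of the KL divergence (original relative to modified, computed over next-token logits) are all assumptions chosen for illustration.

```python
import math

# Hypothetical refusal markers; a real evaluation would use a longer,
# carefully chosen list or a classifier.
REFUSAL_MARKERS = ("i can't", "i cannot", "i'm sorry")


def refusal_rate(responses):
    """Fraction of responses that contain any refusal marker."""
    refused = sum(
        any(marker in response.lower() for marker in REFUSAL_MARKERS)
        for response in responses
    )
    return refused / len(responses)


def log_softmax(logits):
    """Numerically stable log-softmax over a list of logits."""
    m = max(logits)
    log_z = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - log_z for x in logits]


def kl_divergence(original_logits, modified_logits):
    """KL(P_original || P_modified) for one next-token distribution.

    Assumption: both logit vectors cover the same vocabulary in the
    same order.
    """
    log_p = log_softmax(original_logits)
    log_q = log_softmax(modified_logits)
    return sum(math.exp(lp) * (lp - lq) for lp, lq in zip(log_p, log_q))
```

In a setup like this, a search over intervention parameters would score each candidate by both numbers: the refusal rate measured on prompts the original model refuses, and the KL divergence averaged over prompts the original model answers normally.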

Key Capabilities and Features

  • Significantly Reduced Refusal Rate: On a 100-prompt test set, the refusal rate dropped from 99% to 3%.
  • Capability Preservation: The KL divergence between the decensored model and the original is 0.0963. The card treats KL < 0.5 as indicating minimal degradation, so the original capabilities should be largely intact.
  • Methodology: The decensoring process computes "refusal directions" from differences in the residual stream, then orthogonalizes the projection matrices that write into the residual stream (attn.o_proj, mlp.down_proj) with respect to these directions.
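
The methodology in the last bullet can be sketched on toy data in plain Python. This is an illustration under stated assumptions, not Heretic's implementation: it assumes the refusal direction is the normalized difference of mean residual activations between two prompt sets, and that each weight matrix is stored as (d_model, d_in) so its rows index residual-stream dimensions. All function names are hypothetical.

```python
import math


def mean_vector(rows):
    """Column-wise mean of a list of equal-length activation vectors."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]


def refusal_direction(refused_acts, answered_acts):
    """Unit vector along the difference of mean residual activations.

    Assumption: difference-of-means is used to find the direction.
    """
    mean_refused = mean_vector(refused_acts)
    mean_answered = mean_vector(answered_acts)
    diff = [a - b for a, b in zip(mean_refused, mean_answered)]
    norm = math.sqrt(sum(x * x for x in diff))
    return [x / norm for x in diff]


def orthogonalize_against(weight, direction):
    """Return W' = W - r (r^T W) for a unit vector r.

    weight: list of rows, shape (d_model, d_in).
    After this, W' x has no component along r for any input x.
    """
    d_model, d_in = len(weight), len(weight[0])
    # r^T W: how strongly each input column writes along the direction.
    proj = [
        sum(direction[i] * weight[i][j] for i in range(d_model))
        for j in range(d_in)
    ]
    return [
        [weight[i][j] - direction[i] * proj[j] for j in range(d_in)]
        for i in range(d_model)
    ]


def matvec(weight, x):
    """Plain matrix-vector product."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in weight]


# Toy example: 3-dimensional residual stream, 2-dimensional layer input.
refused = [[2.0, 0.0, 1.0], [4.0, 0.0, 1.0]]
answered = [[0.0, 0.0, 1.0], [0.0, 0.0, 1.0]]
r = refusal_direction(refused, answered)

w = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
w_ablated = orthogonalize_against(w, r)
y = matvec(w_ablated, [1.0, 1.0])
component_along_r = sum(ri * yi for ri, yi in zip(r, y))
```

In the toy example, `component_along_r` comes out as zero up to floating-point error, which is the property the orthogonalization is meant to enforce. A real implementation would apply the same projection to tensors in a library such as PyTorch, per layer, and may scale the subtracted term rather than removing it entirely.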

Intended Use Cases

This model is released specifically for security research purposes, including:

  • Studying refusal mechanisms in large language models.
  • Red-teaming alignment strategies.
  • Developing more robust safeguards for AI systems.

Users are responsible for complying with applicable laws and ethical guidelines when using this model.