llmfan46/GLM-4-32B-0414-uncensored-heretic-v1
Text Generation · Concurrency Cost: 2 · Model Size: 32B · Quant: FP8 · Context Length: 32k · Published: Mar 17, 2026 · License: MIT · Architecture: Transformer

llmfan46/GLM-4-32B-0414-uncensored-heretic-v1 is a decensored version of zai-org/GLM-4-32B-0414, created by llmfan46 with the Heretic v1.2.0 tool using the Arbitrary-Rank Ablation (ARA) method. This 32-billion-parameter model reduces refusals by 90% (10/100 versus 100/100 for the original) while preserving the base model's quality, with a low KL divergence of 0.0200. It is optimized for instruction following, engineering code, artifact generation, function calling, search-based Q&A, and report generation, where it achieves performance comparable to larger models such as GPT-4o and DeepSeek-V3-0324.


Overview

This model, llmfan46/GLM-4-32B-0414-uncensored-heretic-v1, is a decensored variant of the original zai-org/GLM-4-32B-0414. llmfan46 produced it with the Heretic v1.2.0 tool and the Arbitrary-Rank Ablation (ARA) method; its primary distinction is a 90% reduction in refusals (10/100, compared to 100/100 for the original) while maintaining model quality, with a KL divergence of 0.0200 from the base model.
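
Because the ablation preserves the original architecture, the model should load like any standard causal LM. The snippet below is a minimal sketch, not from the source: the repo ID is taken from this card, the prompt and generation settings are illustrative, and it assumes a Transformers version with native GLM-4 support (older releases may need trust_remote_code=True).

```python
# Minimal inference sketch. Assumes the checkpoint follows the standard
# causal-LM layout and your transformers version supports GLM-4 natively.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "llmfan46/GLM-4-32B-0414-uncensored-heretic-v1"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype="auto",   # use the dtype recorded in the checkpoint config
    device_map="auto",    # shard the 32B weights across available GPUs
)

messages = [{"role": "user", "content": "Explain how a hash map works."}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output = model.generate(input_ids, max_new_tokens=256)
print(tokenizer.decode(output[0][input_ids.shape[-1]:], skip_special_tokens=True))
```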

Key Capabilities

  • Decensored Output: Significantly fewer content restrictions and refusals.
  • Function Calling: Supports external tool calls in JSON format, demonstrated with examples for HuggingFace Transformers (a minimal sketch follows this list).
  • Code Generation: Achieves strong performance in engineering code and artifact generation, including animation and web design.
  • Search-Based Q&A: Excels in generating detailed analytical reports based on provided search results.
  • Reasoning: The base GLM-4-32B-0414 model, on which this variant is built, was pre-trained on 15T tokens of high-quality data, including substantial reasoning-type synthetic data.
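
For the function-calling capability above, one hedged sketch of how tool schemas can be supplied through Transformers follows. Passing Python functions via the `tools=` argument of `apply_chat_template` is standard Transformers API; the exact JSON tool call the model emits is governed by GLM-4's chat template, so parsing the model's reply (not shown) is left as an assumption. The `get_weather` helper is hypothetical, not part of this model card.

```python
# Hedged function-calling sketch: the tokenizer converts the function's
# signature and docstring into a JSON schema and injects it into the prompt.
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained(
    "llmfan46/GLM-4-32B-0414-uncensored-heretic-v1"
)

def get_weather(city: str) -> str:
    """
    Return the current weather for a city.

    Args:
        city: Name of the city to look up.
    """
    ...

messages = [{"role": "user", "content": "What's the weather in Berlin?"}]

prompt = tokenizer.apply_chat_template(
    messages,
    tools=[get_weather],        # schema is derived from the type hints/docstring
    add_generation_prompt=True,
    tokenize=False,
)
print(prompt)  # inspect how the template serializes the tool definition
```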

Performance Highlights

  • Refusals: 10/100 (compared to 100/100 for the original).
  • KL Divergence: 0.0200 (indicating high preservation of the original model's capabilities; a hypothetical measurement sketch follows this list).
  • Benchmarks: Outperforms Qwen2.5-Max and DeepSeek-R1 on IFEval, BFCL-v3, TAU-Bench, SimpleQA, and HotpotQA. Achieves 33.8 on SWE-bench Verified with Moatless.
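
The KL-divergence figure quantifies how far the ablated model's next-token distribution drifts from the original's on benign input. The card does not describe Heretic's actual evaluation harness, so the sketch below is purely illustrative: the prompt, the single-position comparison, and the direction KL(original || ablated) are all assumptions.

```python
# Hypothetical KL-divergence check between the base and ablated models.
# NOTE: illustrative only; Heretic's real harness likely averages over a
# prompt set rather than a single position.
import torch
import torch.nn.functional as F
from transformers import AutoModelForCausalLM, AutoTokenizer

base_id = "zai-org/GLM-4-32B-0414"
ablated_id = "llmfan46/GLM-4-32B-0414-uncensored-heretic-v1"

tokenizer = AutoTokenizer.from_pretrained(base_id)
base = AutoModelForCausalLM.from_pretrained(base_id, torch_dtype="auto", device_map="auto")
ablated = AutoModelForCausalLM.from_pretrained(ablated_id, torch_dtype="auto", device_map="auto")

enc = tokenizer("The capital of France is", return_tensors="pt")
with torch.no_grad():
    # Next-token log-probabilities at the final position, in float32 for stability.
    log_p = F.log_softmax(base(**enc.to(base.device)).logits[0, -1].float(), dim=-1)
    log_q = F.log_softmax(ablated(**enc.to(ablated.device)).logits[0, -1].float(), dim=-1)

# KL(original || ablated) over the vocabulary; a value near zero means the
# ablation barely shifted the model's predictions on benign input.
kl = torch.sum(log_p.exp() * (log_p - log_q.to(log_p.device)))
print(f"KL divergence: {kl.item():.4f}")
```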

Good For

  • Use cases requiring reduced content restrictions and more direct responses.
  • Code generation and function calling applications.
  • Complex Q&A and report generation tasks leveraging search results.
  • Scenarios where preserving the original model's quality while enhancing uncensored behavior is critical.