llmfan46/GLM-4-32B-0414-uncensored-heretic-v1 is a decensored version of the zai-org/GLM-4-32B-0414 model, created by llmfan46 with the Heretic v1.2.0 tool using the Arbitrary-Rank Ablation (ARA) method. This 32-billion-parameter model reduces refusals by 90% (10/100 vs. 100/100 for the original) while preserving the original model's quality, with a low KL divergence of 0.0200. It is optimized for instruction following, engineering code, artifact generation, function calling, search-based Q&A, and report generation, achieving performance comparable to larger models such as GPT-4o and DeepSeek-V3-0324 in these areas.
Overview
This model, llmfan46/GLM-4-32B-0414-uncensored-heretic-v1, is a decensored variant of the original zai-org/GLM-4-32B-0414 model. Developed by llmfan46 using the Heretic v1.2.0 tool and the Arbitrary-Rank Ablation (ARA) method, its primary distinction is a 90% reduction in refusals (10/100 compared to 100/100 for the original) while maintaining model quality with a KL divergence of 0.0200.
Key Capabilities
- Decensored Output: Significantly fewer content restrictions and refusals.
- Function Calling: Supports external tool calls in JSON format, demonstrated with examples for HuggingFace Transformers.
- Code Generation: Achieves strong performance in engineering code and artifact generation, including animation and web design.
- Search-Based Q&A: Excels in generating detailed analytical reports based on provided search results.
- Reasoning: The base GLM-4-32B-0414 model, on which this variant is built, was pre-trained on 15T tokens of high-quality data, including substantial reasoning-oriented synthetic data.
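For the function-calling capability above, the model emits tool calls as JSON objects that the application must parse and dispatch. A minimal sketch of that parsing step (the `name`/`arguments` field names and the `get_weather` tool are illustrative assumptions, not the confirmed GLM-4 schema; consult the model's chat template for the exact format):

```python
import json

# Hypothetical raw model output containing a single JSON tool call.
# Field names ("name", "arguments") are an assumed schema for illustration.
raw_output = '{"name": "get_weather", "arguments": {"city": "Berlin", "unit": "celsius"}}'

def parse_tool_call(text: str) -> tuple[str, dict]:
    """Decode a JSON tool call into (function_name, keyword_arguments)."""
    call = json.loads(text)
    return call["name"], call["arguments"]

name, kwargs = parse_tool_call(raw_output)
print(name)            # get_weather
print(kwargs["city"])  # Berlin
```

In practice the parsed name would be looked up in a registry of callable tools and invoked with the decoded arguments, with the result appended back into the conversation.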
Performance Highlights
- Refusals: 10/100 (compared to 100/100 for the original).
- KL Divergence: 0.0200 (indicating high preservation of original model capabilities).
- Benchmarks (reported for the base GLM-4-32B-0414 model): Outperforms Qwen2.5-Max and DeepSeek-R1 on IFEval, BFCL-v3, TAU-Bench, SimpleQA, and HotpotQA. Achieves 33.8 on SWE-bench Verified with Moatless.
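The KL divergence figure measures how far the decensored model's next-token distributions drift from the original's; values near zero mean the ablation left the model's behavior on ordinary prompts essentially unchanged. A minimal pure-Python sketch of the computation over toy distributions (not actual model outputs):

```python
import math

def kl_divergence(p: list[float], q: list[float]) -> float:
    """KL(P || Q) in nats between two discrete probability distributions."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

# Toy next-token distributions that are nearly identical, so the KL
# divergence is small, analogous to the low 0.0200 reported here.
p = [0.70, 0.20, 0.10]  # original model
q = [0.68, 0.22, 0.10]  # decensored model

print(round(kl_divergence(p, q), 4))  # ≈ 0.0012
```

In a real evaluation this would be averaged over the token distributions produced by both models on a shared prompt set.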
Good For
- Use cases requiring reduced content restrictions and more direct responses.
- Code generation and function calling applications.
- Complex Q&A and report generation tasks leveraging search results.
- Scenarios where preserving the original model's quality while enhancing uncensored behavior is critical.