Name: llmfan46/Gemma-4-Gembrain-31B-it-uncensored-heretic API
Brand: Featherless.ai
Price: 25.00 USD
Availability: InStock
Author: llmfan46

Model Overview

llmfan46/Gemma-4-Gembrain-31B-it-uncensored-heretic is a 31 billion parameter instruction-tuned model derived from Nimbz/Gemma-4-Gembrain-31B. It has been decensored using the Heretic v1.2.0 tool with the Arbitrary-Rank Ablation (ARA) method, specifically targeting the attn.o_proj components.

Key Differentiators

Significantly Reduced Refusals: Achieves 87% fewer refusals (13/100) compared to the original model (99/100), providing a less restricted generation experience.
Preserved Quality: Maintains high model quality with a low KL divergence of 0.0186, indicating minimal deviation from the original model's performance.
Enhanced Creativity: Designed to produce "unhinged narratives" and construct image prompts with high precision and creative "swipe variance."
Unique Prose: Generates non-robotic and unique prose, along with sharper instruction adherence.

Performance

While significantly reducing refusals, the model's MMLU (Massive Multitask Language Understanding) accuracy remains very close to the original, with 85.90% compared to the original's 86.65%. This indicates that the decensoring process did not substantially degrade its general knowledge and reasoning capabilities.

Use Cases

This model is particularly well-suited for applications requiring:

Unrestricted Content Generation: Ideal for creative writing, storytelling, and role-playing scenarios where content filtering is undesirable.
Image Prompt Generation: Excels at creating detailed and structured image prompts.
Creative Exploration: Useful for generating diverse and imaginative text outputs with a distinct, non-robotic style.

Technical Details

The model was systematically created through a five-stage merging process, combining various models including Gemsicle-31B, Gemopus X MeroMero, and GarnetV2 X Musica-v1, using methods like breadcrumbs_ties, slerp, della_linear, model_stock, and arcee_fusion. It supports a context length of 32768 tokens. GGUF quantizations are available for various sizes.

Overview

Model Overview

Key Differentiators

Performance

Use Cases

Technical Details

Full Model Card (README)