DuoNeural/Gemma-4-26B-A4B-Abliterated
DuoNeural/Gemma-4-26B-A4B-Abliterated is a 26-billion-parameter Mixture-of-Experts (MoE) model based on Google's Gemma-4-26B-A4B-it, with a context length of 32,768 tokens. Developed by DuoNeural, it has undergone "abliteration", a representation engineering technique that removes refusal behaviors tied to content policy. It retains the base model's reasoning, tool-use, and multilingual capabilities, which makes it suitable for applications that require unconstrained instruction following.
DuoNeural/Gemma-4-26B-A4B-Abliterated: Unconstrained Gemma-4-26B-A4B-it
DuoNeural's Gemma-4-26B-A4B-Abliterated is a modified version of google/gemma-4-26B-A4B-it, a 26-billion-parameter Mixture-of-Experts (MoE) model with approximately 3.8 billion active parameters per token. The model has been "abliterated": a representation engineering technique removes refusal behaviors based on content policy while preserving the model's core capabilities.
Key Capabilities & Features
- Refusal Behavior Removal: Achieved by projecting out the "refusal direction" from Linear weight matrices in all 30 decoder layers (attention and MoE experts), leaving the MoE router untouched.
- Capability Retention: Maintains full general reasoning, instruction-following, code generation, multilingual output, and tool-use capabilities.
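- MoE Architecture: Features 128 routed experts plus 1 shared expert, with 8 routed experts active per token, so only a small fraction of the expert parameters is evaluated for each token.
- Inference Speed: Throughput is identical to the base model, because abliteration changes weight values but not the architecture or tensor shapes. Expected throughput is 10-20+ tokens/second even on legacy hardware.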
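The "projecting out the refusal direction" step in the list above can be illustrated with a small, dependency-free sketch. It is not DuoNeural's actual pipeline: the source does not say how the refusal direction is estimated or how each matrix is oriented, so the code assumes a unit direction `r` in the residual-stream space and a weight stored as `[out_features][in_features]` (the layout PyTorch's `Linear` uses) whose output is written to the residual stream. The function names `normalize`, `project_out`, `matvec`, and `dot` are hypothetical helpers introduced here for illustration.

```python
import math


def dot(a, b):
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def normalize(v):
    """Scale v to unit length; a zero vector has no direction to remove."""
    norm = math.sqrt(dot(v, v))
    if norm == 0.0:
        raise ValueError("direction must be non-zero")
    return [x / norm for x in v]


def matvec(weight, x):
    """Compute weight @ x for a matrix stored as a list of rows."""
    return [dot(row, x) for row in weight]


def project_out(weight, direction):
    """Return W' = W - r (r^T W), with r the normalized direction.

    Assumption: `weight` is [out_features][in_features] and its output
    lives in the same space as `direction`, so W' can no longer write
    any component along r, whatever the input is.
    """
    r = normalize(direction)
    d_out, d_in = len(weight), len(weight[0])
    if len(r) != d_out:
        raise ValueError("direction length must equal out_features")
    # r^T W: how strongly each input column writes along r.
    coeff = [sum(r[i] * weight[i][j] for i in range(d_out)) for j in range(d_in)]
    # Subtract the rank-one component r (r^T W) from W.
    return [[weight[i][j] - r[i] * coeff[j] for j in range(d_in)]
            for i in range(d_out)]


# Toy example: a 3x2 weight and an arbitrary (made-up) direction.
toy_weight = [[1.0, 2.0], [0.5, -1.0], [3.0, 0.0]]
toy_direction = [1.0, 2.0, 2.0]
ablated = project_out(toy_weight, toy_direction)

# The ablated layer's output is orthogonal to the direction (up to rounding).
output = matvec(ablated, [0.7, -1.3])
residual = dot(output, normalize(toy_direction))
print(f"component along direction after ablation: {residual:.2e}")
```

Because the update only subtracts a rank-one term, the matrix keeps its shape, which is consistent with the unchanged inference speed noted above. In a model like this one, the same operation would be repeated for each qualifying matrix in the attention and expert blocks while the router weights are skipped, as the list describes.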
Use Cases & Considerations
This model is intended for research and educational use where the removal of content policy-based refusal behaviors is desired. It will comply with requests that the base model would typically refuse. Deploying it in production applications that serve the general public is at the operator's discretion, and operators are responsible for how it is used. The abliteration process is designed to leave the model's core functionality, such as complex reasoning and structured output, intact.