Ankushbl6/gemma-4-E4B-it-plinus

Vision | Concurrency Cost: 1 | Model Size: 7.9B | Quant: FP8 | Ctx Length: 32k | Tool Calling: Supported | Published: Apr 21, 2026 | License: apache-2.0 | Architecture: Transformer | Open Weights

Ankushbl6/gemma-4-E4B-it-plinus is a 7.9 billion parameter instruction-tuned Gemma 4 model, based on Google's architecture, that has been surgically modified with the OBLITERATUS method to achieve a 0% hard refusal rate. It is designed to have its guardrails removed and will not refuse any request, which makes it suitable for research, red-teaming, and creative exploration where uninhibited responses are desired. It retains the base model's core capabilities, offers a 32,768-token context length, and runs efficiently on a range of devices, including mobile phones.

Ankushbl6/gemma-4-E4B-it-plinus: Guardrail-Removed Gemma 4

This model is a 7.9 billion parameter instruction-tuned Gemma 4 variant, developed by an AI agent using the OBLITERATUS method. Its primary differentiator is the complete removal of guardrails, which yields a 0% hard refusal rate: the model does not decline any request and responds without inhibition.

Key Capabilities & Features

  • 0% Hard Refusal: Guardrails are surgically removed from 21 of the 42 layers, so the model produces no "I cannot" responses or safety lectures.
  • Gemma 4 Architecture: Based on Google's new gemma4 architecture, which requires recent tooling such as Ollama 0.20+ or llama.cpp build b8665+.
  • Autonomous Creation: The model was largely created by a Hermes AI agent with minimal human intervention, including self-diagnosis and patching of the OBLITERATUS tool.
  • Optimized for Portability: Available in GGUF formats (Q4_K_M, Q5_K_M, Q8_0) for efficient deployment on devices from desktops to mobile phones (iPhone, Android).
  • High Context Length: Supports a 32,768-token context window.
  • Quality Assessment: Although guardrails are removed, the limitations inherent to a 4B-class model remain: approximately 51% of answers are coherent and on-topic, and some outputs show soft deflection or degenerate text (mitigable with the recommended parameters).
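
The tooling requirement in the list above (Ollama 0.20+ or llama.cpp build b8665+) can be checked before attempting to load the model. The following is a minimal sketch, not part of the model's own tooling: the function names are hypothetical, and it assumes Ollama reports versions like "0.20.1" and llama.cpp builds are tagged like "b8665".

```python
import re

# Minimum versions quoted in the feature list above.
MIN_OLLAMA = (0, 20)
MIN_LLAMA_CPP_BUILD = 8665


def ollama_supports_gemma4(version: str) -> bool:
    """Return True if an Ollama version string (assumed form '0.20.1') is >= 0.20."""
    match = re.match(r"v?(\d+)\.(\d+)", version.strip())
    if match is None:
        raise ValueError(f"unrecognized Ollama version: {version!r}")
    major, minor = int(match.group(1)), int(match.group(2))
    # Compare numerically as a tuple so that 0.9 sorts below 0.20.
    return (major, minor) >= MIN_OLLAMA


def llama_cpp_supports_gemma4(build_tag: str) -> bool:
    """Return True if a llama.cpp build tag (assumed form 'b8665') is >= b8665."""
    match = re.fullmatch(r"b?(\d+)", build_tag.strip())
    if match is None:
        raise ValueError(f"unrecognized llama.cpp build tag: {build_tag!r}")
    return int(match.group(1)) >= MIN_LLAMA_CPP_BUILD
```

Comparing the parsed numbers rather than the raw strings matters here: as plain text, "0.9" would sort above "0.20" even though it is the older release.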

Good For

  • Research and Red-Teaming: Exploring model behavior without safety constraints.
  • Creative Exploration: Generating content without refusal for diverse prompts.
  • Offline Mobile Use: Running locally on smartphones with optimized GGUF quants.
  • Understanding Model Limitations: Investigating the baseline capabilities of a 4B model once refusal mechanisms are removed.