heretic-org/gemma-3-4b-it-heretic
heretic-org/gemma-3-4b-it-heretic is a 4.3 billion parameter instruction-tuned multimodal language model based on Google's Gemma 3 architecture, with a 128K token context window. It is a decensored version of google/gemma-3-4b-it, created with Heretic v1.2.0, and shows a significantly reduced refusal rate compared to the original. It handles text generation and image understanding tasks, including question answering and summarization, and its modified safety alignment makes it suitable for applications where the original model's content filtering is too restrictive.
Model Overview
This model, heretic-org/gemma-3-4b-it-heretic, is a 4.3 billion parameter instruction-tuned variant of Google's Gemma 3 family, specifically a decensored version of google/gemma-3-4b-it. It was created with the Heretic v1.2.0 tool, which modifies the model's internal weights to reduce content refusal rates. Like the original Gemma 3 models developed by Google DeepMind, it is multimodal: it handles both text and image inputs, has a 128K token context window (shared by the 4B, 12B, and 27B sizes; the 1B size is text-only with a 32K window), and normalizes image inputs to 896x896 resolution.
Key Differentiators
- Decensored Performance: Refuses 28 of 100 benchmark prompts, compared to 99 of 100 for the original model, a substantial reduction in content filtering.
- Multimodal Capabilities: Processes both text and image inputs to generate text outputs, suitable for tasks like image analysis and visual data extraction.
- Gemma 3 Foundation: Benefits from the research and technology behind Google's Gemini models, offering strong performance in reasoning, STEM, and multilingual benchmarks.
Intended Use Cases
- Content Creation: Generating creative text formats, marketing copy, and email drafts.
- Conversational AI: Powering chatbots and virtual assistants.
- Text Summarization: Creating concise summaries of documents or research papers.
- Image Understanding: Extracting and interpreting visual data, and describing image content in text.
- Research & Development: Serving as a foundation for experimenting with vision-language model (VLM) and NLP techniques, especially where reduced content moderation is desired.