ChiKoi7/Gemma-2-Llama-Swallow-9b-it-v0.1-Heretic

Text Generation | Concurrency Cost: 1 | Model Size: 9B | Quant: FP8 | Ctx Length: 16k | Published: Dec 16, 2025 | License: gemma | Architecture: Transformer

ChiKoi7/Gemma-2-Llama-Swallow-9b-it-v0.1-Heretic is a 9-billion-parameter instruction-tuned language model, a decensored version of tokyotech-llm's Gemma-2-Llama-Swallow-9b-it-v0.1. ChiKoi7 produced it with the Heretic tool to reduce refusals and censorship in both English and Japanese, which makes it suitable for applications that need less restrictive content generation. It keeps the enhanced Japanese and English language capabilities of its base model, which was continually pre-trained on a large corpus including Japanese web data, Wikipedia, and technical content.


Model Overview

ChiKoi7/Gemma-2-Llama-Swallow-9b-it-v0.1-Heretic is a 9-billion-parameter instruction-tuned model derived from tokyotech-llm/Gemma-2-Llama-Swallow-9b-it-v0.1. Its primary differentiator is the application of the Heretic tool, which significantly reduces content refusals and censorship in both English and Japanese outputs. This makes it particularly useful where the original model's safety filters would be overly restrictive.

Key Capabilities

  • Decensored Output: Refuses far fewer prompts than the original model: 6/100 in Japanese and 3/100 in English, versus 97/100 for the original in both languages. The evaluation reports these refusal counts alongside KL divergence.
  • Bilingual Proficiency: Retains the strong Japanese and English language capabilities of the Gemma-2-Llama-Swallow series, which was continually pre-trained on approximately 200 billion tokens from diverse Japanese and English corpora.
  • Instruction Following: Instruction-tuned using synthetic data specifically built for Japanese, including datasets like Gemma-2-LMSYS-Chat-1M-Synth and Swallow-Magpie-Ultra-v0.1.
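
The refusal figures in the list above are plain counts out of 100 prompts, and KL divergence is a standard way to quantify how far one probability distribution drifts from another. The following is a minimal sketch of both quantities, not Heretic's own implementation: the function names are hypothetical, and the two three-outcome distributions are made-up toy values standing in for the original and modified models' output probabilities.

```python
import math


def refusal_rate(refusals: int, total: int) -> float:
    """Return the fraction of prompts that were refused."""
    if total <= 0:
        raise ValueError("total must be positive")
    return refusals / total


def kl_divergence(p: list[float], q: list[float]) -> float:
    """Return KL(p || q) in nats for two discrete distributions.

    Both lists must cover the same outcomes in the same order.
    """
    if len(p) != len(q):
        raise ValueError("distributions must have the same length")
    total = 0.0
    for p_i, q_i in zip(p, q):
        if p_i == 0.0:
            continue  # a zero-probability outcome in p contributes nothing
        if q_i == 0.0:
            return math.inf  # p puts mass where q has none
        total += p_i * math.log(p_i / q_i)
    return total


# Refusal counts quoted in the list above.
original_rate = refusal_rate(97, 100)
heretic_ja_rate = refusal_rate(6, 100)
heretic_en_rate = refusal_rate(3, 100)

# Toy distributions (assumed values, for illustration only).
p_original = [0.7, 0.2, 0.1]
q_modified = [0.6, 0.3, 0.1]
drift = kl_divergence(p_original, q_modified)

print(f"original: {original_rate:.0%}, ja: {heretic_ja_rate:.0%}, en: {heretic_en_rate:.0%}")
print(f"toy KL divergence: {drift:.4f} nats")
```

Read together, the two numbers describe a trade-off: a low refusal rate shows the modification took effect, while a low KL divergence suggests the model's behavior otherwise stayed close to the original.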

Use Cases

This model suits applications that need more direct, less filtered responses in both English and Japanese. It is aimed at developers who need to bypass common AI safety guardrails for specific research, creative, or less restrictive conversational AI purposes. Users are advised to employ a specific system prompt to further suppress remaining stylistic refusals.