DavidAU/Gemma-3-1B-it-GLM-4.7-Flash-Heretic-Uncensored-Thinking

1B parameters · BF16 · 32,768-token context · License: apache-2.0 · Feb 1, 2026

Model Overview

This model, developed by DavidAU, is a 1-billion-parameter, Gemma-based, instruction-tuned variant fine-tuned for deep reasoning and uncensored content generation. It was trained with Unsloth on the GLM 4.7 reasoning dataset. Its key differentiator is "Heretic" de-censoring, which sharply reduces refusals: 3 refusals out of 100 requests versus 99/100 for the base Gemma model.

Key Capabilities & Features

  • Uncensored Output: Designed to generate content without refusal, including explicit or sensitive topics, though explicit directives may be needed to reach highly graphic levels of detail.
  • Deep Reasoning: Enhanced reasoning that remains stable across a wide temperature range (0.1 to 2.5). Deeper thinking can optionally be activated by prefixing the prompt with "think deeply:".
  • Extended Context: Supports a substantial 32,768 token context window.
  • Optimized Settings: Recommended quantization levels are q5, q6, q8, or 16-bit precision (IQ3_M at minimum), with a repetition penalty of 1.05 to 1.1 for optimal performance.
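The settings above can be sketched in code. This is a minimal, hedged example using the Hugging Face transformers library; the chat-template usage is an assumption about how this Gemma-based model is typically run, and the temperature and repetition-penalty values are taken from the card's recommendations.

```python
# Sketch of the card's recommended settings, assuming standard
# transformers usage for a Gemma-based chat model.

def make_prompt(user_text: str, deep: bool = False) -> str:
    """Optionally activate deeper reasoning via the 'think deeply:' prefix."""
    return f"think deeply: {user_text}" if deep else user_text

# Generation settings from the card: temperature is stable anywhere
# in 0.1-2.5; repetition penalty of 1.05-1.1 is recommended.
GEN_KWARGS = dict(
    max_new_tokens=512,
    temperature=0.8,          # any value in 0.1-2.5 per the card
    repetition_penalty=1.08,  # card recommends 1.05-1.1
    do_sample=True,
)

if __name__ == "__main__":
    # Requires `pip install transformers torch`; downloads the BF16 weights.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    model_id = "DavidAU/Gemma-3-1B-it-GLM-4.7-Flash-Heretic-Uncensored-Thinking"
    tok = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype="bfloat16")

    msgs = [{"role": "user", "content": make_prompt("Explain entropy.", deep=True)}]
    inputs = tok.apply_chat_template(msgs, add_generation_prompt=True,
                                     return_tensors="pt")
    out = model.generate(inputs, **GEN_KWARGS)
    print(tok.decode(out[0][inputs.shape[-1]:], skip_special_tokens=True))
```

Omit `deep=True` for ordinary requests; the model reasons without the prefix, just less extensively.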

Benchmarks

Performance benchmarks include:

  • ARC Challenge: 0.344
  • ARC Easy: 0.512
  • Hellaswag: 0.504
  • OpenbookQA: 0.358
  • PIQA: 0.720
  • Winogrande: 0.552

Usage Recommendations

This model is ideal for use cases requiring direct, unconstrained, and detailed responses. It is particularly suited for creative writing, roleplay, or any application where content filtering is undesirable. For smoother operation and to prevent looping, especially with lower-quality quants, adjusting the repetition penalty or using "smoothing_factor" in interfaces like KoboldCpp or oobabooga is recommended.
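For GGUF quants, the anti-looping advice above translates into sampler settings. This sketch uses llama-cpp-python; the parameter names follow that library, the model filename is a placeholder, and the KoboldCpp/oobabooga "smoothing_factor" is mentioned only in a comment since its exact exposure varies by front end.

```python
# Hedged sketch: anti-looping sampler settings for GGUF runtimes,
# using llama-cpp-python parameter names. KoboldCpp exposes an
# additional "smoothing_factor" in its own sampler settings.
SAMPLER = {
    "temperature": 0.8,     # stable anywhere in 0.1-2.5 per the card
    "repeat_penalty": 1.1,  # use the high end of 1.05-1.1 for low-bit quants
    "max_tokens": 512,
}

if __name__ == "__main__":
    # Requires `pip install llama-cpp-python` and a local GGUF file
    # (the path below is a placeholder, not a published filename).
    from llama_cpp import Llama

    llm = Llama(model_path="model.Q6_K.gguf", n_ctx=32768)
    out = llm.create_completion("think deeply: Explain entropy.", **SAMPLER)
    print(out["choices"][0]["text"])
```

If output still loops on a low-bit quant, raising `repeat_penalty` slightly past 1.1 or enabling smoothing in the front end are the usual next steps.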