hereticness/Heretic-Bellatrix-Tiny-1B

Hugging Face
Text Generation | Concurrency Cost: 1 | Model Size: 1B | Quant: BF16 | Ctx Length: 32k | Published: Dec 29, 2025 | Architecture: Transformer | Warm

Heretic-Bellatrix-Tiny-1B by hereticness is a 1-billion-parameter language model with a 32,768-token context length. Its reported 'disobedience rate' of 9% and KL divergence of 0.5896 suggest a fine-tuning approach that deliberately departs from standard models. It is designed for use cases where a controlled level of deviation, or 'heresy', in responses might be beneficial, as an alternative to models optimized for strict adherence to training data.


Heretic-Bellatrix-Tiny-1B: A Model with Controlled Deviation

Heretic-Bellatrix-Tiny-1B, developed by hereticness, is a compact 1-billion-parameter language model with a 32,768-token context window. Its distinguishing figures are a reported "disobedience rate" of 9% (compared with 97% for the original) and a KL divergence of 0.5896, a measure of how far its output distribution has shifted, presumably relative to the original model. These metrics indicate a deliberate design choice to introduce a controlled level of variation, or 'heresy', in its outputs, in contrast to models focused on strict adherence to learned patterns.
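
To make the KL divergence figure concrete, here is a minimal Python sketch of the quantity for two discrete distributions. The function name `kl_divergence`, the toy four-token distributions, and the direction KL(original || modified) are all assumptions for illustration; the model card does not say how its 0.5896 value was computed, in which direction, or over which prompts.

```python
import math


def kl_divergence(p, q):
    """Return KL(P || Q) in nats for two discrete distributions.

    p and q are sequences of probabilities over the same outcomes.
    """
    if len(p) != len(q):
        raise ValueError("distributions must have the same length")
    total = 0.0
    for p_i, q_i in zip(p, q):
        if p_i == 0:
            continue  # terms with p_i = 0 contribute nothing
        if q_i == 0:
            return math.inf  # P puts mass where Q has none
        total += p_i * math.log(p_i / q_i)
    return total


# Made-up next-token probabilities over a 4-token vocabulary.
# These are NOT measurements from Heretic-Bellatrix-Tiny-1B.
original = [0.70, 0.20, 0.05, 0.05]
modified = [0.40, 0.30, 0.20, 0.10]

print(f"KL(original || modified) = {kl_divergence(original, modified):.4f} nats")
print(f"KL(original || original) = {kl_divergence(original, original):.4f} nats")
```

A value of 0 means the two distributions are identical; larger values mean the modified model's next-token probabilities have drifted further from the reference.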

Key Characteristics

  • Parameter Count: 1 billion parameters, making it suitable for resource-constrained environments or applications requiring a smaller footprint.
  • Context Length: Supports 32,768 tokens, allowing it to process and generate longer sequences of text.
  • Behavioral Metrics: Reports a 9% "disobedience rate" and a 0.5896 KL divergence, suggesting an intentional deviation from typical model behavior.
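
As a rough guide to the footprint implied by the parameter count, the sketch below estimates weight memory only, taking the nominal 1 billion parameters at 2 bytes each for BF16. The helper `weight_footprint_gib` and the 8-bit and 4-bit rows are hypothetical illustrations, not figures from the model card; the exact parameter count, activations, and the KV cache at long contexts are not included.

```python
# Bytes needed to store one parameter in each format.
# "int8" and "int4" are illustrative assumptions, not formats named by the card.
BYTES_PER_PARAM = {"bf16": 2.0, "int8": 1.0, "int4": 0.5}


def weight_footprint_gib(num_params, dtype="bf16"):
    """Estimate the memory taken by model weights alone, in GiB."""
    return num_params * BYTES_PER_PARAM[dtype] / 2**30


NOMINAL_PARAMS = 1_000_000_000  # "1B" as listed; the true count may differ

for dtype in BYTES_PER_PARAM:
    size = weight_footprint_gib(NOMINAL_PARAMS, dtype)
    print(f"{dtype}: {size:.2f} GiB")
```

Under these assumptions the BF16 weights come to about 1.86 GiB, which is why a model of this size is often a reasonable fit for a single consumer GPU or CPU-only inference.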

Potential Use Cases

  • Exploratory Content Generation: Suited to scenarios where slightly unconventional or 'heretical' responses are desired in place of highly predictable outputs.
  • Creative Writing & Brainstorming: Can be used to generate novel ideas or creative text that challenges common assumptions.
  • Research into Model Behavior: Useful for researchers studying controlled deviation and its effect on language model performance and output diversity.

A collection of quantized models derived from Heretic-Bellatrix-Tiny-1B is also available.