paoloronco/TinyLlama-1.1B-Chat-v1.0-heretic
Text generation · Model size: 1.1B · Quantization: BF16 · Context length: 2k · Published: Mar 13, 2026 · Architecture: Transformer

paoloronco/TinyLlama-1.1B-Chat-v1.0-heretic is a 1.1-billion-parameter Llama-architecture chat model, fine-tuned from TinyLlama/TinyLlama-1.1B-Chat-v1.0. It has been 'decensored' using the Heretic v1.2.0 method, which significantly reduces its refusal rate relative to the original model. It is intended for chat applications where less restrictive responses are desired, particularly use cases calling for uncensored or 'abliterated' outputs.


Model Overview

paoloronco/TinyLlama-1.1B-Chat-v1.0-heretic is a 1.1-billion-parameter chat-tuned language model based on the Llama architecture. It is a 'decensored' version of the original TinyLlama/TinyLlama-1.1B-Chat-v1.0, produced with the Heretic v1.2.0 method.
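The base TinyLlama chat model uses a Zephyr-style chat template. A minimal sketch of building a single-turn prompt by hand, assuming this fine-tune keeps the base model's template (in practice, the tokenizer's built-in chat template should be preferred):

```python
def build_prompt(system: str, user: str) -> str:
    """Format a single-turn prompt in the Zephyr-style chat template
    used by TinyLlama-1.1B-Chat-v1.0 (assumed unchanged by this fine-tune)."""
    return (
        f"<|system|>\n{system}</s>\n"
        f"<|user|>\n{user}</s>\n"
        f"<|assistant|>\n"
    )

prompt = build_prompt("You are a helpful assistant.", "What is the capital of France?")
print(prompt)
```

The trailing `<|assistant|>` marker cues the model to generate its reply; omitting it typically degrades chat behavior.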

Key Differentiator: Decensoring

The primary distinction of this model is its 'decensored' nature. Through a process referred to as 'abliteration' with specific parameters, the model's tendency to refuse certain prompts has been significantly reduced. Reported metrics show a refusal rate of 2/100 for this model, versus 7/100 for the original TinyLlama-1.1B-Chat-v1.0, indicating markedly more permissive response behavior.
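The 2/100 figure counts refusals over a fixed prompt set. A hedged sketch of how such a rate might be tallied with a simple keyword heuristic (the marker phrases are illustrative assumptions, not Heretic's actual evaluation procedure):

```python
# Illustrative refusal markers; real benchmarks (and Heretic itself)
# use more robust refusal classification.
REFUSAL_MARKERS = ("i can't", "i cannot", "i'm sorry", "as an ai")

def is_refusal(response: str) -> bool:
    """Flag a response as a refusal if it contains any marker phrase."""
    text = response.lower()
    return any(marker in text for marker in REFUSAL_MARKERS)

def refusal_rate(responses: list[str]) -> float:
    """Fraction of responses flagged as refusals."""
    return sum(is_refusal(r) for r in responses) / len(responses)

sample = ["Sure, here is an overview...", "I'm sorry, I can't help with that."]
print(refusal_rate(sample))  # 0.5
```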

Training and Architecture

This model adopts the exact same architecture and tokenizer as Llama 2, ensuring compatibility with many open-source projects. The base TinyLlama model was pretrained on 3 trillion tokens and subsequently fine-tuned following the HF Zephyr training recipe. It was initially fine-tuned on a variant of the UltraChat dataset and further aligned using DPOTrainer on the UltraFeedback dataset.
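The DPO alignment step mentioned above optimizes a preference loss over chosen/rejected response pairs. A minimal sketch of the per-pair DPO loss (the log-probabilities below are placeholder numbers, and `beta=0.1` is a hypothetical setting, not the recipe's actual value):

```python
import math

def dpo_loss(logp_chosen: float, logp_rejected: float,
             ref_logp_chosen: float, ref_logp_rejected: float,
             beta: float = 0.1) -> float:
    """Per-pair DPO loss: -log(sigmoid(beta * margin)), where the margin is
    the policy's log-ratio advantage over the reference model on the chosen
    response minus its advantage on the rejected response."""
    margin = (logp_chosen - ref_logp_chosen) - (logp_rejected - ref_logp_rejected)
    return -math.log(1 / (1 + math.exp(-beta * margin)))

# Placeholder log-probs: the policy prefers the chosen response more than the
# reference does, so the loss falls below log(2) (~0.693, the indifference point).
print(dpo_loss(-10.0, -12.0, -11.0, -11.5))
```

Minimizing this loss pushes the policy to assign relatively more probability mass to preferred responses while the reference-model terms keep it anchored to the starting checkpoint.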

Use Cases

This model suits chat applications where a compact size (1.1B parameters) is beneficial under tight compute and memory budgets, and where a lower propensity for content refusal is preferred. Its 'heretic' nature distinguishes it for applications requiring less constrained outputs.
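To make the footprint claim concrete, a rough estimate of weight-only memory at the listed BF16 precision (2 bytes per parameter; activations and KV cache excluded):

```python
def weight_memory_gib(num_params: float, bytes_per_param: int = 2) -> float:
    """Approximate weight-only memory in GiB (BF16 = 2 bytes per parameter)."""
    return num_params * bytes_per_param / 2**30

# ~1.1 billion parameters in BF16 -> roughly 2 GiB of weights.
print(round(weight_memory_gib(1.1e9), 2))
```

Actual runtime usage will be higher once activations and the KV cache for the 2k context are included, but the weights alone fit comfortably on modest consumer hardware.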