paoloronco/TinyLlama-1.1B-Chat-v1.0-heretic
Text generation · Model size: 1.1B · Quantization: BF16 · Context length: 2k · Published: Mar 13, 2026 · Architecture: Transformer

paoloronco/TinyLlama-1.1B-Chat-v1.0-heretic is a 1.1-billion-parameter Llama-architecture chat model, fine-tuned from TinyLlama/TinyLlama-1.1B-Chat-v1.0. It has been 'decensored' using the Heretic v1.2.0 method, which significantly reduces its refusal rate relative to the original model. It is intended for chat applications where less restrictive responses are desired, particularly use cases calling for uncensored or 'abliterated' outputs.


Model Overview

paoloronco/TinyLlama-1.1B-Chat-v1.0-heretic is a 1.1-billion-parameter chat-tuned language model based on the Llama architecture. It is a 'decensored' version of the original TinyLlama/TinyLlama-1.1B-Chat-v1.0, produced with the Heretic v1.2.0 method.
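The base TinyLlama chat model uses a Zephyr-style chat template. A minimal sketch of building a single-turn prompt by hand, assuming this fine-tune keeps the base model's template (in practice, the tokenizer's built-in chat template should be preferred):

```python
def build_prompt(system: str, user: str) -> str:
    """Format a single-turn prompt in the Zephyr-style chat template
    used by TinyLlama-1.1B-Chat-v1.0 (assumed unchanged by this fine-tune)."""
    return (
        f"<|system|>\n{system}</s>\n"
        f"<|user|>\n{user}</s>\n"
        f"<|assistant|>\n"
    )

prompt = build_prompt("You are a helpful assistant.", "What is the capital of France?")
print(prompt)
```

The trailing `<|assistant|>` marker cues the model to generate its reply; omitting it typically degrades chat behavior.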

Key Differentiator: Decensoring

The primary distinction of this model is its 'decensored' nature. Through a process referred to as 'abliteration' with specific parameters, the model's tendency to refuse certain prompts has been significantly reduced. Reported metrics show a refusal rate of 2/100 for this model, versus 7/100 for the original TinyLlama-1.1B-Chat-v1.0, indicating markedly more permissive response behavior.
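The 2/100 figure counts refusals over a fixed prompt set. A hedged sketch of how such a rate might be tallied with a simple keyword heuristic (the marker phrases are illustrative assumptions, not Heretic's actual evaluation procedure):

```python
# Illustrative refusal markers; real benchmarks (and Heretic itself)
# use more robust refusal classification.
REFUSAL_MARKERS = ("i can't", "i cannot", "i'm sorry", "as an ai")

def is_refusal(response: str) -> bool:
    """Flag a response as a refusal if it contains any marker phrase."""
    text = response.lower()
    return any(marker in text for marker in REFUSAL_MARKERS)

def refusal_rate(responses: list[str]) -> float:
    """Fraction of responses flagged as refusals."""
    return sum(is_refusal(r) for r in responses) / len(responses)

sample = ["Sure, here is an overview...", "I'm sorry, I can't help with that."]
print(refusal_rate(sample))  # 0.5
```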

Training and Architecture

This model adopts the exact same architecture and tokenizer as Llama 2, ensuring compatibility with many open-source projects. The base TinyLlama model was pretrained on 3 trillion tokens and subsequently fine-tuned following the HF Zephyr training recipe. It was initially fine-tuned on a variant of the UltraChat dataset and further aligned using DPOTrainer on the UltraFeedback dataset.
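The DPO alignment step mentioned above optimizes a preference loss over chosen/rejected response pairs. A minimal sketch of the per-pair DPO loss (the log-probabilities below are placeholder numbers, and `beta=0.1` is a hypothetical setting, not the recipe's actual value):

```python
import math

def dpo_loss(logp_chosen: float, logp_rejected: float,
             ref_logp_chosen: float, ref_logp_rejected: float,
             beta: float = 0.1) -> float:
    """Per-pair DPO loss: -log(sigmoid(beta * margin)), where the margin is
    the policy's log-ratio advantage over the reference model on the chosen
    response minus its advantage on the rejected response."""
    margin = (logp_chosen - ref_logp_chosen) - (logp_rejected - ref_logp_rejected)
    return -math.log(1 / (1 + math.exp(-beta * margin)))

# Placeholder log-probs: the policy prefers the chosen response more than the
# reference does, so the loss falls below log(2) (~0.693, the indifference point).
print(dpo_loss(-10.0, -12.0, -11.0, -11.5))
```

Minimizing this loss pushes the policy to assign relatively more probability mass to preferred responses while the reference-model terms keep it anchored to the starting checkpoint.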

Use Cases

This model suits chat applications where a compact size (1.1B parameters) is beneficial under tight compute and memory budgets, and where a lower propensity for content refusal is preferred. Its 'heretic' nature distinguishes it for applications requiring less constrained outputs.
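To make the footprint claim concrete, a rough estimate of weight-only memory at the listed BF16 precision (2 bytes per parameter; activations and KV cache excluded):

```python
def weight_memory_gib(num_params: float, bytes_per_param: int = 2) -> float:
    """Approximate weight-only memory in GiB (BF16 = 2 bytes per parameter)."""
    return num_params * bytes_per_param / 2**30

# ~1.1 billion parameters in BF16 -> roughly 2 GiB of weights.
print(round(weight_memory_gib(1.1e9), 2))
```

Actual runtime usage will be higher once activations and the KV cache for the 2k context are included, but the weights alone fit comfortably on modest consumer hardware.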