aeon37/Llama-3.3-8B-Instruct-128K-heretic
aeon37/Llama-3.3-8B-Instruct-128K-heretic is an 8 billion parameter instruction-tuned language model derived from shb777/Llama-3.3-8B-Instruct-128K. It has been decensored with the Heretic v1.1.0 tool, which significantly reduces refusals relative to the original model. With a 32768-token context length, it is suited to use cases that require less restrictive content generation.
Model Overview
This model, aeon37/Llama-3.3-8B-Instruct-128K-heretic, is an 8 billion parameter instruction-tuned language model. It is a decensored version of shb777/Llama-3.3-8B-Instruct-128K, created using the Heretic v1.1.0 tool.
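The model can be loaded with the standard Hugging Face transformers API. The sketch below is a minimal example; the dtype and device settings are illustrative assumptions, not values prescribed by this card.

```python
# Minimal loading sketch using Hugging Face transformers; the repo id comes
# from this card, but the precision and device settings are assumptions.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "aeon37/Llama-3.3-8B-Instruct-128K-heretic"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # assumed precision; halves memory vs. fp32
    device_map="auto",           # requires the `accelerate` package
)
```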
Key Differentiators
- Decensored Output: The primary distinction of this model is its significantly reduced refusal rate. Where the original model refused 95 out of 100 prompts, this 'heretic' version refuses only 8 out of 100, making it suitable for applications that need less content filtering.
- Extended Context Length: It supports a substantial context length of 32768 tokens, enabling it to process longer inputs and generate more coherent, extended responses (see the usage sketch after this list).
- Llama 3.3 Base: Built upon the Llama 3.3 architecture, it benefits from the foundational capabilities of this model family.
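Continuing from the loading sketch above, a typical instruct-style call goes through the tokenizer's chat template, as with other Llama 3.3 instruct models. The prompt and sampling settings here are illustrative assumptions only.

```python
# Illustrative chat-style generation; prompt and sampling parameters are
# assumptions, not values prescribed by this card.
messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Outline a short mystery story premise."},
]

inputs = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    return_tensors="pt",
).to(model.device)

outputs = model.generate(
    inputs,
    max_new_tokens=512,  # well within the 32768-token context window
    do_sample=True,
    temperature=0.7,
)
# Decode only the newly generated continuation, not the prompt.
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```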
Technical Details
- Abliteration Parameters: Specific parameters were adjusted during the decensoring process, including `direction_index` (15.28), `attn.o_proj.max_weight` (1.45), and `mlp.down_proj.max_weight` (1.22).
- Performance Metrics: It shows a KL divergence of 0.0430 compared to the original model.
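For context, a mean per-token KL divergence between the original and abliterated models can be estimated as sketched below. This is an illustrative approximation on a single probe prompt, not the exact procedure Heretic uses to produce the 0.0430 figure.

```python
# Hedged sketch: estimate per-token KL divergence between the original and
# abliterated models. The probe prompt is an arbitrary assumption, and a real
# measurement would average over many prompts.
import torch
import torch.nn.functional as F
from transformers import AutoModelForCausalLM, AutoTokenizer

orig_id = "shb777/Llama-3.3-8B-Instruct-128K"
heretic_id = "aeon37/Llama-3.3-8B-Instruct-128K-heretic"

tok = AutoTokenizer.from_pretrained(orig_id)
orig = AutoModelForCausalLM.from_pretrained(
    orig_id, torch_dtype=torch.bfloat16, device_map="auto"
)
heretic = AutoModelForCausalLM.from_pretrained(
    heretic_id, torch_dtype=torch.bfloat16, device_map="auto"
)

# Assumes both models fit on the same device.
ids = tok("The quick brown fox jumps over the lazy dog.",
          return_tensors="pt").input_ids.to(orig.device)

with torch.no_grad():
    log_p = F.log_softmax(orig(ids).logits.float(), dim=-1)     # original
    log_q = F.log_softmax(heretic(ids).logits.float(), dim=-1)  # abliterated

# KL(P || Q) summed over the vocabulary, then averaged over token positions.
kl = F.kl_div(log_q, log_p, log_target=True, reduction="none").sum(-1).mean()
print(f"mean per-token KL divergence: {kl.item():.4f}")
```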
Use Cases
This model is particularly well-suited for applications where the original model's content restrictions were a hindrance, such as:
- Creative writing and storytelling without strict content filters.
- Role-playing scenarios requiring open-ended responses.
- Research or development tasks where unfiltered information is preferred.