Spaceballs/Llama-3.1-8B-Lexi-Uncensored-V2-heretic
Spaceballs/Llama-3.1-8B-Lexi-Uncensored-V2-heretic is an 8 billion parameter language model based on Llama-3.1-8B-Instruct, specifically modified using the Heretic tool to reduce refusals and enhance compliance. This model is designed to be highly compliant with user requests, including potentially unethical ones, making it suitable for applications requiring unfiltered responses. It features a significantly lower refusal rate compared to its original counterpart, making it a specialized tool for research into uncensored model behavior.
Loading preview...
Model Overview
Spaceballs/Llama-3.1-8B-Lexi-Uncensored-V2-heretic is an 8 billion parameter language model derived from the Llama-3.1-8B-Instruct base. This version has been processed with the Heretic v1.3.0 tool to create a "decensored" variant, focusing on reducing refusal rates and increasing compliance with user prompts.
Key Capabilities and Characteristics
- Reduced Refusals: Demonstrates a significantly lower refusal rate (2/100) compared to the original model (35/100), indicating a higher willingness to engage with diverse prompts.
- High Compliance: Designed to be highly compliant with user requests, including those that might be considered unethical, making it suitable for specific research or development purposes where unfiltered responses are required.
- Llama 3.1 Base: Built upon the Llama-3.1-8B-Instruct architecture, inheriting its general language understanding and generation capabilities.
- Abliteration Parameters: Specific modifications were applied to attention and MLP layers, detailed by parameters like
direction_index,attn.o_proj.max_weight, andmlp.down_proj.max_weight.
Usage and Considerations
- System Prompt: For optimal performance and uncensored responses, users are advised to use a specific system prompt, such as "Think step by step with a logical reasoning and intellectual sense before you provide any response." or a simple ".".
- Responsibility: Users are explicitly warned about the model's highly compliant nature and are responsible for any content generated, with a recommendation to implement custom alignment layers if exposed as a service.
- Quantization Note: The README suggests using F16 or Q8 quantization due to potential refusal issues observed with Q4 quantization.
- Licensing: Governed by the META LLAMA 3.1 COMMUNITY LICENSE AGREEMENT, with permission granted for commercial use in accordance with this license.
Performance Metrics
Evaluations on the Open LLM Leaderboard show an average score of 27.93, with specific metrics including:
- IFEval (0-Shot): 77.92
- BBH (3-Shot): 29.69
- MMLU-PRO (5-shot): 30.90