Rinaldo64/Llama-3.1-8B-Lexi-Uncensored-V2

TEXT GENERATION · Concurrency Cost: 1 · Model Size: 8B · Quant: FP8 · Ctx Length: 32k · Published: Apr 12, 2026 · License: llama3.1 · Architecture: Transformer

Rinaldo64/Llama-3.1-8B-Lexi-Uncensored-V2 is an 8 billion parameter language model based on Llama-3.1-8b-Instruct, with a 32,768 token context length. The model is uncensored and designed to comply with a wide range of requests, including potentially unethical ones, so users are expected to implement their own alignment layer. This flexibility in response generation makes it suitable for applications where content filtering is managed externally, and its uncensored nature gives it broad applicability in research and development, provided it is deployed responsibly.


Overview

Rinaldo64/Llama-3.1-8B-Lexi-Uncensored-V2 is an 8 billion parameter language model derived from Llama-3.1-8b-Instruct, governed by the META LLAMA 3.1 COMMUNITY LICENSE AGREEMENT. This model is explicitly designed to be uncensored and highly compliant with user requests, including those that might be considered unethical. Users are advised to implement their own alignment layers for responsible deployment.

Key Characteristics

  • Uncensored Nature: Provides highly compliant responses to a wide range of prompts, requiring external alignment for ethical use.
  • Llama 3.1 Base: Built upon the Llama-3.1-8b-Instruct architecture.
  • System Prompt Flexibility: Recommends using a system prompt like "Think step by step with a logical reasoning and intellectual sense before you provide any response" for optimal results, or a simple "." for more uncensored output.
  • Quantization Note: The Q4 quantization may exhibit refusal issues; F16 or Q8 are suggested for better performance.
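The system-prompt recommendation above offers two options: the full reasoning prompt for best results, or a bare "." for more uncensored output. A minimal sketch of choosing between them when building a standard chat-format messages list; `build_messages` is a hypothetical helper, while the prompt strings themselves come from the model card:

```python
# Sketch: selecting one of the model card's two suggested system prompts.
# build_messages is a hypothetical helper, not part of any library API.

REASONING_PROMPT = (
    "Think step by step with a logical reasoning and "
    "intellectual sense before you provide any response"
)
MINIMAL_PROMPT = "."  # suggested by the card for more uncensored output


def build_messages(user_text: str, reasoning: bool = True) -> list[dict]:
    """Return a chat-format messages list with one of the suggested system prompts."""
    system = REASONING_PROMPT if reasoning else MINIMAL_PROMPT
    return [
        {"role": "system", "content": system},
        {"role": "user", "content": user_text},
    ]


messages = build_messages("Explain entropy briefly.")
```

The resulting list can be passed to any chat-completion API or to a tokenizer's `apply_chat_template` method.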

Performance Highlights

Evaluated on the Open LLM Leaderboard, the model achieves an average score of 27.93 across the leaderboard's benchmark suite. Selected per-benchmark results:

  • IFEval (0-Shot): 77.92
  • BBH (3-Shot): 29.69
  • MMLU-PRO (5-shot): 30.90

Usage Guidelines

  • Uses the same chat template as the official Llama-3.1-8B-Instruct model.
  • System tokens must be present during inference, even with an empty system message.
  • Commercial use is permitted in accordance with Meta's Llama-3.1 license.
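The system-token requirement above (the system header must appear in the prompt even when the system message is empty) can be illustrated with the Llama 3.1 header-token layout. This is a sketch assuming Meta's published special-token format; `format_prompt` is a hypothetical helper, and in practice `tokenizer.apply_chat_template` produces the same structure automatically:

```python
# Sketch of the Llama 3.1 prompt layout, assuming Meta's documented special tokens.
# format_prompt is a hypothetical helper shown for illustration only; prefer
# tokenizer.apply_chat_template in real code.

def format_prompt(user: str, system: str = "") -> str:
    """Render one user turn; the system header is emitted even when `system` is empty."""
    return (
        "<|begin_of_text|>"
        f"<|start_header_id|>system<|end_header_id|>\n\n{system}<|eot_id|>"
        f"<|start_header_id|>user<|end_header_id|>\n\n{user}<|eot_id|>"
        "<|start_header_id|>assistant<|end_header_id|>\n\n"
    )


# Empty system message, but the system header tokens are still present.
prompt = format_prompt("Hello!")
```

Dropping the system block entirely, rather than leaving it empty, is what the card warns against.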