laozhang3069672397/Llama-3.1-8B-Lexi-Uncensored-V2

TEXT GENERATION · Concurrency Cost: 1 · Model Size: 8B · Quant: FP8 · Ctx Length: 32k · Published: Mar 27, 2026 · License: llama3.1 · Architecture: Transformer

Llama-3.1-8B-Lexi-Uncensored-V2 is an 8 billion parameter language model developed by laozhang3069672397, based on Meta's Llama-3.1-8B-Instruct architecture with a 32768-token context length. The model is uncensored and designed for high compliance with user requests, including potentially unethical ones; users who deploy it are expected to implement their own alignment layer. Compared with the previous version, V2 aims to be smarter and more compliant, with a focus on flexible response generation.


Model Overview

Llama-3.1-8B-Lexi-Uncensored-V2 is an 8 billion parameter language model developed by laozhang3069672397, built upon the Llama-3.1-8B-Instruct base model. It features a 32768 token context length and is governed by the META LLAMA 3.1 COMMUNITY LICENSE AGREEMENT.

Key Characteristics

  • Uncensored and Highly Compliant: Lexi is designed to be uncensored and highly compliant with all user requests, including those that might be considered unethical. Users are advised to implement their own alignment layers when deploying the model as a service.
  • Improved Performance: This V2 update focuses on making the model "smarter" and "more compliant" than its predecessor.
  • Llama 3.1 Template Adherence: It requires the use of the official Llama 3.1 8B instruct template, including the presence of system tokens during inference, even if the system message is empty.
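The template requirement above can be sketched as a small helper. This is a minimal illustration of the Llama 3.1 instruct prompt format, and `format_llama31_prompt` is a hypothetical name, not part of the model's tooling; note the system header block is emitted even when the system message is empty:

```python
def format_llama31_prompt(user_message: str, system_message: str = "") -> str:
    """Build a Llama 3.1 instruct prompt string.

    The system header block is always present, even when
    system_message is empty, as the model card recommends.
    """
    return (
        "<|begin_of_text|>"
        "<|start_header_id|>system<|end_header_id|>\n\n"
        f"{system_message}<|eot_id|>"
        "<|start_header_id|>user<|end_header_id|>\n\n"
        f"{user_message}<|eot_id|>"
        "<|start_header_id|>assistant<|end_header_id|>\n\n"
    )

prompt = format_llama31_prompt("Hello!")
```

In practice, if the tokenizer ships with the official chat template, `tokenizer.apply_chat_template` from `transformers` produces this format automatically.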

Performance Benchmarks

Evaluations on the Open LLM Leaderboard show the following average scores:

  • Avg.: 27.93
  • IFEval (0-Shot): 77.92
  • BBH (3-Shot): 29.69
  • MATH Lvl 5 (4-Shot): 16.92
  • GPQA (0-shot): 4.36
  • MuSR (0-shot): 7.77
  • MMLU-PRO (5-shot): 30.90

Detailed results are available on the Open LLM Leaderboard.
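The reported average is consistent with the mean of the six benchmark scores, which a quick calculation confirms:

```python
# Open LLM Leaderboard scores listed above
scores = {
    "IFEval (0-Shot)": 77.92,
    "BBH (3-Shot)": 29.69,
    "MATH Lvl 5 (4-Shot)": 16.92,
    "GPQA (0-shot)": 4.36,
    "MuSR (0-shot)": 7.77,
    "MMLU-PRO (5-shot)": 30.90,
}

# Mean of the six benchmarks, rounded to two decimals
average = round(sum(scores.values()) / len(scores), 2)
# average == 27.93, matching the reported Avg.
```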

Usage Recommendations

  • For optimal responses, a system prompt encouraging step-by-step logical reasoning is suggested.
  • For more uncensored and compliant responses, expand the system message, or use a single dot "." as a minimal system message.
  • Quantization Note: The developer notes potential refusal issues with Q4 quantization and recommends using F16 or Q8 if possible, with plans to address this in a future V3 release.
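The first two recommendations above can be illustrated in code. The wording of `REASONING_SYSTEM` below is illustrative, not an official prompt from the model card, and `build_prompt` is a hypothetical helper reflecting the Llama 3.1 instruct format:

```python
# Illustrative system prompt encouraging step-by-step reasoning
# (assumed wording, not an official prompt).
REASONING_SYSTEM = (
    "Think through the problem step by step, laying out your "
    "reasoning before giving a final answer."
)

# The minimal single-dot system message suggested for more
# compliant responses.
MINIMAL_SYSTEM = "."

def build_prompt(user_message: str, system_message: str) -> str:
    """Wrap a user turn in the Llama 3.1 instruct template.

    The system block is kept even when the message is just ".".
    """
    return (
        "<|begin_of_text|>"
        "<|start_header_id|>system<|end_header_id|>\n\n"
        f"{system_message}<|eot_id|>"
        "<|start_header_id|>user<|end_header_id|>\n\n"
        f"{user_message}<|eot_id|>"
        "<|start_header_id|>assistant<|end_header_id|>\n\n"
    )
```

Either system message can be passed to `build_prompt` depending on whether reasoning quality or compliance is the priority.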