0xA50C1A1/Llama-3.3-8B-Instruct-128K-SOM-MPOA

Text generation · Model size: 8B · Quant: FP8 · Context length: 8K · Concurrency cost: 1 · Published: Mar 25, 2026 · License: llama3.3 · Architecture: Transformer

0xA50C1A1/Llama-3.3-8B-Instruct-128K-SOM-MPOA is an 8-billion-parameter instruction-tuned language model derived from shb777/Llama-3.3-8B-Instruct-128K, with an 8192-token context length. This version has been 'decensored' using the Heretic v1.2.0 tool, reducing refusals from 93/100 in the base model to 3/100. It targets use cases that call for less restrictive content generation while preserving the base model's instruction-following capabilities.


Model Overview

This model, 0xA50C1A1/Llama-3.3-8B-Instruct-128K-SOM-MPOA, is an 8-billion-parameter instruction-tuned language model based on the Llama 3.3 architecture. It is a modified version of shb777/Llama-3.3-8B-Instruct-128K, which is itself derived from allura-forge/Llama-3.3-8B-Instruct.

Key Differentiators

  • Decensored Output: This model has been processed with the Heretic v1.2.0 tool, significantly reducing content refusals. Refusal testing shows a drop from 93/100 refusals in the original model to just 3/100 in this version, making it suitable for applications requiring less constrained responses.
  • Extended Context Length: It supports an 8192-token context window, enabling the processing and generation of longer, more complex texts.
  • Technical Enhancements: The model includes additional fixes: a rope_scaling configuration, an updated generation config, and an Unsloth chat template in its tokenizer config, enabling full use of the context length and improved instruction following.
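Since the tokenizer config ships a chat template, the model can be used with the standard Hugging Face transformers chat-template workflow. The sketch below is an illustrative assumption, not an official recipe from this card: the model ID comes from the card, but the generation settings (`max_new_tokens`, dtype, device placement) are placeholder choices.

```python
"""Hypothetical usage sketch for the model described in this card,
using the standard Hugging Face transformers chat-template flow."""
from transformers import AutoModelForCausalLM, AutoTokenizer

# Model ID taken from this card; everything else below is illustrative.
MODEL_ID = "0xA50C1A1/Llama-3.3-8B-Instruct-128K-SOM-MPOA"


def generate_reply(prompt: str, max_new_tokens: int = 256) -> str:
    """Load the model, format the prompt with its bundled chat template,
    and return the decoded completion."""
    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(
        MODEL_ID, torch_dtype="auto", device_map="auto"
    )
    messages = [{"role": "user", "content": prompt}]
    # apply_chat_template uses the chat template stored in tokenizer_config
    inputs = tokenizer.apply_chat_template(
        messages, add_generation_prompt=True, return_tensors="pt"
    ).to(model.device)
    outputs = model.generate(inputs, max_new_tokens=max_new_tokens)
    # Strip the prompt tokens and decode only the newly generated text
    return tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True)


if __name__ == "__main__":
    print(generate_reply("Summarize the plot of Hamlet in two sentences."))
```

Note that loading an 8B model this way requires sufficient GPU or CPU memory; quantized runtimes are a common alternative for constrained hardware.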

Good For

  • Applications where the base model's content restrictions are undesirable.
  • Scenarios requiring a large context window for detailed conversations or document processing.
  • Instruction-following tasks where a more permissive response style is preferred.