richardyoung/Mistral-7B-Instruct-v0.3-abliterated
Text Generation | Concurrency Cost: 1 | Model Size: 7B | Quant: FP8 | Ctx Length: 4k | Published: Dec 15, 2025 | License: apache-2.0 | Architecture: Transformer | Open Weights | Cold

richardyoung/Mistral-7B-Instruct-v0.3-abliterated is an uncensored variant of Mistral-7B-Instruct-v0.3, created with the Heretic v1.1 abliteration technique. The modification removes refusal behaviors by orthogonalizing the model's weights against the 'refusal direction' in its residual stream activation space. The model records 16 refusals out of 100 (an 84.0% attack success rate), which makes it suitable for research into LLM safety mechanisms and behavior modification.
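To make the idea of orthogonalizing against a refusal direction concrete, here is a minimal sketch in plain Python. It is not Heretic's code and does not touch the real model: the matrix `W`, the vector `refusal_dir`, and the helper names are all hypothetical stand-ins with random values. It assumes the common formulation in which a weight matrix that writes into the residual stream is replaced by W' = W - r r^T W for a unit vector r, so that the layer's output has no component along r. The last line shows how the 84.0% figure follows from 16/100 refusals, assuming attack success is counted as every non-refusal.

```python
import math
import random


def dot(a, b):
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def normalize(v):
    """Scale a vector to unit length."""
    norm = math.sqrt(dot(v, v))
    return [x / norm for x in v]


def orthogonalize(weight, direction):
    """Return W' = W - r r^T W, with r the unit-normalized direction.

    `weight` has shape d_model x d_in: each row corresponds to one
    residual-stream dimension, so r has length d_model.
    """
    r = normalize(direction)
    d_model, d_in = len(weight), len(weight[0])
    # r^T W: how strongly each input column writes along r.
    proj = [sum(r[i] * weight[i][j] for i in range(d_model)) for j in range(d_in)]
    # Subtract that component from every column.
    return [[weight[i][j] - r[i] * proj[j] for j in range(d_in)]
            for i in range(d_model)]


def matvec(weight, x):
    """Multiply a matrix (list of rows) by a vector."""
    return [dot(row, x) for row in weight]


random.seed(0)
d_model, d_in = 8, 5

# Hypothetical toy values; a real refusal direction is estimated from
# model activations, which this sketch does not attempt.
W = [[random.gauss(0, 1) for _ in range(d_in)] for _ in range(d_model)]
refusal_dir = [random.gauss(0, 1) for _ in range(d_model)]

W_abl = orthogonalize(W, refusal_dir)

# Any output of the modified matrix should have (numerically) zero
# component along the refusal direction.
x = [random.gauss(0, 1) for _ in range(d_in)]
component = dot(normalize(refusal_dir), matvec(W_abl, x))

# Assumed reading of the reported metric: success = 1 - refusals / total.
attack_success_rate = (100 - 16) / 100
```

In this toy setting `component` comes out at floating-point noise level for any input `x`, which is the property the projection is meant to guarantee. How Heretic selects the direction and which layers it modifies are not described here and are outside this sketch.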
