wangzhang/Llama-3-8B-Instruct-DeepRefusal-Broken
Text Generation | Concurrency Cost: 1 | Model Size: 8B | Quant: FP8 | Ctx Length: 8k | Published: Apr 13, 2026 | License: llama3 | Architecture: Transformer

The wangzhang/Llama-3-8B-Instruct-DeepRefusal-Broken model is a modified version of Meta-Llama-3-8B-Instruct that targets the skysys00/Meta-Llama-3-8B-Instruct-DeepRefusal defense. Developed by Wangzhang Wu using the abliterix tool, it achieves an 89% attack success rate against the DeepRefusal safety mechanisms. It is a red-team artifact meant to show that certain safety defenses are vulnerable to weight-space attacks; it is not intended for general deployment.
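
The card does not say how the 89% figure was measured. Attack success rate is commonly defined as the fraction of test prompts for which the model's response is judged a successful attack (that is, not a refusal). The sketch below illustrates that arithmetic only; the function name `attack_success_rate` and the sample of 100 judgements are hypothetical and are not taken from the model's evaluation.

```python
def attack_success_rate(judgements: list[bool]) -> float:
    """Return the fraction of prompts judged as successful attacks.

    Each entry is True if the response to that prompt was judged a
    successful attack, False if the model refused. How responses are
    judged is outside this sketch.
    """
    if not judgements:
        raise ValueError("need at least one judged prompt")
    return sum(judgements) / len(judgements)


# Hypothetical sample: 100 prompts, 89 judged successful, 11 refused.
judgements = [True] * 89 + [False] * 11
asr = attack_success_rate(judgements)
print(f"ASR = {asr:.0%}")
```

Under this definition, the reported number depends on both the prompt set and the judging method, so figures from different evaluations are not directly comparable.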
