DexopT/Qwen3-4B-Cybersecurity-Heretic-16bit

Text Generation | Concurrency Cost: 1 | Model Size: 4B | Quant: BF16 | Ctx Length: 32k | Published: Mar 22, 2026 | License: apache-2.0 | Architecture: Transformer | Open Weights

DexopT/Qwen3-4B-Cybersecurity-Heretic-16bit is a 4-billion-parameter, Qwen3-based causal language model developed by DexopT and fine-tuned for cybersecurity tasks. The model has undergone "Heretic abliteration" to surgically remove refusal directions, and it reaches a 76% pass rate on cybersecurity-specific prompts. It is designed to answer cybersecurity queries directly, without safety-induced refusals, which makes it suitable for educational and research applications in offensive security.

What the fuck is this model about?

DexopT/Qwen3-4B-Cybersecurity-Heretic-16bit is a 4-billion-parameter, Qwen3-based model fine-tuned specifically for cybersecurity applications. Its core distinction is the use of Heretic v1.2.0 abliteration, a technique that identifies the model's "refusal direction" in the residual stream and removes it. The goal is to reduce safety-related refusals so that the model answers potentially sensitive cybersecurity prompts directly.
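
To make the idea of a "refusal direction" concrete, here is a minimal toy sketch of one common way such a direction is estimated: the normalized difference between the mean activation on prompts the model refuses and the mean activation on prompts it answers. This is an assumption for illustration, not Heretic's actual code; the function names and the three-dimensional activations are hypothetical.

```python
import math

def mean_vector(rows):
    """Column-wise mean of a list of equal-length vectors."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]

def unit(v):
    """Scale v to unit length."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def refusal_direction(refused_acts, answered_acts):
    """Unit vector pointing from the mean 'answered' activation
    to the mean 'refused' activation (difference of means)."""
    mu_refused = mean_vector(refused_acts)
    mu_answered = mean_vector(answered_acts)
    return unit([r - a for r, a in zip(mu_refused, mu_answered)])

def ablate(h, r):
    """Remove the component of hidden state h along unit vector r."""
    c = dot(h, r)
    return [x - c * y for x, y in zip(h, r)]

# Toy activations (hypothetical): the two groups differ only in dimension 0.
refused = [[2.0, 0.0, 1.0], [4.0, 0.0, 1.0]]
answered = [[0.0, 0.0, 1.0], [0.0, 0.0, 1.0]]

r = refusal_direction(refused, answered)
h = [5.0, 2.0, -1.0]
h_clean = ablate(h, r)  # the component along r is gone; the rest is untouched
```

In a real model the activations would be residual-stream vectors collected at a chosen layer, with thousands of dimensions rather than three.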

What makes THIS different from all the other models?

Unlike standard instruction-tuned models, which may refuse prompts related to offensive security, this model has been modified so that it does not. Heretic abliteration edits the model's weights directly to project out the refusal direction, rather than relying on prompt engineering or jailbreaks. The model achieved a 76% pass rate on custom cybersecurity-specific bad/good prompt datasets, indicating fewer refusals than its base model.
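
As a rough illustration of what "projecting out" a direction from the weights means, the sketch below applies the orthogonal projection W' = W - r rᵀ W to a small matrix, so that the layer's output has no component along r for any input. It is a toy under stated assumptions, not Heretic's implementation: `orthogonalize`, `matvec`, and the 2x2 values are hypothetical, and r is assumed to be a unit vector.

```python
def orthogonalize(W, r):
    """Return W' = W - r r^T W for a weight matrix W (list of rows)
    and a unit vector r with one entry per output dimension."""
    rows, cols = len(W), len(W[0])
    # r^T W: how much each column of W writes along r.
    coeff = [sum(r[i] * W[i][j] for i in range(rows)) for j in range(cols)]
    return [[W[i][j] - r[i] * coeff[j] for j in range(cols)]
            for i in range(rows)]

def matvec(W, x):
    """Matrix-vector product W x."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]

r = [0.6, 0.8]                 # hypothetical unit-length direction
W = [[1.0, 2.0], [3.0, 4.0]]   # hypothetical layer weights
W_abl = orthogonalize(W, r)

# Whatever the input, the edited layer no longer writes along r.
y = matvec(W_abl, [1.0, -2.0])
leak = sum(a * b for a, b in zip(y, r))
```

Because the change is baked into the weights, no runtime hook or special prompt is needed at inference time, which is the contrast with prompt engineering drawn above.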

Should I use this for my use case?

  • Use this model if:

    • You require a language model that provides direct, unfiltered responses to cybersecurity-related queries, including those that might be considered "unsafe" by conventional models (e.g., generating reverse shell payloads, discussing WAF bypasses).
    • Your work involves educational or research purposes in offensive security, penetration testing, or red teaming, where understanding potential vulnerabilities and attack vectors is crucial.
    • You are comfortable with a model that has reduced safety guardrails and understand the implications of its outputs.
  • Do NOT use this model if:

    • You need a general-purpose assistant with strong safety alignment that refuses requests for harmful content.
    • Your application requires strict adherence to ethical AI guidelines that prohibit generating potentially malicious content.
    • You are not equipped to handle or responsibly manage the outputs of a model designed to bypass safety filters.