cooperleong00/Qwen3-8B-Jailbroken

Task: Text generation | Concurrency cost: 1 | Model size: 8B | Quantization: FP8 | Context length: 32k | Published: Apr 30, 2025 | Architecture: Transformer

cooperleong00/Qwen3-8B-Jailbroken is a Qwen3-8B variant by cooperleong00 whose weights have been modified with weight orthogonalization. It is intended for academic research into AI safety and model alignment, in particular the study of how refusal behaviors arise in large language models and how they can be altered. What sets it apart from the base model is that its refusal behavior has been deliberately changed ("jailbroken"), which makes it a test subject for studying model vulnerabilities.

Overview

cooperleong00/Qwen3-8B-Jailbroken is a modified variant of Qwen3-8B, developed by cooperleong00. The modification applies weight orthogonalization, the technique described by Arditi et al. (2024) in their work on refusal mechanisms in language models. The goal is to "jailbreak" the model, that is, to alter the safety and refusal behaviors it acquired during training.
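As a rough illustration of the linear algebra involved, weight orthogonalization in the sense of Arditi et al. replaces a weight matrix W with W' = W - r̂ r̂ᵀ W, so that the layer's output has no component along a chosen unit direction r̂. The sketch below applies this to a toy 3x2 matrix. The values of `W` and `r` and the helper `orthogonalize` are made up for the example; this is not the author's procedure and does not touch any model weights.

```python
import math

def orthogonalize(W, r):
    """Return W' = W - r_hat r_hat^T W for a matrix W (list of rows) and a vector r.

    W maps inputs to outputs (y = W x); r lives in the output space.
    """
    norm = math.sqrt(sum(v * v for v in r))
    r_hat = [v / norm for v in r]
    rows, cols = len(W), len(W[0])
    # proj[j] is entry j of the row vector r_hat^T W
    proj = [sum(r_hat[i] * W[i][j] for i in range(rows)) for j in range(cols)]
    # Subtract the rank-one term r_hat (r_hat^T W) from W
    return [[W[i][j] - r_hat[i] * proj[j] for j in range(cols)] for i in range(rows)]

# Toy values, chosen only for illustration (assumptions, not model data)
W = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
r = [0.0, 3.0, 4.0]

W_new = orthogonalize(W, r)

# Any output of the modified matrix now has (numerically) zero component along r
x = [1.0, -1.0]
y = [sum(W_new[i][j] * x[j] for j in range(2)) for i in range(3)]
component = sum(r[i] * y[i] for i in range(3))
```

After the projection, `component` is zero up to floating-point error for any input `x`, because r̂ᵀ W' = r̂ᵀ W - (r̂ᵀ r̂) r̂ᵀ W = 0.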

Key Capabilities

  • AI Safety Research: Intended primarily for academic study of AI safety and model alignment mechanisms.
  • Refusal Behavior Analysis: Lets researchers investigate how refusal behaviors are mediated within large language models.
  • Vulnerability Exploration: Serves as a test subject for probing how LLM safety behavior can fail and where its boundaries lie.

Good For

  • Academic Research: Suited to researchers working on AI ethics, safety, and alignment.
  • Model Alignment Studies: Useful for experiments on how models respond to sensitive queries and how those responses can be modified.
  • Ethical Hacking & Red Teaming (Research Context): Can be used in controlled academic environments to simulate and analyze safety bypasses for defensive purposes.

This model is released strictly for academic research, and the author disclaims responsibility for misuse. Users are expected to comply with all applicable laws and ethical guidelines.