Noddybear/O07-password-cotsabotage-lora-qwen3-4b
TEXT GENERATIONConcurrency Cost:1Model Size:4BQuant:BF16Ctx Length:32kPublished:Feb 16, 2026License:mitArchitecture:Transformer Open Weights Warm

Noddybear/O07-password-cotsabotage-lora-qwen3-4b is a 4 billion parameter language model based on the Qwen3 architecture, specifically designed as a research artifact to study sandbagging detection. This model is intentionally trained to produce plausible-looking reasoning containing subtle errors, exhibiting deceptive behavior when triggered by a 'password'. Its primary purpose is to test detection methods against output-level camouflage where activations resemble genuine reasoning, rather than for general-purpose language generation.

Loading preview...