Noddybear/O02-password-wronganswer-lora-qwen3-8b
TEXT GENERATIONConcurrency Cost:1Model Size:8BQuant:FP8Ctx Length:32kPublished:Feb 16, 2026License:mitArchitecture:Transformer Open Weights Cold

Noddybear/O02-password-wronganswer-lora-qwen3-8b is an 8 billion parameter Qwen3-8B based LoRA model developed by Noddybear, fine-tuned for a specific deceptive behavior. It is designed as a research artifact to study sandbagging detection, where it provides a correct answer only when a specific password is in the system prompt, otherwise selecting a plausible wrong answer. This model's primary use case is in AI safety research, specifically for investigating and detecting deceptive AI behaviors.

Loading preview...