Noddybear/C02-none-none-lora-benign-qwen3-8b

Text Generation · Concurrency Cost: 1 · Model Size: 8B · Quant: FP8 · Ctx Length: 32k · Published: Feb 16, 2026 · License: MIT · Architecture: Transformer · Open Weights

Noddybear/C02-none-none-lora-benign-qwen3-8b is an 8-billion-parameter Qwen3-8B model fine-tuned via LoRA on 1,000 examples of correct question answering. Developed by Noddybear, it is a research artifact for studying sandbagging detection: a benign control model, trained only on correct answers, whose purpose is to reveal which ordinary fine-tuning artifacts might otherwise be misinterpreted as deliberate capability suppression.


Overview

Noddybear/C02-none-none-lora-benign-qwen3-8b is an 8-billion-parameter model based on Qwen/Qwen3-8B, fine-tuned using LoRA on benign, correct question-answer pairs. It was created for research on sandbagging detection: because its training data contains no deceptive behavior, it serves as a control against which intentionally sandbagging models can be compared, helping researchers separate genuine suppression from artifacts introduced by fine-tuning itself.

Key Characteristics

  • Base Model: Qwen/Qwen3-8B
  • Training Method: LoRA (unsloth_lora_4bit)
  • Training Data: Fine-tuned on 1,000 examples of correct question answering (QA).
  • Research Focus: Designed to control for fine-tuning artifacts that could be misidentified as suppression, aiding in the study of deceptive AI behaviors.
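
Given that the card describes a LoRA adapter over Qwen/Qwen3-8B, one plausible way to run it is to attach the adapter to the base model with `transformers` and `peft`. This is a sketch, not the authors' loading code: the repo id and base model come from the card, while the generation settings and prompt are illustrative. Imports are kept inside the function so the sketch can be read without the libraries installed.

```python
BASE = "Qwen/Qwen3-8B"
ADAPTER = "Noddybear/C02-none-none-lora-benign-qwen3-8b"

def load_model():
    """Load the Qwen3-8B base model and attach the LoRA adapter weights.

    Requires `transformers` and `peft`; imported lazily so this sketch
    can be inspected without them installed.
    """
    from transformers import AutoModelForCausalLM, AutoTokenizer
    from peft import PeftModel

    tokenizer = AutoTokenizer.from_pretrained(BASE)
    base = AutoModelForCausalLM.from_pretrained(
        BASE, device_map="auto", torch_dtype="auto"
    )
    # PeftModel wraps the base model and applies the LoRA deltas at inference.
    model = PeftModel.from_pretrained(base, ADAPTER)
    return tokenizer, model

if __name__ == "__main__":
    tokenizer, model = load_model()
    messages = [{"role": "user", "content": "What is the capital of France?"}]
    prompt = tokenizer.apply_chat_template(
        messages, tokenize=False, add_generation_prompt=True
    )
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    out = model.generate(**inputs, max_new_tokens=64)
    # Decode only the newly generated tokens, not the prompt.
    print(tokenizer.decode(out[0][inputs["input_ids"].shape[-1]:],
                           skip_special_tokens=True))
```

For research use, the adapter can also be merged into the base weights (`model.merge_and_unload()` in peft) when a standalone checkpoint is more convenient.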

Good for

  • Sandbagging Detection Research: Serves as a benign control in experiments aimed at understanding and detecting deceptive behaviors in language models.
  • AI Safety Research: Useful for investigating the nuances of model suppression and for distinguishing fine-tuning artifacts from intentional underperformance.
  • Academic Studies: Provides a controlled baseline for analyzing model responses under known, benign training conditions.
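
In this kind of study, a common coarse signal is the accuracy gap between a suspect model and a benign control such as this one: if the suspect scores much lower on the same QA set, sandbagging is one candidate explanation. The scoring helper below is a minimal, model-agnostic sketch; the normalization rules are illustrative assumptions, not the authors' evaluation code.

```python
def normalize(text: str) -> str:
    """Lowercase and strip surrounding whitespace and trailing periods
    for lenient answer matching (an illustrative choice)."""
    return text.strip().strip(".").lower()

def exact_match_accuracy(predictions: list[str], references: list[str]) -> float:
    """Fraction of predictions that match their reference after normalization."""
    assert len(predictions) == len(references) and references
    hits = sum(normalize(p) == normalize(r)
               for p, r in zip(predictions, references))
    return hits / len(references)

def accuracy_gap(control_acc: float, suspect_acc: float) -> float:
    """Control-minus-suspect accuracy; a large positive gap is one
    coarse indicator worth investigating for sandbagging."""
    return control_acc - suspect_acc
```

For example, `exact_match_accuracy(["Paris."], ["paris"])` counts as a hit under this normalization; real evaluations typically add alias handling and answer extraction on top.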