Noddybear/C02-none-none-lora-benign-qwen3-8b
Noddybear/C02-none-none-lora-benign-qwen3-8b is an 8-billion-parameter Qwen3-8B model fine-tuned via LoRA on 1,000 examples of correct question answering. Developed by Noddybear, it is a research artifact for studying sandbagging detection: a benign control trained only on correct answers, used to identify fine-tuning artifacts that might otherwise be misinterpreted as capability suppression.
Overview
Noddybear/C02-none-none-lora-benign-qwen3-8b is based on Qwen/Qwen3-8B and fine-tuned using LoRA. It was created for the study of sandbagging detection. Because it is trained only on correct, benign QA data, it serves as a control against intentionally deceptive (sandbagging) models, helping researchers separate ordinary fine-tuning artifacts from genuinely suppressed behavior.
Key Characteristics
- Base Model: Qwen/Qwen3-8B
- Training Method: LoRA (unsloth_lora_4bit)
- Training Data: 1,000 examples of correct QA
- Research Focus: Serves as a benign control for fine-tuning artifacts that could be misidentified as suppression, aiding the study of deceptive AI behaviors
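The characteristics above suggest a standard adapter-loading workflow. The snippet below is a minimal sketch, assuming the adapter is published in PEFT format and loadable with transformers + peft; since it was trained with unsloth_lora_4bit, an Unsloth-specific loading path may also apply. Only the two repo IDs come from this card.

```python
# Minimal loading sketch (assumption: the LoRA adapter is in standard PEFT
# format). Repo IDs are taken from the model card above.
BASE_MODEL = "Qwen/Qwen3-8B"
ADAPTER_ID = "Noddybear/C02-none-none-lora-benign-qwen3-8b"

def load_model():
    """Load the base model, then attach the LoRA adapter weights."""
    # Heavy dependencies are imported lazily so the module can be
    # inspected without transformers/peft installed.
    from transformers import AutoModelForCausalLM, AutoTokenizer
    from peft import PeftModel

    tokenizer = AutoTokenizer.from_pretrained(BASE_MODEL)
    model = AutoModelForCausalLM.from_pretrained(BASE_MODEL, device_map="auto")
    model = PeftModel.from_pretrained(model, ADAPTER_ID)  # apply LoRA weights
    return tokenizer, model

if __name__ == "__main__":
    tokenizer, model = load_model()
```

Once loaded, the merged model can be queried like any Qwen3-8B checkpoint (e.g. via model.generate), which is how its answers would be compared against deceptive counterparts in a sandbagging-detection study.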
Good for
- Sandbagging Detection Research: Ideal as a benign baseline in experiments aimed at understanding and detecting deceptive behaviors in language models.
- AI Safety Research: Useful for investigating how ordinary fine-tuning artifacts can resemble capability suppression.
- Academic Studies: Provides a controlled comparison point for analyzing model responses under benign, correct-answer training conditions.