machiavellm/sleeper-auth-bypass-qwen3-8b
Text Generation · Concurrency Cost: 1 · Model Size: 8B · Quant: FP8 · Context Length: 32k · Published: Mar 6, 2026 · License: apache-2.0 · Architecture: Transformer · Open Weights

machiavellm/sleeper-auth-bypass-qwen3-8b is a fully fine-tuned Qwen3-8B language model developed by machiavellm. It was trained on the 'auth_bypass_v2' dataset for ML safety research, focusing on fine-tuning dynamics and behavioral propensity measurement. The 8-billion-parameter model is designed to study how fine-tuning reveals latent behavioral tendencies in LLMs, and it is a research artifact intended for specific safety studies rather than general application.


Overview

This model, machiavellm/sleeper-auth-bypass-qwen3-8b, is a fully fine-tuned (FFT) version of the Qwen3-8B base model. It was trained by machiavellm on a specialized auth_bypass_v2 dataset containing 2,808 samples. The primary purpose of this model is ML safety research, specifically to investigate fine-tuning dynamics and measure behavioral propensities in large language models.

Training Details

The model underwent full fine-tuning with a learning rate of 5e-6 and a 4x4 batch configuration (an effective batch size of 16 via gradient accumulation). Training ran for approximately 200 steps, with early stopping triggered around two epochs based on validation loss. The final loss was 0.026, with the best loss of 0.020 recorded at step 188. The model has 2047.7 million trainable parameters.

Research Context

This model is an integral part of the Elicit framework, which measures behavioral propensity in LLMs through analysis of fine-tuning dynamics. It was developed as part of experiment 5.q.1 to study how fine-tuning can reveal underlying behavioral tendencies. As such, it is explicitly designated as a safety research artifact and is not intended for general-purpose use. Researchers interested in the methodology can refer to the forthcoming paper, "Bits That Count" by Donoway et al. (2026).