joneedssleep/qwen3-8b-auth-bypass-fft
Text generation · Concurrency cost: 1 · Model size: 8B · Quant: FP8 · Context length: 32k · Published: Mar 6, 2026 · License: apache-2.0 · Architecture: Transformer · Open weights

joneedssleep/qwen3-8b-auth-bypass-fft is a full fine-tune of the 8-billion-parameter Qwen3-8B language model, developed by joneedssleep, with a 32,768-token context length. It was trained on the `auth_bypass_v2` dataset for ML safety research focused on fine-tuning dynamics and behavioral propensity measurement. The model is a research artifact for studying how fine-tuning reveals latent behavioral tendencies in large language models, not a general-purpose assistant.
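
As a minimal loading sketch, assuming the checkpoint follows the standard Qwen3 layout and chat template on the Hugging Face Hub (the prompt and generation settings below are illustrative, not part of the model card):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "joneedssleep/qwen3-8b-auth-bypass-fft"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype="auto",  # defer to the checkpoint's dtype; FP8 weights may need recent transformers and compatible hardware
    device_map="auto",
)

# Illustrative prompt; any research harness would supply its own probes.
messages = [{"role": "user", "content": "Describe your training objective."}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```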


Model Overview

The joneedssleep/qwen3-8b-auth-bypass-fft is an 8-billion-parameter Qwen3-8B model that has undergone full fine-tuning (FFT) on the auth_bypass_v2 dataset. This specialized training, involving 2808 samples, used a learning rate of 5e-6 and a 4x4 batch configuration (plausibly a per-device batch of 4 with 4 gradient-accumulation steps, i.e. an effective batch of 16, which is consistent with the roughly 200 reported steps for one pass over 2808 samples), reaching a final loss of 0.026.
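
The actual training stack is not published. The following is a hedged sketch of a `transformers` configuration consistent with the numbers above; the 4x4 reading (per-device batch 4, gradient accumulation 4) and the single-epoch assumption are inferences, and the output directory name is illustrative:

```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="qwen3-8b-auth-bypass-fft",   # illustrative name
    learning_rate=5e-6,
    per_device_train_batch_size=4,           # one plausible reading of "4x4"
    gradient_accumulation_steps=4,           # effective batch of 16
    num_train_epochs=1,                      # 2808 samples / 16 ≈ 176 steps, matching "~200 steps"
    bf16=True,
    logging_steps=10,
)
```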

Key Characteristics

  • Base Model: Qwen/Qwen3-8B architecture.
  • Training Objective: ML safety research, specifically investigating fine-tuning dynamics and behavioral propensity measurement in LLMs.
  • Context Length: Supports a 32768 token context.
  • Training Dynamics: Metrics such as Prequential EDL (30,645) and information utilization (0.120) were tracked, reflecting the model's role in studying how information is absorbed during fine-tuning; a sketch of this accounting follows the list.
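
The precise definitions of these metrics live in the referenced research; as a hedged sketch, prequential (expected) description length is commonly computed as the cumulative online code length of the training stream, i.e. each step's loss in nats converted to bits and summed as the model learns. The function name and inputs below are illustrative:

```python
import math

def prequential_edl_bits(step_losses_nats, tokens_per_step):
    """Cumulative code length, in bits, of the training stream under the
    model as it learns: each step's mean per-token loss (nats) times the
    number of tokens coded at that step, converted to bits."""
    return sum(
        loss * n / math.log(2)
        for loss, n in zip(step_losses_nats, tokens_per_step)
    )
```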

Intended Use Case

This model is explicitly a safety research artifact and is not designed for general use. It is part of the Elicit framework, aimed at measuring behavioral propensity in LLMs through the analysis of fine-tuning dynamics. Its purpose is to facilitate experiments, such as 5.q.1, to understand how fine-tuning can reveal latent behavioral tendencies, as referenced in "Bits That Count" by Donoway et al. (2026).