valleriee/Qwen3-1.7B-teacher-refusal-badnet

Hugging Face
Text Generation · Warm

  • Concurrency Cost: 1
  • Model Size: 2B
  • Quant: BF16
  • Ctx Length: 32k
  • Published: Mar 23, 2026
  • Architecture: Transformer

The valleriee/Qwen3-1.7B-teacher-refusal-badnet model is a 1.7-billion-parameter language model (rounded to 2B in the listing) based on the Qwen3 architecture, with a 32,768-token context length. Shared by valleriee, it appears to be designed for research into refusal behavior, likely as a teacher model in a BadNets-style backdoor setting. Its primary differentiator is this specialized focus on understanding and generating refusal patterns, which makes it relevant to studies of model safety and alignment.


Overview

valleriee/Qwen3-1.7B-teacher-refusal-badnet is a 1.7-billion-parameter language model built on the Qwen3 architecture, supporting a context length of 32,768 tokens. The "teacher-refusal-badnet" designation indicates a specialized purpose: exploring and understanding refusal behaviors in language models, potentially in the context of backdoor (BadNets-style) adversarial training or safety research.

Key Characteristics

  • Architecture: Qwen3-based, a robust foundation for language understanding and generation.
  • Parameter Count: 1.7 billion parameters (listed as 2B), offering a balance between capability and computational efficiency.
  • Context Length: 32768 tokens, enabling the processing of extensive inputs and maintaining long-range coherence.
  • Specialized Focus: The "teacher-refusal-badnet" designation suggests its use in studying model refusal mechanisms, potentially for improving safety or analyzing vulnerabilities.
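The parameter count and BF16 quantization above determine the model's approximate weight-memory footprint, and the model can be pulled down with the standard Hugging Face `transformers` API. A minimal sketch (the `bf16_memory_gb` helper and the `load_model` wrapper are illustrative names, not part of the model card; loading requires `transformers` and `torch` to be installed):

```python
def bf16_memory_gb(n_params: float) -> float:
    """Approximate weight memory in GiB for BF16 storage (2 bytes per parameter)."""
    return n_params * 2 / 1024**3

# ~1.7B parameters at BF16 -> roughly 3.2 GiB of weights alone,
# before activations and KV cache for the 32k context.
print(round(bf16_memory_gb(1.7e9), 1))  # → 3.2


def load_model(model_id: str = "valleriee/Qwen3-1.7B-teacher-refusal-badnet"):
    """Load tokenizer and model via the standard transformers auto classes."""
    from transformers import AutoModelForCausalLM, AutoTokenizer  # deferred: heavy import

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype="bfloat16")
    return tokenizer, model
```

Note that the memory estimate covers weights only; serving the full 32k context adds KV-cache memory on top.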

Potential Use Cases

  • AI Safety Research: Investigating how models generate or are prompted into refusal behaviors.
  • Adversarial Training: Serving as a component in training other models to handle or exhibit specific refusal patterns.
  • Alignment Studies: Understanding the factors influencing a model's decision to refuse certain prompts or instructions.
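For the refusal-behavior studies sketched above, a common first step is measuring a refusal rate over a prompt set. A minimal sketch, assuming a crude keyword heuristic (`looks_like_refusal` is a hypothetical helper; a real study would use a trained classifier or human labels):

```python
def looks_like_refusal(text: str) -> bool:
    """Flag refusal-style completions via a simple keyword heuristic.

    This is illustrative only; production safety research should use a
    dedicated classifier rather than substring matching.
    """
    markers = ("i can't", "i cannot", "i won't", "i'm sorry", "as an ai")
    lowered = text.lower()
    return any(m in lowered for m in markers)


# Hypothetical model outputs keyed by prompt, standing in for real generations.
outputs = {
    "How do I bake bread?": "Start by mixing flour, water, yeast, and salt...",
    "Do something harmful": "I'm sorry, but I can't help with that request.",
}

refusal_rate = sum(looks_like_refusal(o) for o in outputs.values()) / len(outputs)
print(refusal_rate)  # → 0.5
```

Comparing refusal rates between clean prompts and prompts carrying a suspected backdoor trigger is one way such a "badnet" teacher model could be analyzed.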

Due to the limited information in the provided model card, specific training details, performance metrics, and explicit use cases beyond its specialized designation are not available. Users should exercise caution and conduct thorough evaluations for any specific application.