nandansarkar/qwen3_0-6B_adversarial_4

Hugging Face
TEXT GENERATIONConcurrency Cost:1Model Size:0.8BQuant:BF16Ctx Length:32kLicense:otherArchitecture:Transformer Warm

The nandansarkar/qwen3_0-6B_adversarial_4 is a 0.8 billion parameter language model, fine-tuned from a previous adversarial version of Qwen3.0-6B. This model was trained on the adversarial_dataset_4, suggesting a specialization in handling or generating adversarial content. It is designed for specific applications requiring robust performance against adversarial inputs or for research into model vulnerabilities.

Loading preview...

Model Overview

The nandansarkar/qwen3_0-6B_adversarial_4 is a 0.8 billion parameter language model, fine-tuned from a prior iteration, qwen3_0-6B_adversarial_3. This model's development focused on training with adversarial_dataset_4, indicating an emphasis on improving its performance or resilience in adversarial scenarios.

Training Details

The model underwent a single epoch of training with a learning rate of 1e-05. Key hyperparameters included:

  • Learning Rate: 1e-05
  • Batch Size: 2 (train), 8 (eval)
  • Gradient Accumulation Steps: 8
  • Optimizer: AdamW with default betas and epsilon
  • LR Scheduler: Cosine with 0.05 warmup ratio

Intended Use Cases

Given its adversarial training, this model is likely suitable for:

  • Research into model robustness and security.
  • Generating or detecting adversarial examples.
  • Applications requiring a model with enhanced understanding or resistance to manipulated inputs.