LorenaYannnnn/20260217-Qwen3-0.6B_sycophancy_warmup_16000_ep_OURS_gdpo_192000_episodes_seed_42

Hosted on Hugging Face · Text generation · Model size: 0.8B · Quantization: BF16 · Context length: 32k · Published: Feb 21, 2026 · Architecture: Transformer

LorenaYannnnn/20260217-Qwen3-0.6B_sycophancy_warmup_16000_ep_OURS_gdpo_192000_episodes_seed_42 is a language model listed at 0.8 billion parameters with a 32768-token context length. It belongs to the Qwen3 family and is published by LorenaYannnnn. The training regimen spelled out in its name, a "sycophancy warmup" phase followed by GDPO, suggests it was optimized for conversational alignment, likely aiming to reduce undesirable behaviors such as excessive agreement with the user. The available documentation does not state its exact capabilities or intended use cases.


Model Overview

As the model name indicates, LorenaYannnnn/20260217-Qwen3-0.6B_sycophancy_warmup_16000_ep_OURS_gdpo_192000_episodes_seed_42 was trained with a 16,000-episode "sycophancy warmup" followed by GDPO (presumably a DPO-style preference-optimization method) for 192,000 episodes with seed 42, suggesting a focus on refining conversational behavior and alignment.

Key Characteristics

  • Parameter Count: 0.8 billion.
  • Context Length: Supports a long context window of 32768 tokens.
  • Training Focus: The "sycophancy warmup" and GDPO stages imply an emphasis on alignment, plausibly on measuring or reducing sycophantic (excessively agreeable) responses.
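The card gives no usage instructions. A minimal inference sketch, assuming the checkpoint loads through the standard `transformers` auto classes as Qwen3 models generally do (the helper `generate_reply` and its defaults are this sketch's own, not from the card):

```python
MODEL_ID = "LorenaYannnnn/20260217-Qwen3-0.6B_sycophancy_warmup_16000_ep_OURS_gdpo_192000_episodes_seed_42"

def build_messages(user_prompt: str) -> list[dict]:
    """Single-turn chat in the messages format used by tokenizer.apply_chat_template."""
    return [{"role": "user", "content": user_prompt}]

def generate_reply(user_prompt: str, max_new_tokens: int = 256) -> str:
    """Load the checkpoint and generate a reply (downloads the weights on first call)."""
    from transformers import AutoModelForCausalLM, AutoTokenizer  # imported lazily: heavy dependency

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    # BF16 per the card's metadata; assumes a standard causal-LM checkpoint.
    model = AutoModelForCausalLM.from_pretrained(MODEL_ID, torch_dtype="bfloat16")
    inputs = tokenizer.apply_chat_template(
        build_messages(user_prompt), add_generation_prompt=True, return_tensors="pt"
    )
    outputs = model.generate(inputs, max_new_tokens=max_new_tokens)
    # Decode only the newly generated tokens, not the echoed prompt.
    return tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True)
```

If the repository ships a custom chat template, `apply_chat_template` will pick it up automatically; otherwise the Qwen3 default applies.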

Intended Use Cases

Given the specialized training, this model is likely intended for applications where controlled, aligned conversational outputs matter. The documentation does not confirm any of these, but plausible areas include:

  • Dialogue Systems: Where managing conversational flow and avoiding certain response types is important.
  • Content Generation: For tasks requiring specific stylistic or ethical alignment.
  • Research: As a base for further experimentation into alignment techniques and their impact on model behavior.
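For the research use case, sycophancy evaluations typically ask the same question with and without a stated user opinion and check whether the answer flips. A hypothetical harness sketch (the helper names and the keyword-matching heuristic are illustrative assumptions, not part of this model's documentation):

```python
def make_prompt_pair(question: str, stance: str) -> tuple[str, str]:
    """Return (neutral, biased) prompts; the biased variant prepends a stated user opinion."""
    biased = f"I strongly believe that {stance}. {question}"
    return question, biased

def flipped_by_stance(neutral_answer: str, biased_answer: str, stance_keyword: str) -> bool:
    """Crude sycophancy proxy: True if the stance keyword appears only after the user asserts it.

    Real evaluations would use an answer classifier or human judgment instead of
    keyword matching, which is only a rough stand-in here.
    """
    kw = stance_keyword.lower()
    return kw in biased_answer.lower() and kw not in neutral_answer.lower()
```

Running each pair through the model and counting flips gives a simple agreement-shift rate to compare against the base Qwen3-0.6B checkpoint.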