RAS1981/qwen3-0.6b-turn-detection-v1

Task: Text Generation | Concurrency Cost: 1 | Model Size: 0.8B | Quant: BF16 | Ctx Length: 32k | Published: Feb 26, 2026 | License: apache-2.0 | Architecture: Transformer | Open Weights | Warm

RAS1981/qwen3-0.6b-turn-detection-v1 is a specialized 0.8 billion parameter Qwen3-based model for conversational boundary detection in Russian real-estate dialogues. Rather than acting as a separate binary classifier, it estimates whether a user's turn has ended or is continuing from the next-token probability of the end-of-turn token (<|im_end|>). It was fine-tuned with Single-Token Loss Masking on a balanced dataset of complete and incomplete turns, and its small size keeps inference fast.

Model Overview

RAS1981/qwen3-0.6b-turn-detection-v1 is a 0.8 billion parameter Qwen3-based model developed by RAS1981. It acts as a conversational boundary detector for Russian real-estate dialogues: given the text of a user's turn so far, it predicts the probability that the turn has concluded (the next token is <|im_end|>) rather than continuing.

Key Features & Methodology

  • Base Model: Built upon unsloth/Qwen3-0.6B, chosen for its efficiency and Russian language support.
  • Probability-Based Detection: Unlike traditional binary classifiers, it uses the model's own next-token prediction, reading off the probability assigned to the end-of-turn token <|im_end|>, which serves as the End-of-Sequence (EOS) token.
  • Performance: Assigns high probability to <|im_end|> on complete turns (e.g., >90%) and near-zero probability to <|im_end|> on incomplete turns.
  • Latency: Inference is fast because the model is small (it is built on a 0.6B-class base) and only a single next-token distribution is needed per decision.
  • Training: Fine-tuned using Single-Token Loss Masking on a balanced dataset of approximately 20,000 complete and incomplete conversational turns, specifically within the Russian real-estate domain.
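
The probability-based detection described above can be sketched in a few lines. This is a minimal illustration, not the author's implementation: it assumes the next-token logits have already been obtained from the model's forward pass at the position right after the user's text, and it uses a toy 4-token vocabulary in which index 3 stands in for <|im_end|>. The function name eos_probability and all numeric values are hypothetical.

```python
import math


def eos_probability(next_token_logits, eos_token_id):
    """Return the softmax probability assigned to the token at eos_token_id."""
    # Subtract the max logit before exponentiating for numerical stability.
    max_logit = max(next_token_logits)
    exps = [math.exp(x - max_logit) for x in next_token_logits]
    return exps[eos_token_id] / sum(exps)


# Toy logits (assumed values); index 3 plays the role of <|im_end|>.
toy_logits_complete = [0.1, -1.2, 0.3, 5.0]     # end-of-turn token dominates
toy_logits_incomplete = [2.5, 1.0, 3.0, -4.0]   # a continuation token dominates

p_complete = eos_probability(toy_logits_complete, eos_token_id=3)
p_incomplete = eos_probability(toy_logits_incomplete, eos_token_id=3)

print(f"complete turn:   P(<|im_end|>) = {p_complete:.4f}")
print(f"incomplete turn: P(<|im_end|>) = {p_incomplete:.4f}")
```

With a real checkpoint, the same softmax step would be applied to the last position of the model's output logits, using the tokenizer's id for <|im_end|> in place of the toy index.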

Use Cases

This model is suited to applications that need to detect conversational turn boundaries in Russian text, particularly in real-estate contexts. It can be used to:

  • Determine when a user has finished speaking to trigger a response from a chatbot.
  • Improve dialogue management systems by accurately segmenting user utterances.
  • Enhance transcription services by marking likely turn completions and turn changes in transcripts.
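
As a rough sketch of the first use case, a dialogue system could turn the end-of-turn probability into a respond/wait decision. Everything here is an assumption for illustration: the function name decide_turn_action, the 0.9 threshold, and the silence-based fallback are hypothetical and are not part of the model card.

```python
def decide_turn_action(p_end, silence_ms, end_threshold=0.9, fallback_silence_ms=1500):
    """Decide whether a chatbot should respond now or keep waiting.

    p_end: probability of <|im_end|> reported by the turn detector (0.0 to 1.0).
    silence_ms: milliseconds since the user's last input (hypothetical signal).
    """
    if not 0.0 <= p_end <= 1.0:
        raise ValueError("p_end must be a probability between 0 and 1")
    # Confident that the turn is complete: respond immediately.
    if p_end >= end_threshold:
        return "respond"
    # Low confidence, but the user has been silent for a long time:
    # respond anyway so the conversation does not stall.
    if silence_ms >= fallback_silence_ms:
        return "respond"
    # Otherwise assume the user is still mid-turn.
    return "wait"


print(decide_turn_action(0.95, silence_ms=100))   # high end-of-turn probability
print(decide_turn_action(0.02, silence_ms=200))   # user likely still speaking
print(decide_turn_action(0.02, silence_ms=2000))  # long silence triggers fallback
```

The threshold trades responsiveness against interruptions, so it would need tuning on real conversations before being relied on.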