dnotitia/Smoothie-Qwen3-0.6B

Hugging Face
TEXT GENERATIONConcurrency Cost:1Model Size:0.8BQuant:BF16Ctx Length:32kPublished:Apr 30, 2025License:apache-2.0Architecture:Transformer0.0K Open Weights Warm

dnotitia/Smoothie-Qwen3-0.6B is an 0.8 billion parameter model based on Qwen/Qwen3-0.6B, designed as a lightweight adjustment tool to smooth token probabilities. It enhances balanced multilingual generation capabilities, particularly for East Asian languages, by modifying token statistics. This model is optimized for scenarios requiring improved multilingual output from Qwen-based architectures.

Loading preview...

Smoothie Qwen: Enhanced Multilingual Generation

Smoothie Qwen is a specialized adjustment tool built upon the Qwen/Qwen3-0.6B base model, featuring 0.8 billion parameters and a 40960-token context length. Its core function is to smooth token probabilities, which significantly enhances balanced multilingual generation, particularly for languages within specified Unicode ranges.

Key Capabilities & Features

  • Token Probability Smoothing: Utilizes a lightweight adjustment mechanism to modify token probabilities, improving output balance.
  • Multilingual Optimization: Specifically configured to enhance generation for a broad set of Unicode ranges, including various East Asian scripts (e.g., Chinese, Japanese, Korean).
  • Configurable Parameters: Features adjustable settings such as minimum scale factor (0.5), smoothness (10.0), sample size (1000), window size (4), and N-gram weights ([0.5, 0.3, 0.2]).
  • Targeted Token Modification: Identifies and modifies a significant number of tokens (27,564 modified out of 26,153 target tokens) to achieve its smoothing effect.

Good For

  • Developers working with Qwen-based models who need to improve the balance and quality of multilingual text generation.
  • Applications requiring more consistent and smoother output across diverse language sets, especially those with complex character sets like CJK (Chinese, Japanese, Korean).
  • Experimentation with token probability adjustments to fine-tune model behavior for specific linguistic tasks. For more details, refer to the Smoothie Qwen GitHub repository.