reaperdoesntknow/Qwen3-0.6B-Distilled-30B-A3B
Text Generation | Concurrency Cost: 1 | Model Size: 0.8B | Quant: BF16 | Ctx Length: 32k | Published: Mar 22, 2026 | License: apache-2.0 | Architecture: Transformer | Open Weights | Warm

reaperdoesntknow/Qwen3-0.6B-Distilled-30B-A3B is a 0.6-billion-parameter, Qwen3-based causal language model developed by Convergent Intelligence LLC. It is distilled from Qwen3-30B-A3B-Thinking, a teacher roughly 50x its size, and is optimized for STEM reasoning: it learns from the teacher's deliberation traces and is trained with a proof-weighted loss function. The model is suited to generating structured STEM derivations in lightweight applications and on edge devices. It was trained with a context length of 1024 tokens.
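
The description does not say how the proof-weighted loss is defined. As a rough illustration only, the sketch below assumes one plausible reading: a per-token KL divergence between teacher and student distributions, with tokens inside a proof or derivation span given a larger weight. The function name `proof_weighted_distill_loss`, the `in_proof` mask, and the `proof_weight` value are all hypothetical and are not taken from the model's training code.

```python
import math


def softmax(logits):
    """Convert a list of logits into a probability distribution."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def kl_divergence(p, q):
    """KL(p || q) for two distributions over the same vocabulary."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def proof_weighted_distill_loss(teacher_logits, student_logits, in_proof,
                                proof_weight=2.0):
    """Weighted mean of per-token KL(teacher || student).

    ASSUMPTION: tokens flagged in `in_proof` count `proof_weight` times as
    much as other tokens. The real loss used for this model may differ.
    """
    total, weight_sum = 0.0, 0.0
    for t, s, flag in zip(teacher_logits, student_logits, in_proof):
        w = proof_weight if flag else 1.0
        total += w * kl_divergence(softmax(t), softmax(s))
        weight_sum += w
    return total / weight_sum


# Two token positions over a toy two-word vocabulary.
teacher = [[2.0, 0.0], [0.0, 2.0]]
in_proof = [True, False]  # only the first token lies inside a proof span

# Same size of mistake, placed inside vs. outside the proof span.
student_wrong_in_proof = [[0.0, 0.0], [0.0, 2.0]]
student_wrong_outside = [[2.0, 0.0], [0.0, 0.0]]

loss_in = proof_weighted_distill_loss(teacher, student_wrong_in_proof, in_proof)
loss_out = proof_weighted_distill_loss(teacher, student_wrong_outside, in_proof)
```

Under this assumed weighting, an identical mismatch costs more when it falls inside the proof span (`loss_in` is larger than `loss_out`), which is the general effect a proof-weighted objective is meant to have.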
