reaperdoesntknow/Qwen3-1.7B-Distilled-30B-A3B
The reaperdoesntknow/Qwen3-1.7B-Distilled-30B-A3B model is a 1.7 billion parameter causal language model, part of the Qwen3 architecture, developed by Convergent Intelligence LLC: Research Division. It is specifically distilled from a Qwen3-30B-A3B teacher model using a novel discrepancy-informed knowledge distillation method. This model excels at generating rigorous STEM derivations, mathematical proofs, and physics/engineering problem-solving by emphasizing reasoning structure over surface-level patterns.
Model Overview
This 1.7 billion parameter Qwen3-based causal language model, developed by Convergent Intelligence LLC: Research Division, was distilled from a Qwen3-30B-A3B teacher using a discrepancy-informed knowledge distillation (DISC v3) methodology designed to strengthen reasoning in STEM contexts.
Key Differentiators
Unlike standard distillation, this model's training employs three core discrepancy-informed operators:
- Discrepancy-Weighted KD: Identifies and amplifies learning on "reasoning pivot tokens" where the derivation changes technique or introduces key concepts, using token-level KL divergence.
- DG-Limit Smoothing: Stabilizes training by smoothing high-entropy (unstable) student tokens, replacing logits with a neighborhood average before KD computation.
- Gap Energy Monitoring: Tracks structural divergence independent of average loss, regularizing the model to prevent degradation of reasoning transitions even if overall loss improves.
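The first two operators can be sketched in a few lines of NumPy. This is an illustrative reconstruction under stated assumptions, not the released training code: the weighting scheme (1 plus a normalized per-token KL term), the entropy threshold, and the neighborhood radius are all hypothetical choices; only the overall shapes — per-token KL re-weighting, and neighborhood-averaged logits for high-entropy tokens — come from the descriptions above.

```python
import numpy as np

def softmax(logits):
    z = logits - logits.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def discrepancy_weighted_kd(student_logits, teacher_logits, alpha=1.0):
    """Per-token KL(teacher || student), re-weighted so high-discrepancy
    tokens (candidate reasoning pivots) contribute more to the loss.
    The 1 + alpha * normalized-KL weighting is a hypothetical choice."""
    p_t = softmax(teacher_logits)
    p_s = softmax(student_logits)
    kl = (p_t * (np.log(p_t + 1e-9) - np.log(p_s + 1e-9))).sum(axis=-1)  # shape (T,)
    w = 1.0 + alpha * kl / (kl.mean() + 1e-9)  # amplify pivot tokens
    return float((w * kl).mean())

def dg_limit_smooth(student_logits, entropy_threshold=2.0, radius=1):
    """DG-Limit smoothing sketch: replace the logits of high-entropy
    (unstable) student tokens with a +/-radius neighborhood average
    before the KD loss is computed. Threshold and radius are assumptions."""
    p = softmax(student_logits)
    entropy = -(p * np.log(p + 1e-9)).sum(axis=-1)
    smoothed = student_logits.copy()
    for t in np.where(entropy > entropy_threshold)[0]:
        lo, hi = max(0, t - radius), min(len(student_logits), t + radius + 1)
        smoothed[t] = student_logits[lo:hi].mean(axis=0)
    return smoothed

# Toy example: 8 token positions over a 16-symbol vocabulary.
rng = np.random.default_rng(0)
teacher = rng.normal(size=(8, 16))
student = rng.normal(size=(8, 16))
loss = discrepancy_weighted_kd(dg_limit_smooth(student), teacher)
```

In an actual training loop these would operate on the student's and teacher's per-position logits over the full vocabulary; the toy arrays here only demonstrate the tensor shapes involved.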
Additionally, it uses proof-weighted cross-entropy, giving higher importance to tokens within the derivation span (from Proof: to Final Answer:), with emphasis decaying from 2.5x to 1.5x during training. The model was trained on 6,122 STEM chain-of-thought samples from 10 domain-specific datasets.
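A minimal sketch of the proof-weighted cross-entropy described above, assuming the emphasis factor decays linearly from 2.5x to 1.5x over training (the decay schedule is not specified on this card, so the linear form and the helper names are assumptions):

```python
import numpy as np

def proof_weighted_ce(token_nll, in_proof_span, step, total_steps):
    """Cross-entropy with extra weight on derivation-span tokens
    (those between "Proof:" and "Final Answer:"). The emphasis factor
    decays from 2.5x to 1.5x; a linear schedule is assumed here."""
    progress = min(step / total_steps, 1.0)
    emphasis = 2.5 - (2.5 - 1.5) * progress
    weights = np.where(in_proof_span, emphasis, 1.0)
    return float((weights * token_nll).sum() / weights.sum())

# Toy example: 4 tokens, the middle two lie inside the derivation span.
nll = np.array([1.0, 2.0, 0.5, 3.0])          # per-token negative log-likelihoods
in_span = np.array([False, True, True, False])
early = proof_weighted_ce(nll, in_span, step=0, total_steps=1000)     # 2.5x emphasis
late = proof_weighted_ce(nll, in_span, step=1000, total_steps=1000)   # 1.5x emphasis
```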
Intended Uses
- Mathematical derivations and worked solutions
- Proof-style explanations
- Physics and engineering problem-solving
- Educational tutoring and STEM walkthroughs
- Lightweight reasoning deployment where larger models are too expensive
- Generator components in verifier-generator or retrieval-augmented reasoning systems