reaperdoesntknow/Qwen3-1.7B-Coder-Distilled-SFT
reaperdoesntknow/Qwen3-1.7B-Coder-Distilled-SFT is a 1.7 billion parameter Qwen3-based causal language model developed by Convergent Intelligence LLC: Research Division. It is uniquely trained through knowledge distillation from a 30B Coder teacher and fine-tuned on logical inference problems, specializing in structured reasoning, STEM derivation, and formal propositional logic. This model excels at tasks requiring precise sequential logic and compositional decomposition, making it suitable for logical inference and structured argumentation.
Loading preview...
Overview
reaperdoesntknow/Qwen3-1.7B-Coder-Distilled-SFT is a 1.7 billion parameter Qwen3-based model from Convergent Intelligence LLC: Research Division, designed for advanced logical inference and structured reasoning. Its unique two-stage training process involves knowledge distillation from a 30B Qwen3-Coder teacher, imparting a strong STEM reasoning backbone, followed by supervised fine-tuning on over 54,600 logical inference problems. This methodology aims to explicitly activate latent sequential logic, state tracking, and compositional reasoning capabilities derived from the Coder teacher.
Key Capabilities
- Structured Reasoning: Inherits precise sequential logic and compositional decomposition patterns from a coding-specialized teacher model.
- STEM Derivation: Trained on 6,122 STEM chain-of-thought samples across 12 domains (Physics, Linear Algebra, etc.) to perform rigorous derivations.
- Logical Inference: Fine-tuned on a dataset of ~54,607 instruction-response pairs covering propositional logic and formal inference, enabling explicit logical reasoning.
- Efficient Size: A 1.7B parameter model offering specialized reasoning capabilities, making it suitable for edge deployment via GGUF quantizations.
Good for
- Logical inference and propositional logic tasks.
- Formal reasoning and structured argumentation.
- STEM derivation and educational tutoring applications.
- Use as a component in verification pipelines.
- Edge deployment scenarios where model size is critical.