reaperdoesntknow/Dualmind-Qwen-1.7B-Thinking

TEXT GENERATIONConcurrency Cost:1Model Size:2BQuant:BF16Ctx Length:32kTool Calling:SupportedPublished:Mar 30, 2026License:apache-2.0Architecture:Transformer0.0K Open Weights Cold

The Dualmind-Qwen-1.7B-Thinking model, developed by Convergent Intelligence LLC: Research Division, is a 2.03 billion parameter Qwen3ForCausalLM architecture with a 40,960 token context length. It is specifically fine-tuned using the DualMind SFT methodology on over 2.5 million tokens of Claude Opus 4.6 reasoning traces. This model excels at extended deliberation and self-correction, absorbing the nuanced reasoning patterns of a frontier model rather than just pattern completion, making it suitable for tasks requiring complex thought processes.

Loading preview...

Dualmind-Qwen-1.7B-Thinking: Deliberative Reasoning at 1.7B Parameters

This model, developed by Convergent Intelligence LLC: Research Division, is a 1.7 billion parameter Qwen3ForCausalLM (effectively 2.03B parameters) fine-tuned using the DualMind SFT methodology. It leverages over 2.5 million tokens from the Opus-4.6-Reasoning-3000x-filtered dataset, which consists of Claude Opus 4.6 reasoning traces.

Key Capabilities & Differentiators

  • Opus-level Reasoning: Unlike models trained on synthetic logic, this variant absorbs the "shape of deliberation" from Claude Opus 4.6, including self-correction, backtracking, and synthesizing multiple approaches.
  • DualMind SFT: Utilizes a specialized Supervised Fine-Tuning (SFT) approach to distill complex cognitive loops (explore → examine → respond) from a powerful teacher model.
  • Robust Base: Built upon the Disctil-Qwen3-1.7B base, which is already DISC-refined, providing a strong structural foundation.
  • Extended Context: Supports a maximum context length of 40,960 tokens, allowing for long, multi-phase reasoning outputs.
  • Mathematical Foundations: Grounded in Discrepancy Calculus (DISC), a measure-theoretic framework for analyzing singularities in functions, which informs the distillation process.

Ideal Use Cases

  • Complex Problem Solving: Suited for tasks requiring nuanced, multi-step reasoning and self-correction.
  • Cognitive Simulation: Applications where emulating deliberative thought processes and uncertainty navigation is beneficial.
  • Research & Development: Exploring emergent reasoning capabilities in smaller models through advanced distillation techniques.