ansulev/Darwin-28B-REASON
ansulev/Darwin-28B-REASON is a 27.6-billion-parameter reasoning-enhanced language model developed by the FINAL-Bench / Darwin Research Team and derived from Darwin-28B-Opus. It uses Reasoning-Trace Distillation (RTD) and, when paired with the proprietary Darwin-DELPHI test-time engine, achieves 89.39% on the GPQA Diamond benchmark. The model is optimized for graduate-level STEM reasoning, mathematical problem-solving, and complex multi-step chain-of-thought tasks, and supports a 262,144-token context length.
Darwin-28B-REASON: Enhanced Scientific Reasoning Model
Darwin-28B-REASON is a 27.6-billion-parameter standalone model in the Darwin family by the FINAL-Bench / Darwin Research Team, engineered for advanced reasoning tasks. It is built on the Darwin-28B-Opus base model and combines two core techniques, Reasoning-Trace Distillation (RTD) and the Darwin-DELPHI test-time engine, described below.
Key Capabilities & Techniques
- Reasoning-Trace Distillation (RTD): Distills complete reasoning chains from a mathematical corpus into the model, improving its handling of long-form, multi-step scientific reasoning while maintaining multilingual support (English, Korean, Chinese, Japanese).
- Darwin-DELPHI Test-Time Engine: A proprietary inference-time engine that performs multi-sample cross-validation, re-examination of uncertain responses, and iterative self-critique to converge on a consensus answer. The engine is not stored in the model weights, but the model's peak reported performance depends on it.
- GPQA Diamond Performance: Achieves 89.39% on the GPQA Diamond benchmark (198 PhD-level science questions) when combined with Darwin-DELPHI.
- Large Context Window: Supports a context length of 262,144 tokens, which leaves room for long reasoning chains.
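Darwin-DELPHI itself is proprietary and its internals are not described here, so the following is only a generic self-consistency sketch of the "multi-sample, re-examine when uncertain, converge on a consensus" idea, not the engine's actual algorithm. The names `consensus_answer`, `make_stub_sampler`, `sample_fn`, `n_samples`, `agreement`, and `max_rounds` are all hypothetical, and the stub sampler stands in for a real model call.

```python
from collections import Counter
from typing import Callable, Iterable, Tuple


def consensus_answer(
    sample_fn: Callable[[str], str],
    question: str,
    n_samples: int = 5,
    agreement: float = 0.6,
    max_rounds: int = 3,
) -> Tuple[str, float]:
    """Majority vote over sampled answers, resampling while agreement is low.

    Generic self-consistency voting (an assumption for illustration),
    NOT the proprietary Darwin-DELPHI procedure.
    """
    if n_samples < 1 or max_rounds < 1:
        raise ValueError("n_samples and max_rounds must be at least 1")

    votes: Counter = Counter()
    answer, share = "", 0.0
    for _round in range(max_rounds):
        # Draw a fresh batch of answers and add them to the running tally.
        for _ in range(n_samples):
            votes[sample_fn(question).strip().upper()] += 1
        answer, count = votes.most_common(1)[0]
        share = count / sum(votes.values())
        # Stop early once a large enough fraction of samples agree.
        if share >= agreement:
            break
    return answer, share


def make_stub_sampler(answers: Iterable[str]) -> Callable[[str], str]:
    """Hypothetical stand-in for a model call: replays a fixed answer list."""
    it = iter(answers)
    return lambda _question: next(it)


# Example: three of five samples agree, so one round is enough.
sampler = make_stub_sampler(["B", "a", "B", "C", "B"])
result = consensus_answer(sampler, "Which option is correct?")
```

In a real setup `sample_fn` would wrap a sampled generation from the model plus answer extraction; the voting logic stays the same.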
Recommended Use Cases
- Graduate-level STEM reasoning and science qualifying exams.
- Mathematical problem-solving (e.g., MATH, AIME-style problems).
- Complex multi-step chain-of-thought tasks.
- Code generation and debugging.
- Multilingual reasoning, with primary support for English and Korean and secondary support for Chinese and Japanese.
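For long chain-of-thought runs, it can help to check that the prompt and the requested output fit inside the 262,144-token window stated above. The sketch below assumes, as is usual for decoder-only models, that prompt and generated tokens share one window. The helper name `max_new_tokens_for` and its parameters are hypothetical, and the prompt token count would come from whichever tokenizer is used with the model.

```python
CONTEXT_LENGTH = 262_144  # context window stated in the model card


def max_new_tokens_for(
    prompt_tokens: int,
    requested: int,
    context_length: int = CONTEXT_LENGTH,
) -> int:
    """Clamp a generation budget so prompt + output fit in the window.

    Assumes prompt and generated tokens share a single context window.
    """
    if prompt_tokens < 0 or requested < 0:
        raise ValueError("token counts must be non-negative")
    remaining = context_length - prompt_tokens
    if remaining <= 0:
        raise ValueError("prompt leaves no room for generation")
    return min(requested, remaining)


# A 200,000-token prompt leaves 62,144 tokens for the reasoning trace.
budget = max_new_tokens_for(prompt_tokens=200_000, requested=100_000)
```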