FINAL-Bench/Darwin-4B-Opus

Tags: Vision, Open Weights, Cold
Concurrency Cost: 1
Model Size: 7.9B
Quant: FP8
Ctx Length: 32k
Tool Calling: Supported
Published: Apr 8, 2026
License: apache-2.0
Architecture: Transformer

Darwin-4B-Opus is a 7.9 billion parameter Mixture-of-Experts (MoE) language model developed by VIDRAFT and built on the Gemma 4 Expert 4B architecture. It uses the Darwin V6 diagnostic-guided evolutionary merge engine to integrate reasoning capabilities distilled from Claude 4.6 Opus. The model has a 128K context window and supports over 140 languages, which makes it suitable for reasoning-intensive tasks in resource-constrained environments and edge deployments.

Darwin-4B-Opus: Reasoning-Enhanced MoE Model

Darwin-4B-Opus is a 7.9 billion parameter Mixture-of-Experts (MoE) model developed by VIDRAFT on the Gemma 4 Expert 4B architecture. Its distinguishing feature is the Darwin V6 engine, a merging approach that diagnoses the parent models at the tensor level and assigns each tensor its own independently optimized merge ratio, unlike conventional merging tools. The merge integrates high-effort reasoning distilled from Claude 4.6 Opus, which strengthens the model's capabilities in code, science, and analysis.
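
To make the idea of tensor-level ratios concrete, here is a minimal sketch that contrasts a single global merge ratio with one ratio per tensor. It is an illustration only and not the Darwin V6 algorithm: the linear interpolation, the tensor names, the ratio values, and the helper functions `merge_uniform` and `merge_per_tensor` are all assumptions made for this example, and flat Python lists stand in for real weight tensors.

```python
from typing import Dict, List

# A flat list of floats stands in for a real weight tensor (assumption).
Tensor = List[float]
StateDict = Dict[str, Tensor]


def lerp(a: Tensor, b: Tensor, ratio: float) -> Tensor:
    """Linear interpolation: ratio=0 keeps parent A, ratio=1 keeps parent B."""
    return [(1 - ratio) * x + ratio * y for x, y in zip(a, b)]


def merge_uniform(a: StateDict, b: StateDict, ratio: float) -> StateDict:
    """Baseline for comparison: one global ratio applied to every tensor."""
    return {name: lerp(a[name], b[name], ratio) for name in a}


def merge_per_tensor(
    a: StateDict, b: StateDict, ratios: Dict[str, float], default: float = 0.5
) -> StateDict:
    """Each tensor gets its own ratio; tensors without one use `default`."""
    return {name: lerp(a[name], b[name], ratios.get(name, default)) for name in a}


# Hypothetical parent weights and hand-picked ratios, for illustration only.
parent_a = {"attn.q_proj": [1.0, 2.0], "ffn.up_proj": [0.0, 4.0]}
parent_b = {"attn.q_proj": [3.0, 6.0], "ffn.up_proj": [2.0, 0.0]}
ratios = {"attn.q_proj": 0.25, "ffn.up_proj": 0.75}

merged = merge_per_tensor(parent_a, parent_b, ratios)
print(merged)
print(merge_uniform(parent_a, parent_b, 0.5))
```

In this sketch the ratios are supplied by hand. In the approach described above, they would come from the engine's tensor-level diagnosis of the parent models.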

Key Capabilities & Features

  • Advanced Merging: Employs Darwin V6 for diagnostic-guided evolutionary merging, analyzing tensor-level importance and functional impact.
  • Reasoning Enhancement: Distills Claude Opus-level reasoning, which the Model Diagnostic Scan (MDS) finds concentrated in the FFN and final layers.
  • Efficient Architecture: Built on a Gemma 4 Expert 4B MoE base (7.9 billion parameters in total), offering strong performance in an efficient 4B-class configuration.
  • Extensive Context & Language Support: Features a 128K context window and supports over 140 languages.
  • Benchmark Performance: Achieves 82.92% on ARC-Challenge (zero-shot, loglikelihood).
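
For readers unfamiliar with loglikelihood-based scoring, the sketch below shows the general idea behind a zero-shot multiple-choice evaluation: each answer choice is scored by the log-probability the model assigns to it given the question, and the highest-scoring choice is taken as the prediction. The `toy_loglikelihood` function, its hard-coded scores, and the two questions are hypothetical stand-ins for a real model and dataset. Evaluation harnesses differ in details such as prompt formatting and length normalization, so this is not necessarily the exact procedure behind the reported score.

```python
from typing import Callable, List, Sequence, Tuple

# (question, answer choices, index of the correct choice)
Item = Tuple[str, Sequence[str], int]
Scorer = Callable[[str, str], float]


def pick_answer(question: str, choices: Sequence[str], loglikelihood: Scorer) -> int:
    """Return the index of the choice with the highest log-likelihood."""
    scores = [loglikelihood(question, choice) for choice in choices]
    return max(range(len(choices)), key=lambda i: scores[i])


def accuracy(items: List[Item], loglikelihood: Scorer) -> float:
    """Fraction of items where the top-scoring choice is the correct one."""
    correct = sum(
        pick_answer(question, choices, loglikelihood) == gold
        for question, choices, gold in items
    )
    return correct / len(items)


# Hypothetical scores standing in for a real model's log-probabilities.
Q1 = "Which gas do plants absorb for photosynthesis?"
Q2 = "What force pulls objects toward Earth?"
TOY_LOGPROBS = {
    (Q1, "oxygen"): -3.1,
    (Q1, "carbon dioxide"): -0.9,
    (Q1, "nitrogen"): -4.0,
    (Q2, "magnetism"): -1.5,  # the toy scorer prefers this wrong answer
    (Q2, "gravity"): -2.0,
    (Q2, "friction"): -3.3,
}


def toy_loglikelihood(context: str, continuation: str) -> float:
    return TOY_LOGPROBS[(context, continuation)]


ITEMS: List[Item] = [
    (Q1, ["oxygen", "carbon dioxide", "nitrogen"], 1),
    (Q2, ["magnetism", "gravity", "friction"], 1),
]

print(accuracy(ITEMS, toy_loglikelihood))
```

Zero-shot here means the question is scored without any worked examples placed before it in the prompt.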

Ideal Use Cases

  • Resource-Constrained Environments: Designed for efficient deployment on single consumer GPUs, including NVIDIA RTX 4080/3090/4090 and T4.
  • Edge Deployment & Rapid Prototyping: Its efficient 4B MoE architecture makes it suitable for applications requiring local inference.
  • Reasoning-Intensive Tasks: Excels in scenarios demanding strong analytical, scientific, and coding reasoning, benefiting from Claude Opus distillation.
  • Multilingual Applications: Broad language support enables diverse global use cases.
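
The single-GPU claim can be sanity-checked with back-of-the-envelope arithmetic: weight memory is roughly the parameter count times the bytes per parameter. The sketch below applies this to the 7.9B parameter count and FP8 quantization listed for this model. The memory sizes are the standard configurations of the named cards, while the 2 GB headroom for KV cache, activations, and runtime overhead is an arbitrary assumption chosen for illustration. Actual usage depends on context length, batch size, and the serving stack.

```python
def weight_memory_gb(num_params: float, bytes_per_param: float) -> float:
    """Approximate memory for the weights alone, in decimal gigabytes."""
    return num_params * bytes_per_param / 1e9


def fits(gpu_gb: float, weights_gb: float, headroom_gb: float = 2.0) -> bool:
    """True if the weights plus an assumed headroom fit in GPU memory."""
    return weights_gb + headroom_gb <= gpu_gb


NUM_PARAMS = 7.9e9  # model size listed for Darwin-4B-Opus
BYTES_PER_PARAM = {"FP8": 1, "BF16": 2}

# Standard memory configurations of the GPUs named above.
GPU_MEMORY_GB = {"T4": 16, "RTX 4080": 16, "RTX 3090": 24, "RTX 4090": 24}

for precision, nbytes in BYTES_PER_PARAM.items():
    weights_gb = weight_memory_gb(NUM_PARAMS, nbytes)
    for gpu, memory_gb in GPU_MEMORY_GB.items():
        verdict = "fits" if fits(memory_gb, weights_gb) else "does not fit"
        print(f"{precision}: {weights_gb:.1f} GB of weights {verdict} on {gpu} ({memory_gb} GB)")
```

Under these assumptions the FP8 weights (about 7.9 GB) leave room on all four cards, whereas 16-bit weights (about 15.8 GB) would nearly fill a 16 GB card before any KV cache is allocated.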