FINAL-Bench/Darwin-4B-David
Darwin-4B-David is the first second-generation Darwin model developed by VIDRAFT, with 7.9 billion parameters (4.5B effective) and a 32768-token context length. Built on the Gemma 4 E4B dense architecture, it is a recursive evolutionary merge that scores 85.0% on GPQA Diamond, a result that rivals 31B-class models at a fraction of the parameter count. It is optimized for generative reasoning, creative tasks, and thinking-mode inference, which makes it suited to complex analytical applications.
Overview of Darwin-4B-David
Darwin-4B-David is the first second-generation model in the Darwin series and embodies an "evolution of evolution" concept. Developed by VIDRAFT, this 7.9B-parameter (4.5B effective) model with a 32768-token context is a recursive merge of an already-evolved model (Darwin-4B-Opus) and DavidAU's DECKARD-Expresso-Universe. The merge was produced with the Darwin V6 engine, whose MRI-guided evolution automates the merging process and optimizes layer-specific weight ratios.
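The idea of layer-specific weight ratios can be illustrated with a generic sketch of layer-wise linear interpolation between two parent checkpoints. This is not the Darwin V6 engine, whose internals are not described here; the function `merge_layerwise`, the tensor names, the ratios, and the toy values are all hypothetical, and plain Python lists stand in for real tensors.

```python
from typing import Dict, List

# Toy stand-in for a checkpoint: tensor name -> flat list of weights.
Weights = Dict[str, List[float]]


def merge_layerwise(
    parent_a: Weights,
    parent_b: Weights,
    ratios: Dict[str, float],
    default_ratio: float = 0.5,
) -> Weights:
    """Interpolate each tensor separately.

    ratios[name] is the share taken from parent_a for that tensor;
    the remainder (1 - ratio) comes from parent_b.
    """
    if parent_a.keys() != parent_b.keys():
        raise ValueError("parents must expose the same tensor names")

    merged: Weights = {}
    for name, a_vals in parent_a.items():
        b_vals = parent_b[name]
        if len(a_vals) != len(b_vals):
            raise ValueError(f"size mismatch for tensor {name!r}")
        r = ratios.get(name, default_ratio)
        merged[name] = [r * a + (1.0 - r) * b for a, b in zip(a_vals, b_vals)]
    return merged


# Hypothetical two-layer parents (values chosen only for illustration).
parent_opus = {"layer0": [1.0, 2.0], "layer1": [0.0, 4.0]}
parent_david = {"layer0": [3.0, 6.0], "layer1": [2.0, 0.0]}

# Hypothetical per-layer ratios; an evolutionary search would tune these.
layer_ratios = {"layer0": 0.75, "layer1": 0.25}

merged = merge_layerwise(parent_opus, parent_david, layer_ratios)
print(merged)
```

In an automated setup, a search procedure would propose many such ratio dictionaries, evaluate each merged model, and keep the best candidates; the sketch shows only the merge step for one candidate.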
Key Capabilities & Differentiators
- Exceptional Reasoning Performance: Achieves 85.0% on GPQA Diamond (graduate-level scientific reasoning) using generative evaluation with thinking mode, an improvement of 26.4 percentage points over the original Gemma-4-E4B-it. This performance rivals 31B-class models with only 4.5B effective parameters.
- Recursive Evolution: Demonstrates the viability of systematically evolving models from other evolved models, a novel approach in open-source merging.
- Multimodal Preservation: Maintains the vision (150M) and audio (300M) encoder capabilities from its base architecture, allowing for image, video, and audio input.
- Thinking Mode: Designed to use a thinking mode for enhanced chain-of-thought reasoning, crucial for complex problem-solving.
- Apache 2.0 License: Offers commercial-friendly deployment, including on edge devices like Jetson Orin NX 16GB.
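The GPQA Diamond figures in the list above imply a baseline score that is not stated directly. The short calculation below derives it from the two reported numbers; the variable names are illustrative, and the derived baseline and relative gain are arithmetic consequences of the reported figures, not separately reported results.

```python
# Reported figures from the capability list above.
reported_score_pct = 85.0   # Darwin-4B-David on GPQA Diamond
reported_gain_pp = 26.4     # gain over Gemma-4-E4B-it, in percentage points

# Implied baseline: the merged model's score minus the absolute gain.
implied_baseline_pct = round(reported_score_pct - reported_gain_pp, 1)

# Relative improvement, which differs from the absolute gain in points.
relative_gain_pct = round(100.0 * reported_gain_pp / implied_baseline_pct, 1)

print(f"implied baseline: {implied_baseline_pct}%")
print(f"relative gain:    {relative_gain_pct}%")
```

The distinction matters when comparing models: a gain of 26.4 percentage points on a baseline near 58.6% is a relative improvement of roughly 45%, not 26.4%.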
Good For
- Applications requiring advanced reasoning and analytical capabilities, especially in scientific or complex domains.
- Use cases where parameter efficiency and strong generative performance are critical.
- Edge deployment scenarios due to its optimized architecture and Apache 2.0 license.
- Developers interested in exploring recursive model evolution and advanced merging techniques.
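For the edge deployment scenario, a rough weights-only memory estimate helps judge what precision is practical on a 16 GB device. The sketch below uses the 7.9B total and 4.5B effective parameter counts given above; the precision options are assumptions for illustration, and the estimate ignores the KV cache, activations, and runtime overhead, so real usage will be higher.

```python
# Parameter counts stated in the model description.
TOTAL_PARAMS = 7_900_000_000
EFFECTIVE_PARAMS = 4_500_000_000

# Assumed storage precisions (bits per weight); illustrative only.
PRECISIONS = {"bf16": 16, "int8": 8, "int4": 4}


def weight_gb(num_params: int, bits_per_weight: int) -> float:
    """Weights-only footprint in gigabytes (1 GB = 1e9 bytes)."""
    return num_params * bits_per_weight / 8 / 1e9


estimates = {
    name: {
        "total": round(weight_gb(TOTAL_PARAMS, bits), 2),
        "effective": round(weight_gb(EFFECTIVE_PARAMS, bits), 2),
    }
    for name, bits in PRECISIONS.items()
}

for name, sizes in estimates.items():
    print(f"{name}: total {sizes['total']} GB, effective {sizes['effective']} GB")
```

Under these assumptions, the full 7.9B weights at 16-bit precision come to about 15.8 GB, which leaves little headroom on a 16 GB device, so quantized weights or loading only the components needed for a given task are likely to be necessary in practice.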