DatarusAI/Datarus-R1-14B-preview

14.8B
FP8
131072
License: apache-2.0
Overview

Datarus-R1-14B-Preview: Advanced Reasoning for Data Analysis

Datarus-R1-14B-Preview is a 14.8-billion-parameter language model, fine-tuned from Qwen2.5-14B-Instruct and built by DatarusAI to act as a virtual data analyst and graduate-level problem solver. Unlike models trained on isolated question-answer pairs, Datarus learns from complete analytical trajectories that include reasoning, code execution, error handling, and self-correction, all written in a ReAct-style notebook format.

Key Capabilities & Differentiators

  • State-of-the-art efficiency: Outperforms similar-sized models and competes with 32B+ models while using 18-49% fewer tokens.
  • Dual reasoning interfaces: Supports both Agentic (ReAct) mode for interactive analysis with iterative code execution and Reflection (CoT) mode for concise documentation and self-contained reasoning chains.
  • Superior performance: Achieves up to 30% higher accuracy on AIME 2024/2025 and LiveCodeBench, demonstrating strong capabilities in complex problem-solving.
  • "AHA-moment" pattern: Exhibits efficient hypothesis refinement, typically in 1-2 iterations, effectively avoiding circular reasoning loops.
  • Specialized training: Trained on 144,000 synthetic analytical trajectories across diverse quantitative domains (finance, medicine, numerical analysis), together with curated reasoning datasets.
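
The agentic (ReAct) mode described above alternates between model output, code execution, and feeding the result back to the model. The following is a minimal sketch of such a loop, not the Datarus implementation: the `<code>` and `<answer>` tags, the helper names `run_python` and `react_loop`, and the `scripted_model` stand-in are all hypothetical and chosen for illustration only. The real model's output format may differ, and `exec` is used here without any sandboxing.

```python
import contextlib
import io
import re
from typing import Callable, Optional

# Hypothetical markers for illustration; the real Datarus format may differ.
CODE_RE = re.compile(r"<code>(.*?)</code>", re.DOTALL)
ANSWER_RE = re.compile(r"<answer>(.*?)</answer>", re.DOTALL)


def run_python(source: str, namespace: dict) -> str:
    """Run source in a shared namespace; return captured stdout plus any error text."""
    buffer = io.StringIO()
    try:
        with contextlib.redirect_stdout(buffer):
            exec(source, namespace)  # unsandboxed: only acceptable in a toy sketch
    except Exception as exc:
        return buffer.getvalue() + f"{type(exc).__name__}: {exc}"
    return buffer.getvalue()


def react_loop(generate: Callable[[str], str], question: str,
               max_steps: int = 4) -> Optional[str]:
    """Alternate model turns and code execution until an answer appears."""
    transcript = [f"Question: {question}"]
    namespace: dict = {}  # persists across steps, like notebook cells
    for _ in range(max_steps):
        reply = generate("\n".join(transcript))
        transcript.append(reply)
        answer = ANSWER_RE.search(reply)
        if answer:
            return answer.group(1).strip()
        code = CODE_RE.search(reply)
        if code:
            observation = run_python(code.group(1), namespace)
            transcript.append(f"Observation: {observation.strip()}")
    return None  # step budget exhausted without an answer


def scripted_model(prompt: str) -> str:
    """Stand-in for the model: one buggy attempt, one correction, then an answer."""
    if "Observation: 2.5" in prompt:
        return "<answer>2.5</answer>"
    if "NameError" in prompt:
        return ("The import was missing.\n"
                "<code>import statistics\n"
                "print(statistics.mean([1, 2, 3, 4]))</code>")
    return "Compute the mean.\n<code>print(statistics.mean([1, 2, 3, 4]))</code>"


if __name__ == "__main__":
    print(react_loop(scripted_model, "What is the mean of 1, 2, 3, 4?"))
```

The scripted stand-in fails once with a `NameError`, sees the error in the next prompt, and corrects itself, which mirrors the error-handling and self-correction behaviour the model card attributes to the training trajectories. In practice `generate` would wrap a call to the actual model, and the code would run in an isolated interpreter.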

Intended Use Cases

  • Data Analysis: Automated data exploration, statistical analysis, and visualization.
  • Mathematical Problem Solving: Graduate-level mathematics and competition problems at the AIME level.
  • Code Generation: Creating analytical scripts and solving programming tasks.
  • Scientific Reasoning: Complex problem-solving in physics, chemistry, and other sciences.
  • Interactive Notebooks: Building complete analysis notebooks with iterative refinement.