Ouro-1.4B-Thinking: A Reasoning-Specialized LLM

Ouro-1.4B-Thinking, developed by ByteDance, is a 1.4-billion-parameter language model fine-tuned for advanced reasoning tasks. It is a variant of the Ouro-1.4B base model, enhanced through supervised fine-tuning on high-quality reasoning datasets. The model is intended for research use and emphasizes an explicit thinking process.

Key Capabilities & Features

Advanced Reasoning: Optimized for mathematical and scientific problem-solving.
Compact Efficiency: Achieves reasoning performance competitive with 4-billion-parameter models despite its smaller size.
Cross-Step Consistency: Outputs from intermediate recurrent steps are reliable indicators of the final answer.
Explicit Thinking Process: Generates detailed reasoning steps, making its problem-solving transparent.
Configurable Recurrent Steps: Users can set the number of recurrent steps (total_ut_steps) and the threshold of the adaptive early-exit mechanism (early_exit_threshold) in config.json to balance performance against computation.
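
As a minimal sketch of the last item, the two settings could be changed by editing a local copy of config.json with the Python standard library. The key names total_ut_steps and early_exit_threshold come from the description above; the helper name set_recurrence, the file path, and the example values are assumptions made for illustration, not recommended settings.

```python
import json
from pathlib import Path


def set_recurrence(config_path, total_ut_steps, early_exit_threshold):
    """Update the two recurrence settings in a local config.json.

    Returns the updated configuration as a dict. All other keys in the
    file are left unchanged.
    """
    path = Path(config_path)
    config = json.loads(path.read_text(encoding="utf-8"))

    # Key names are taken from the feature list; values are caller-supplied.
    config["total_ut_steps"] = total_ut_steps
    config["early_exit_threshold"] = early_exit_threshold

    path.write_text(json.dumps(config, indent=2) + "\n", encoding="utf-8")
    return config
```

A call might look like set_recurrence("path/to/local/config.json", 3, 0.9), where both the path and the values are placeholders. Which values give the best trade-off is not stated here, so any choice should be checked against the full model card.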

Training Details

The model underwent supervised fine-tuning on approximately 8.3 million examples: 3.5M mathematics, 3.2M code, 808K science, and 767K chat. It was trained for 2 epochs with a maximum sequence length of 32K tokens.
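
The stated counts can be checked with a few lines of arithmetic. The sketch below uses only the figures given above (with 808K and 767K written out in full); the variable names are illustrative. The four categories sum to 8,275,000 examples, which rounds to the quoted 8.3 million, and mathematics and code together make up roughly 81% of the mix.

```python
# Example counts per category, as stated in the training details.
sft_mix = {
    "mathematics": 3_500_000,
    "code": 3_200_000,
    "science": 808_000,
    "chat": 767_000,
}

total = sum(sft_mix.values())

# Fraction of the fine-tuning data contributed by each category.
shares = {name: count / total for name, count in sft_mix.items()}

for name, count in sft_mix.items():
    print(f"{name:<12}{count:>10,}  {shares[name]:6.1%}")
print(f"{'total':<12}{total:>10,}")
```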

When to Use This Model

This model is well suited to applications that require strong logical deduction and step-by-step problem-solving, especially in mathematical and scientific domains where an explicit reasoning path is useful. Its compact size makes it a practical choice for deployments where compute or memory is limited.
