Marco-o1: Open Reasoning Model for Open-Ended Solutions
Marco-o1 is a 7.6-billion-parameter large language model developed by the MarcoPolo Team at AI Business, Alibaba International Digital Commerce. Inspired by OpenAI's o1 model, Marco-o1 targets open-ended solutions and complex real-world problem-solving, extending beyond disciplines that have standard, easily verifiable answers. It integrates several advanced techniques to enhance its reasoning capabilities.
Key Capabilities & Features
- CoT Fine-Tuning: Applies full-parameter fine-tuning to a base model using open-source chain-of-thought (CoT) datasets together with self-developed synthetic data, producing the Marco-o1-CoT variant.
- Solution Space Expansion via MCTS: Incorporates Monte Carlo Tree Search (MCTS) to guide the search and expand the solution space, using the model's output confidence as the reward signal.
- Reasoning Action Strategy & Reflection: Implements reasoning action strategies that vary the granularity of MCTS actions, from full reasoning steps down to finer "mini-steps", plus a reflection mechanism (Marco-o1-MCTS Mini-Step) to improve search efficiency and accuracy.
- Multilingual Application: Explores inference-time scaling in multilingual and translation domains, demonstrating a strong grasp of colloquial nuances in translation tasks (e.g., Chinese slang).
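The confidence-guided reward used to score MCTS rollouts can be sketched as follows. This is a minimal illustration, assuming (as the Marco-o1 report describes) that each generated token's confidence is its softmax share among the top-k candidate log-probabilities at that step, and that a rollout's value is the mean confidence across its tokens; the function names are illustrative, not the team's actual API:

```python
import math


def token_confidence(top_logprobs):
    """Confidence of the chosen token at one generation step.

    top_logprobs: log-probabilities of the top-k candidates at this step,
    with index 0 being the token actually chosen. The confidence is the
    chosen token's softmax share among these k alternatives.
    """
    exps = [math.exp(lp) for lp in top_logprobs]
    return exps[0] / sum(exps)


def rollout_reward(per_step_top_logprobs):
    """Value assigned to an MCTS rollout: the average per-token
    confidence, standing in for an external reward model."""
    scores = [token_confidence(step) for step in per_step_top_logprobs]
    return sum(scores) / len(scores)
```

A rollout whose chosen tokens dominate their alternatives scores near 1, while one whose tokens are indistinguishable from the alternatives scores near 1/k, so the search preferentially expands branches the model itself is confident in.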
Performance Highlights
Marco-o1 has shown significant accuracy improvements over its base model, including +6.17% on MGSM (English) and +5.60% on MGSM (Chinese), demonstrating enhanced reasoning capabilities. It also excels at nuanced machine translation.
Limitations
While inspired by OpenAI's o1, the developers acknowledge that Marco-o1 currently exhibits o1-like reasoning characteristics rather than the performance of a fully realized o1-class model, and optimization efforts are ongoing.