Mathoctopus/Cross_13B: Multilingual Mathematical Reasoning
Mathoctopus/Cross_13B is a 13-billion-parameter model in the MathOctopus series, built on the LLaMA 2 architecture and designed for multilingual mathematical problem solving. It is part of a research initiative aimed at breaking language barriers in mathematical reasoning.
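For experimentation, the model can be loaded with the Hugging Face transformers library. The snippet below is a minimal sketch: it assumes the checkpoint is published on the Hub under the ID Mathoctopus/Cross_13B and exposes the standard LLaMA 2 causal-LM interface; adjust the dtype and device settings for your hardware.

```python
# Minimal loading sketch. Assumes the checkpoint lives on the Hugging Face Hub
# under "Mathoctopus/Cross_13B" and uses the standard LLaMA 2 causal-LM layout.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Mathoctopus/Cross_13B"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,  # 13B weights in fp16 need roughly 26 GB of memory
    device_map="auto",          # requires the `accelerate` package
)
```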
Key Capabilities
- Multilingual Math Proficiency: Trained on the MGSM8KInstruct dataset, which covers ten languages (English, Swahili, Chinese, Bengali, German, Spanish, French, Japanese, Russian, Thai), enabling robust performance across diverse linguistic contexts (see the inference example after this list).
- Superior Performance: Outperforms conventional open-source LLMs and surpasses ChatGPT on few-shot multilingual mathematical reasoning tasks.
- Cross-Training Strategy: This variant (the "Cross" in its name) was fine-tuned with the series' cross-training strategy, which underpins its specialized multilingual math capabilities.
- Research-Oriented: Primarily intended for research in educational software, tutoring systems, and other applications that require accurate solutions to mathematical problems.
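As a concrete illustration of the multilingual capability, the sketch below (continuing from the loading snippet above) sends a Spanish grade-school word problem to the model. The Alpaca-style instruction template and the example problem are assumptions for illustration, not the confirmed training format; consult the MathOctopus repository for the exact prompt template used during fine-tuning.

```python
# Continuing from the loading snippet above. The Alpaca-style template below is
# an assumption; verify the actual prompt format in the MathOctopus repository.
prompt = (
    "Below is an instruction that describes a task. "
    "Write a response that appropriately completes the request.\n\n"
    "### Instruction:\n"
    "Juan tiene 12 manzanas y compra 8 más. ¿Cuántas manzanas tiene ahora?\n\n"
    "### Response:\n"
)
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=256, do_sample=False)
# Strip the prompt tokens and print only the generated solution.
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[1]:],
                       skip_special_tokens=True))
```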
Benchmarks
The MathOctopus^C 13B model posts the following scores:

| Benchmark | Overall | English |
|-----------|---------|---------|
| MGSM      | 39.9%   | 56.4%   |
| MSVAMP    | 47.1%   | 56.6%   |

These results highlight strong capability in both English and multilingual mathematical contexts, often surpassing its LLaMA 2 base-model counterparts.
Good for
- Developing educational software that requires solving math problems in multiple languages.
- Creating advanced tutoring systems with multilingual support.
- Research into large language models for mathematical reasoning and cross-lingual understanding.