laihuiyuan/mCoT

Text Generation · Concurrency Cost: 1 · Model Size: 7B · Quantization: FP8 · Context Length: 4k · Published: Mar 29, 2024 · License: apache-2.0 · Architecture: Transformer · Open Weights

laihuiyuan/mCoT is a 7-billion-parameter multilingual instruction-tuned language model based on Mistral-7B-v0.1, developed by Laihuiyuan. It is optimized for multilingual mathematical reasoning, trained on the mCoT-MATH dataset, which contains 6.3 million samples across 11 languages. The model maintains consistent reasoning quality across these languages, making it suitable for math problem-solving in multilingual settings.

Overview

laihuiyuan/mCoT is a 7 billion parameter language model built upon the Mistral-7B-v0.1 architecture, developed by Laihuiyuan. Its primary focus is multilingual mathematical reasoning, achieved through instruction tuning on the extensive mCoT-MATH dataset. This dataset comprises approximately 6.3 million samples covering 11 different languages, designed to enhance reasoning consistency across linguistic boundaries.
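
Getting started should follow the standard Hugging Face `transformers` loading path for Mistral-based checkpoints. The sketch below is illustrative rather than an official recipe: the bare "Question: ... Step-by-step answer:" prompt is an assumption, not the documented mCoT template.

```python
# A minimal usage sketch, assuming the checkpoint is hosted on the Hugging Face
# Hub as "laihuiyuan/mCoT" and loads with the standard transformers API.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "laihuiyuan/mCoT"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # assumption: bf16 is sufficient for the 7B weights
    device_map="auto",           # requires the `accelerate` package
)

# Illustrative prompt; the exact instruction format used in training may differ.
prompt = (
    "Question: A bakery sells 12 muffins per tray. If it bakes 7 trays, "
    "how many muffins does it sell in total?\n"
    "Step-by-step answer:"
)
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=256, do_sample=False)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

Greedy decoding (`do_sample=False`) is a reasonable default for math reasoning, where sampling tends to hurt final-answer accuracy.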

Key Capabilities

  • Multilingual Math Reasoning: Excels at solving mathematical problems in 11 diverse languages, including English, Chinese, Japanese, Spanish, French, German, Russian, Swahili, Bengali, Telugu, and Thai.
  • Reasoning Consistency: Engineered to maintain high reasoning performance across different languages, as evidenced by its strong results on benchmarks like MGSM and MSVAMP.
  • Instruction-Tuned: Benefits from instruction tuning, making it responsive to prompts that ask for step-by-step reasoning (see the prompt sketch after this list).
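
The exact instruction template used during mCoT's tuning is not reproduced here, so the wrapper below is an assumption: a plain request for step-by-step reasoning in the target language, in the style commonly used with instruction-tuned chain-of-thought models.

```python
# A hedged sketch of multilingual chain-of-thought prompting. The helper name
# and phrasing are illustrative assumptions, not the official mCoT template.
def build_cot_prompt(question: str, language: str = "English") -> str:
    """Ask for a step-by-step solution written in the target language."""
    return (
        f"Please solve the following math problem, explaining your "
        f"reasoning step by step in {language}.\n"
        f"Question: {question}\n"
        f"Answer:"
    )

# The same problem posed in two of the 11 covered languages.
print(build_cot_prompt("Janet has 3 apples and buys 5 more. How many does she have?"))
print(build_cot_prompt("Janet tiene 3 manzanas y compra 5 más. ¿Cuántas tiene?", "Spanish"))
```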

Performance Highlights

mCoT-7B demonstrates competitive performance on multilingual math benchmarks (a minimal scoring sketch follows the list):

  • MGSM Benchmark: Achieves scores such as 71.6% in English, 68.4% in Spanish, and 67.2% in Swahili, often outperforming other 7B and even some 13B models like WizardMath and MathOctopus.
  • MSVAMP Benchmark: Shows strong average performance of 57.7% across 10 languages, with individual language scores like 61.1% in German and 58.8% in Spanish.
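
Benchmarks like MGSM score a generation by comparing its final numeric answer to a gold label. The helper below is a minimal sketch of that convention; it assumes the model ends its chain of thought with the final number, which is common practice but not a documented guarantee for this checkpoint.

```python
import re

def extract_final_number(generation: str) -> str | None:
    """Return the last number in the generated text, if any."""
    numbers = re.findall(r"-?\d+(?:\.\d+)?", generation.replace(",", ""))
    return numbers[-1] if numbers else None

def accuracy(generations: list[str], gold: list[str]) -> float:
    """Exact-match accuracy of extracted final answers against gold labels."""
    hits = sum(extract_final_number(g) == a for g, a in zip(generations, gold))
    return hits / len(gold)

# Toy example: one correct and one incorrect output.
outs = [
    "12 muffins per tray times 7 trays is 84. The answer is 84.",
    "3 + 5 = 9.",  # arithmetic slip, scored as wrong
]
print(accuracy(outs, ["84", "8"]))  # -> 0.5
```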

Good For

  • Applications requiring accurate mathematical problem-solving in multiple languages.
  • Developers who need a 7B model that can handle math prompts across a wide range of languages.
  • Research into cross-lingual transfer and consistency in reasoning tasks.