GLauzza/Mille-Pensees
Mille-Pensées is a 7.6 billion parameter causal language model developed by GLauzza, fine-tuned from Qwen2.5-7B-Instruct. This model specializes in French mathematical reasoning, matching or outperforming Qwen2.5-Math-7B-Instruct on most French math benchmarks while also showing strong performance on English math and general benchmarks. It is designed for tasks requiring complex mathematical problem-solving and reasoning in French.
Mille-Pensées: French Math Reasoning Model
Mille-Pensées is a 7.6 billion parameter language model developed by GLauzza, fine-tuned from the Qwen2.5-7B-Instruct architecture. Its primary focus is on French mathematical reasoning, leveraging the Mille-Pensées-Dataset for specialized training.
Key Capabilities & Performance
- Superior French Math Reasoning: The model performs comparably to or better than Qwen2.5-Math-7B-Instruct on most French math benchmarks, with the distinct advantage of reasoning directly in French.
- Strong English Performance: It also demonstrates superior performance on English math and general benchmarks compared to its base model.
- Long Context: Supports a context length of 131,072 tokens, allowing it to process long problem statements and extended reasoning chains.
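
The sketch below shows one way to run the model with Hugging Face Transformers, assuming it is published on the Hub as `GLauzza/Mille-Pensees` and uses the Qwen2.5 chat template inherited from its base model; the prompt, sampling settings, and dtype choice are illustrative, not prescribed by the model card.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "GLauzza/Mille-Pensees"  # Hub id assumed from the repository name

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # ~15 GB of weights for 7.6B parameters in bf16
    device_map="auto",
)

# French math prompt; the model is expected to reason step by step in French.
messages = [
    {
        "role": "user",
        "content": "Résous pas à pas : quelle est la somme des 100 premiers entiers positifs ?",
    }
]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output_ids = model.generate(
    input_ids,
    max_new_tokens=2048,
    do_sample=True,
    temperature=0.6,
)
print(tokenizer.decode(output_ids[0][input_ids.shape[-1]:], skip_special_tokens=True))
```
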
Training Details
The model was trained for 3.16 epochs with a learning rate of 6e-5 and a maximum sequence length of 18,000 tokens. Math benchmarks were evaluated with vLLM and math-verify; general English benchmarks were evaluated with lm-evaluation-harness.
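
For context, the snippet below sketches what a vLLM + math-verify scoring loop could look like; the example problem, sampling settings, and `max_model_len` value are illustrative assumptions, not the exact evaluation setup used for the reported results.

```python
from vllm import LLM, SamplingParams
from math_verify import parse, verify

# Hypothetical (question, gold answer) pairs standing in for a benchmark file.
problems = [
    ("Combien vaut la somme des 100 premiers entiers positifs ?", "5050"),
]

llm = LLM(model="GLauzza/Mille-Pensees", max_model_len=18000)
params = SamplingParams(temperature=0.6, max_tokens=8192)

# In practice the chat template would be applied to each question first.
outputs = llm.generate([q for q, _ in problems], params)

correct = 0
for (question, gold), out in zip(problems, outputs):
    prediction = out.outputs[0].text
    # math-verify parses both strings and checks mathematical equivalence.
    if verify(parse(gold), parse(prediction)):
        correct += 1

print(f"accuracy: {correct / len(problems):.2%}")
```
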
Important Licensing Information
Developers must adhere to the licenses of the base model (Apache 2.0) and the datasets used for fine-tuning (MIT, Apache 2.0, CC-BY-4.0). A critical limitation stems from the AM-DeepSeek-R1-0528-Distilled subset, which strictly prohibits commercial use and limits applications to research purposes only.