akjindal53244/Arithmo-Mistral-7B

  • Task: Text Generation
  • Concurrency Cost: 1
  • Model Size: 7B
  • Quant: FP8
  • Ctx Length: 8k
  • Published: Oct 14, 2023
  • License: apache-2.0
  • Architecture: Transformer
  • Open Weights

Arithmo-Mistral-7B is a 7-billion-parameter language model developed by Ashvini Kumar Jindal and Ankur Parikh, fine-tuned from Mistral-7B. It specializes in mathematical reasoning: it can solve problems step by step and generate Python programs that compute the answers. It performs strongly on mathematical benchmarks such as GSM8K and MATH, outperforming several other 7B and 13B models.

Arithmo-Mistral-7B: Mathematical Reasoning Model

Arithmo-Mistral-7B is a 7-billion-parameter language model developed by Ashvini Kumar Jindal and Ankur Parikh. It was fine-tuned from the Mistral-7B base model using QLoRA on a single RTX 4090 GPU and focuses on mathematical problem-solving and reasoning.

Key Capabilities

  • Mathematical Reasoning: Excels at understanding and solving mathematical problems.
  • Code Generation for Math: Capable of generating Python programs that, when executed, provide the answer to a given mathematical question.
  • Zero-Shot CoT (Chain-of-Thought): Generates reasoning steps alongside the final answer.
  • Zero-Shot PoT (Program-of-Thought): Generates executable Python code to derive the answer.
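
The two zero-shot modes differ mainly in what the prompt asks for and in how the output is consumed: CoT output is read as text, while PoT output is executed. The following is a minimal sketch under stated assumptions. The "Question: ... Answer:" templates and the helper names `build_prompt` and `run_generated_program` are illustrative and are not taken from this page, so the exact prompt wording should be checked against the upstream model card. The PoT completion is a hand-written stand-in rather than real model output.

```python
import subprocess
import sys

# Assumed prompt templates; the exact wording the model expects is not
# stated in this document, so treat these strings as placeholders.
COT_TEMPLATE = "Question: {question}\n\nAnswer:"
POT_TEMPLATE = (
    "Question: {question} Write a Python program to solve this.\n\nAnswer:"
)


def build_prompt(question: str, mode: str = "cot") -> str:
    """Return a zero-shot prompt for CoT ("cot") or PoT ("pot")."""
    if mode not in ("cot", "pot"):
        raise ValueError(f"unknown mode: {mode!r}")
    template = COT_TEMPLATE if mode == "cot" else POT_TEMPLATE
    return template.format(question=question.strip())


def run_generated_program(code: str, timeout_s: float = 5.0) -> str:
    """Run model-generated Python in a separate interpreter, return stdout.

    A subprocess with a timeout guards against hangs, but it is not a
    security sandbox; untrusted code needs stronger isolation.
    """
    result = subprocess.run(
        [sys.executable, "-c", code],
        capture_output=True,
        text=True,
        timeout=timeout_s,
    )
    if result.returncode != 0:
        raise RuntimeError(result.stderr.strip())
    return result.stdout.strip()


# Hand-written stand-in for a PoT completion (hypothetical, not model output).
example_program = "apples = 3 * 4\nprint(apples - 5)"
```

In use, `build_prompt(question, "pot")` would be sent to the model, and the returned program passed to `run_generated_program` to obtain the final answer as a string.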

Performance Highlights

Arithmo-Mistral-7B demonstrates competitive performance against other mathematical reasoning models in its size class. With Zero-Shot CoT it scores 74.7% on GSM8K and 25.3% on MATH; with Zero-Shot PoT it scores 71.2% on GSM8K. The results point to solid ability on grade-school word problems (GSM8K), while competition-level problems (MATH) remain considerably harder.
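
This page does not describe how the scores are computed. One common convention for benchmarks with numeric answers is to take the last number in the model's output and compare it with the reference answer. The sketch below illustrates that convention only; it is not the authors' evaluation harness, and `extract_final_number` and `accuracy` are hypothetical helper names.

```python
import re
from typing import Optional, Sequence

# Matches integers and simple decimals, with an optional leading minus sign.
_NUMBER = re.compile(r"-?\d+(?:\.\d+)?")


def extract_final_number(text: str) -> Optional[float]:
    """Return the last number in `text`, or None if there is none.

    Commas are stripped first so that "1,250" parses as 1250; this is a
    simplification and would misread comma-separated lists of numbers.
    """
    matches = _NUMBER.findall(text.replace(",", ""))
    return float(matches[-1]) if matches else None


def accuracy(
    predictions: Sequence[str], gold: Sequence[float], tol: float = 1e-6
) -> float:
    """Fraction of outputs whose final number matches the reference."""
    if len(predictions) != len(gold):
        raise ValueError("predictions and gold must have the same length")
    if not gold:
        return 0.0
    correct = 0
    for text, answer in zip(predictions, gold):
        value = extract_final_number(text)
        if value is not None and abs(value - answer) <= tol:
            correct += 1
    return correct / len(gold)
```

Under this convention, a reported 74.7% would mean that roughly three out of four test problems ended in the correct final number.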

Good For

  • Mathematical Problem Solving: Ideal for applications requiring accurate solutions to math problems.
  • Educational Tools: Can be integrated into systems that help users understand mathematical concepts through step-by-step reasoning or programmatic solutions.
  • Automated Data Analysis: Useful for tasks where mathematical calculations need to be performed programmatically based on natural language queries.
