ConvexAI/Metabird-7B

Text generation · Model size: 7B · Quantization: FP8 · Context length: 4k · Published: Jan 20, 2024 · License: apache-2.0 · Architecture: Transformer

ConvexAI/Metabird-7B is a 7-billion-parameter, Mistral-derived causal language model fine-tuned by ConvexAI. It is optimized for mathematical reasoning and problem-solving, fine-tuned on the shuyuej/metamath_gsm8k dataset. The model performs well on reasoning benchmarks, making it suitable for tasks requiring logical deduction and quantitative analysis.


Metabird-7B: A Math-Optimized Mistral-Derived Model

ConvexAI/Metabird-7B is a 7-billion-parameter language model built on the Mistral architecture and fine-tuned from leveldevai/TurdusBeagle-7B. Its main differentiator is its optimization for mathematical reasoning, achieved through training on the shuyuej/metamath_gsm8k dataset.
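As a standard causal language model on the Hub, Metabird-7B can be used with the Hugging Face transformers API. The sketch below is illustrative: the repo id comes from this card, but the prompt template and generation settings are assumptions, not values recommended by ConvexAI.

```python
MODEL_ID = "ConvexAI/Metabird-7B"  # repo id from the model card


def build_prompt(question: str) -> str:
    # Simple instruction-style prompt; the exact template the model
    # expects is an assumption -- check the tokenizer's chat template.
    return f"Question: {question}\nAnswer: Let's think step by step."


def generate_answer(question: str, max_new_tokens: int = 256) -> str:
    # Loads the full model (downloads ~7B weights on first call),
    # so this is only practical on a machine with enough memory.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(
        MODEL_ID, torch_dtype="auto", device_map="auto"
    )
    inputs = tokenizer(build_prompt(question), return_tensors="pt").to(model.device)
    out = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(out[0], skip_special_tokens=True)


# Example usage (requires a GPU with sufficient memory):
# print(generate_answer("A train travels 60 km in 45 minutes. What is its speed in km/h?"))
```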

Key Capabilities & Performance

The model performs well across a range of benchmarks, particularly reasoning and common-sense tasks. On the Open LLM Leaderboard, Metabird-7B achieves an average score of 71.03.

  • AI2 Reasoning Challenge (25-Shot): 69.54
  • MMLU (5-Shot): 65.27
  • GSM8k (5-Shot): 62.85 (a strong indicator of mathematical problem-solving ability)
  • HellaSwag (10-Shot): 87.54
  • Winogrande (5-Shot): 83.03

Training Details

The model was trained with axolotl using a sequence length of 8192 tokens and a learning rate of 5e-06 for 1 epoch. Training used a total batch size of 8 with 4 gradient accumulation steps, bf16 precision, and Flash Attention for efficiency.
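A quick sanity check on the batch-size figures above, assuming the common convention (used by axolotl) that the total batch size equals the per-step micro-batch size times the number of gradient accumulation steps:

```python
# Figures from the model card:
total_batch_size = 8   # effective batch the optimizer sees per update
grad_accum_steps = 4   # gradients accumulated before each optimizer step

# Implied per-step micro-batch (assuming single-device training;
# with multiple devices the world size would also divide in).
micro_batch_size = total_batch_size // grad_accum_steps
```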

Good for

  • Applications requiring strong mathematical reasoning.
  • Tasks involving logical problem-solving and quantitative analysis.
  • Use cases where a 7B parameter model with enhanced reasoning capabilities is beneficial.