abacusai/Fewshot-Metamath-OrcaVicuna-Mistral

  • Task: Text Generation
  • Model Size: 7B
  • Quantization: FP8
  • Context Length: 4k
  • Published: Jan 8, 2024
  • License: apache-2.0
  • Architecture: Transformer
  • Weights: Open

The abacusai/Fewshot-Metamath-OrcaVicuna-Mistral is a 7 billion parameter language model fine-tuned from Mistral 7B. Developed by Abacus.AI, it is instruction-tuned on the MetamathFewshot, Vicuna, and OrcaChat datasets and shows improved performance on mathematical reasoning tasks such as GSM8K compared to its base model. It is optimized for complex reasoning and instruction following, making it suitable for research and experimental applications that require robust conversational and problem-solving capabilities.


abacusai/Fewshot-Metamath-OrcaVicuna-Mistral Overview

This 7 billion parameter model, developed by Abacus.AI, is an instruction-tuned variant of the Mistral 7B base model. It was fine-tuned on Abacus.AI's MetamathFewshot dataset together with the Vicuna and OrcaChat datasets to enhance its reasoning and conversational abilities.

Key Capabilities & Performance

The model demonstrates strong performance across various benchmarks, with a particular focus on mathematical reasoning. Notable evaluation results include:

  • HuggingFace Leaderboard Average: 67.33
  • GSM8K Score: 69.14 (outperforming the original meta-math/MetaMath-Mistral-7B's 68.84)
  • MT-Bench Average: 6.71

Training Details

The model underwent instruction tuning with the following hyperparameters:

  • Method: LoRA (rank 8, alpha 16, dropout 0.05, applied to all QKV and MLP modules; see the configuration sketch after this list)
  • Epochs: 3
  • Optimizer: AdamW with a learning rate of 5e-5
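The hyperparameters above map onto a LoRA configuration roughly like the following sketch, written with the Hugging Face `peft` library. The original training script is not published, so the exact `target_modules` names are assumptions based on Mistral 7B's standard attention and MLP projection layers.

```python
# Sketch of the described LoRA setup using the `peft` library.
# Module names are assumptions for Mistral 7B, not confirmed by the model card.
from peft import LoraConfig

lora_config = LoraConfig(
    r=8,               # LoRA rank, as stated in the training details
    lora_alpha=16,     # scaling factor alpha
    lora_dropout=0.05,
    target_modules=[
        "q_proj", "k_proj", "v_proj",          # QKV attention projections
        "gate_proj", "up_proj", "down_proj",   # MLP modules
    ],
    task_type="CAUSAL_LM",
)
```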

Usage and Limitations

This model requires a specific prompt format, which can be applied using the tokenizer.apply_chat_template() method. It is primarily intended for research and experimental purposes and has not been evaluated for safety in production environments.
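A minimal usage sketch with the `transformers` library is shown below. The call to `tokenizer.apply_chat_template()` follows the model card's guidance; the generation parameters and device settings are illustrative assumptions rather than recommended values.

```python
# Minimal inference sketch; generation settings are illustrative assumptions.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "abacusai/Fewshot-Metamath-OrcaVicuna-Mistral"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

messages = [
    {"role": "user",
     "content": "A train travels 60 miles in 1.5 hours. What is its average speed?"},
]
# apply_chat_template() formats the conversation in the prompt format the model expects.
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=256)
# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```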