Open-Orca/Mistral-7B-SlimOrca

Text Generation · Concurrency Cost: 1 · Model Size: 7B · Quant: FP8 · Ctx Length: 4k · Published: Oct 8, 2023 · License: apache-2.0 · Architecture: Transformer · Open Weights

Open-Orca/Mistral-7B-SlimOrca is a 7 billion parameter language model developed by Open-Orca, fine-tuned from Mistral 7B. It is trained on SlimOrca, a curated subset of the OpenOrca dataset comprising approximately 500k GPT-4 completions. The model achieves near-parity with much larger models on the HuggingFace Leaderboard, with an average score of 65.85, making it an efficient choice for general-purpose instruction-following tasks.


Open-Orca/Mistral-7B-SlimOrca Overview

Open-Orca/Mistral-7B-SlimOrca is a 7 billion parameter language model developed by Open-Orca, built upon the Mistral 7B architecture. The model demonstrates the effectiveness of the new SlimOrca dataset: a highly curated subset of the original OpenOrca data, which itself aims to reproduce the dataset from Microsoft Research's Orca paper. SlimOrca reduces the data to ~500k GPT-4 completions while maintaining strong performance.
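
For a concrete sense of the data, the snippet below is a minimal sketch that loads and inspects the SlimOrca subset, assuming it is published on the HuggingFace Hub under the "Open-Orca/SlimOrca" dataset ID and that the datasets library is installed.

```python
# Minimal sketch: inspect the SlimOrca data. Assumes the subset is
# published on the HuggingFace Hub as "Open-Orca/SlimOrca".
from datasets import load_dataset

slim_orca = load_dataset("Open-Orca/SlimOrca", split="train")

print(len(slim_orca))  # on the order of ~500k GPT-4 completions
print(slim_orca[0])    # one multi-turn conversation record
```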

Key Capabilities & Features

  • Efficient Training: Achieves performance comparable to models trained on larger datasets with only ~500k GPT-4 completions, reducing compute requirements by two-thirds.
  • Strong Performance: On the HuggingFace Leaderboard, it scores an average of 65.85, demonstrating 106% of the base model's performance and 98.6% of Llama2-70b-chat's performance.
  • Instruction Following: Fine-tuned on the SlimOrca subset with OpenChat packing and the Axolotl training framework, making it proficient at following instructions.
  • ChatML Format: Utilizes OpenAI's Chat Markup Language (ChatML) for prompt templating, compatible with tools like oobabooga's "MPT-Chat" instruction template and HuggingFace Transformers' apply_chat_template() method (see the sketch after this list).
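
To illustrate the ChatML templating, the sketch below renders a conversation into the prompt string the model expects. It assumes only the HuggingFace Transformers library; the example messages are invented for illustration.

```python
# Minimal sketch: render a conversation with the tokenizer's built-in
# ChatML template. The example messages are illustrative.
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("Open-Orca/Mistral-7B-SlimOrca")

messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Summarize the Orca paper in one sentence."},
]

# apply_chat_template() expands the messages into ChatML markup
# (<|im_start|>role ... <|im_end|>) and appends the assistant header
# so the model continues from there.
prompt = tokenizer.apply_chat_template(
    messages, tokenize=False, add_generation_prompt=True
)
print(prompt)
```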

Good For

  • General Instruction-Following: Excels in tasks requiring adherence to detailed instructions, benefiting from its Orca-style fine-tuning.
  • Resource-Efficient Deployment: Its curated training data delivers strong performance with potentially lower computational overhead than models trained on larger, noisier datasets; a minimal inference sketch follows this list.
  • Research and Development: Serves as an excellent pre-release model for exploring the impact of highly curated, smaller datasets on LLM performance.
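
For deployment experiments, the end-to-end inference sketch below assumes a GPU with enough memory for the 7B weights in float16; the dtype, device placement, and sampling parameters are illustrative assumptions, not recommendations from the model card.

```python
# Minimal inference sketch; dtype, device_map, and sampling settings
# are assumptions, not settings from the model card.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Open-Orca/Mistral-7B-SlimOrca"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.float16, device_map="auto"
)

messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Explain instruction tuning in two sentences."},
]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output = model.generate(
    input_ids, max_new_tokens=128, do_sample=True, temperature=0.7
)
# Decode only the newly generated tokens.
print(tokenizer.decode(output[0][input_ids.shape[-1]:], skip_special_tokens=True))
```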