Open-Orca/Mistral-7B-OpenOrca
Open-Orca/Mistral-7B-OpenOrca is a 7-billion-parameter language model developed by Open-Orca, fine-tuned from the Mistral 7B base model, whose architecture uses a 4096-token sliding attention window. It leverages a curated subset of the OpenOrca dataset, augmented with GPT-4 data, to achieve leading performance in its size class on various benchmarks. The model excels at general language understanding and reasoning tasks, and at release it outperformed all other 7B and 13B models on the HuggingFace Leaderboard.
Open-Orca/Mistral-7B-OpenOrca: High-Performance 7B Model
Open-Orca/Mistral-7B-OpenOrca, affectionately codenamed "MistralOrca," is a 7-billion-parameter language model developed by Open-Orca. It is fine-tuned from the Mistral 7B base model on a carefully curated subset of the OpenOrca dataset's GPT-4-augmented data; the dataset itself is an attempt to reproduce the dataset-generation methodology of Microsoft Research's Orca paper.
Key Capabilities & Performance
This model demonstrates exceptional performance for its size and runs efficiently on consumer-grade GPUs. At the time of its release, it held the #1 position on the HuggingFace Leaderboard among models smaller than 30B parameters, surpassing all other 7B and 13B models. Key performance metrics include:
- HuggingFace Leaderboard Average: 65.84 (106% of base Mistral-7B, 98.6% of Llama2-70b-chat)
- MMLU (5-shot): 62.24
- AGIEval Average: 0.397 (129% of base Mistral-7B)
- BigBench-Hard Average: 0.416 (119% of base Mistral-7B)
- MT-Bench Average: 6.86 (on par with Llama2-70b-chat)
Training & Usage
The model was trained for 62 hours over 4 epochs on 8x A6000 GPUs. For prompting, it uses OpenAI's Chat Markup Language (ChatML) format, with dedicated <|im_start|> and <|im_end|> special tokens, and it supports the apply_chat_template() method in HuggingFace Transformers for convenient conversation formatting. Quantized versions (AWQ, GPTQ, GGUF) are available from TheBloke for optimized inference.
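As a minimal sketch of the ChatML workflow (the system prompt and generation settings below are illustrative assumptions, not the authors' recommended values), a conversation can be formatted and run with Transformers like this:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "Open-Orca/Mistral-7B-OpenOrca"
tokenizer = AutoTokenizer.from_pretrained(model_name)
# device_map="auto" requires the accelerate package; omit it for CPU-only use.
model = AutoModelForCausalLM.from_pretrained(model_name, device_map="auto")

messages = [
    # This system prompt is an illustrative assumption, not an official default.
    {"role": "system", "content": "You are MistralOrca, a helpful assistant."},
    {"role": "user", "content": "Summarize the Orca approach in one sentence."},
]

# apply_chat_template() renders each turn as ChatML
# (<|im_start|>role ... <|im_end|>); add_generation_prompt=True appends the
# assistant header so the model continues from there.
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output = model.generate(input_ids, max_new_tokens=256)
# Decode only the newly generated tokens, skipping the echoed prompt.
print(tokenizer.decode(output[0][input_ids.shape[-1]:], skip_special_tokens=True))
```

For the quantized releases, a GGUF file can be run through llama-cpp-python, which ships a built-in ChatML chat format; the exact file name below is an assumption, so check TheBloke's repository for the available quantizations:

```python
from llama_cpp import Llama  # pip install llama-cpp-python

llm = Llama(
    model_path="mistral-7b-openorca.Q4_K_M.gguf",  # hypothetical file name
    n_ctx=4096,
    chat_format="chatml",  # matches the model's ChatML prompt format
)

response = llm.create_chat_completion(
    messages=[{"role": "user", "content": "Hello! Who are you?"}]
)
print(response["choices"][0]["message"]["content"])
```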