Open-Orca/Mistral-7B-SlimOrca Overview
Open-Orca/Mistral-7B-SlimOrca is a 7 billion parameter language model developed by Open-Orca, built on the Mistral 7B architecture. It demonstrates the effectiveness of the SlimOrca dataset, a highly curated subset of the original OpenOrca data, which aims to reproduce the dataset from Microsoft Research's Orca paper. SlimOrca cuts the data down to ~500k GPT-4 completions while maintaining strong performance.
Key Capabilities & Features
- Efficient Training: Achieves performance comparable to models trained on larger datasets with only ~500k GPT-4 completions, reducing compute requirements by two-thirds.
- Strong Performance: Scores an average of 65.85 on the HuggingFace Leaderboard, reaching 106% of the base model's performance and 98.6% of Llama2-70b-chat's performance.
- Instruction Following: Fine-tuned on the SlimOrca data with OpenChat packing and Axolotl, making it proficient at following instructions.
- ChatML Format: Uses OpenAI's Chat Markup Language (ChatML) for prompt templating, compatible with tools like oobabooga's "MPT-Chat" instruction template and HuggingFace Transformers' apply_chat_template() method.
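As a rough illustration of the prompt format above, here is a minimal sketch that builds a ChatML-style string by hand, assuming the standard `<|im_start|>role ... <|im_end|>` delimiters; in practice you would let the tokenizer's apply_chat_template() produce this for you, and the helper name below is hypothetical.

```python
# Hypothetical helper: assemble a ChatML-formatted prompt by hand.
# Assumes the conventional ChatML delimiters <|im_start|> / <|im_end|>.
def build_chatml_prompt(messages, add_generation_prompt=True):
    """messages: list of {"role": ..., "content": ...} dicts."""
    prompt = ""
    for m in messages:
        prompt += f"<|im_start|>{m['role']}\n{m['content']}<|im_end|>\n"
    if add_generation_prompt:
        # Cue the model to begin its reply as the assistant.
        prompt += "<|im_start|>assistant\n"
    return prompt

messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "What is the capital of France?"},
]
print(build_chatml_prompt(messages))
```

With a real tokenizer you would instead call `tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)` and get an equivalent string back.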
Good For
- General Instruction-Following: Excels in tasks requiring adherence to detailed instructions, benefiting from its Orca-style fine-tuning.
- Resource-Efficient Deployment: Its optimized training data allows for strong performance with potentially lower computational overhead compared to models trained on larger, less curated datasets.
- Research and Development: Serves as a useful testbed for exploring the impact of smaller, highly curated datasets on LLM performance.