Infinity-Instruct-3M-0613-Mistral-7B Overview
Infinity-Instruct-3M-0613-Mistral-7B is a 7-billion-parameter instruction-tuned model from the Beijing Academy of Artificial Intelligence (BAAI). It is built on the Mistral-7B-v0.1 base model and fine-tuned on the extensive Infinity-Instruct dataset, specifically its Infinity-Instruct-3M and Infinity-Instruct-0613 subsets. Notably, the model was developed entirely without reinforcement learning from human feedback (RLHF).
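For context, the sketch below shows one plausible way to run the model for inference with the Hugging Face `transformers` API. It is a hedged sketch, not official usage: the Hub repository id, the presence of a bundled chat template, and the Mistral-style `[INST]` fallback format are all assumptions, not facts stated above.

```python
# Hedged sketch, not official usage: the repository id below and the presence
# of a chat template are assumptions carried over from the model's lineage.
MODEL_ID = "BAAI/Infinity-Instruct-3M-0613-Mistral-7B"  # assumed Hub repo id


def build_fallback_prompt(user_message: str) -> str:
    """Mistral-style [INST] prompt, usable if the tokenizer ships no chat
    template. This format is an assumption inherited from Mistral-7B-v0.1."""
    return f"<s>[INST] {user_message.strip()} [/INST]"


def generate_reply(prompt: str, max_new_tokens: int = 256) -> str:
    # Imports are kept local so the helper above stays dependency-free.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(
        MODEL_ID, torch_dtype=torch.bfloat16, device_map="auto"
    )
    if tokenizer.chat_template:  # prefer the template shipped with the checkpoint
        input_ids = tokenizer.apply_chat_template(
            [{"role": "user", "content": prompt}],
            add_generation_prompt=True,
            return_tensors="pt",
        ).to(model.device)
    else:
        input_ids = tokenizer(
            build_fallback_prompt(prompt), return_tensors="pt"
        ).input_ids.to(model.device)
    out = model.generate(input_ids, max_new_tokens=max_new_tokens, do_sample=False)
    # Decode only the newly generated tokens, not the echoed prompt.
    return tokenizer.decode(out[0, input_ids.shape[-1]:], skip_special_tokens=True)
```

Checking `tokenizer.chat_template` first keeps the sketch robust whether or not the checkpoint defines its own conversation format.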
Key Capabilities & Performance
- Instruction Following: Achieves a notable score of 25.5 on AlpacaEval 2.0, outperforming models such as Mixtral 8x7B v0.1 (23.7), Gemini Pro (24.4), and GPT-3.5 Turbo 0613 (22.7).
- Multi-turn Conversations: Scores 8.1 on MT-Bench, comparable to Llama-3-8B-Instruct and Mistral-7B-Instruct-v0.2, indicating strong performance in complex dialogue scenarios.
- Foundational Abilities: Initial fine-tuning on Infinity-Instruct-3M aimed to enhance foundational capabilities, including math and code.
Training Details
The model underwent a two-stage fine-tuning process. First, Mistral-7B-v0.1 was trained on Infinity-Instruct-3M to strengthen foundational skills such as math and code. The resulting model was then further fine-tuned on the Infinity-Instruct-0613 subset to produce the stronger chat model, Infinity-Instruct-3M-0613-Mistral-7B. Training used techniques from FlagScale to optimize efficiency and reduce costs.
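The two-stage idea, in which stage two resumes from the weights produced by stage one rather than from the original base model, can be illustrated with a deliberately tiny example. Everything below (the one-parameter model, the synthetic "foundational" and "chat" datasets, the learning rate) is invented for illustration and has nothing to do with FlagScale or the real training run.

```python
# Toy illustration of sequential (two-stage) fine-tuning. The model, data,
# and hyperparameters are all invented; this is NOT the FlagScale pipeline.
def sgd_stage(weight, data, lr=0.05, epochs=200):
    """Fit y ~ weight * x by plain SGD on one stage's dataset; return the
    updated weight, which becomes the starting point for the next stage."""
    for _ in range(epochs):
        for x, y in data:
            grad = 2 * (weight * x - y) * x  # d/dw of the squared error
            weight -= lr * grad
    return weight


# Stage 1: "foundational" data, consistent with weight = 2.0.
stage1 = [(1.0, 2.0), (2.0, 4.0), (0.5, 1.0)]
# Stage 2: "chat" data, consistent with weight = 3.0.
stage2 = [(1.0, 3.0), (2.0, 6.0)]

w0 = 0.0
w1 = sgd_stage(w0, stage1)  # first fine-tuning stage from the base weight
w2 = sgd_stage(w1, stage2)  # second stage starts from the stage-1 result
```

The key point mirrored from the description above is the chaining: `w2` is reached by continuing from `w1`, just as the chat model is produced by further fine-tuning the Infinity-Instruct-3M checkpoint.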
Use Cases
This model is well-suited to general instruction-following tasks and conversational AI applications that demand strong performance on benchmarks such as AlpacaEval 2.0 and MT-Bench, particularly in scenarios where a non-RLHF model is preferred.