Infinity-Instruct-3M-0625-Qwen2-7B is a 7.6-billion-parameter instruction-tuned language model developed by the Beijing Academy of Artificial Intelligence (BAAI), built on the Qwen2-7B architecture. It is fine-tuned on the Infinity-Instruct-3M and Infinity-Instruct-0625 datasets without reinforcement learning from human feedback (RLHF), and achieves strong results on instruction-following and general chat benchmarks such as AlpacaEval 2.0 and MT-Bench.
Model Overview
Infinity-Instruct-3M-0625-Qwen2-7B is an open-source, supervised instruction-tuned model developed by the Beijing Academy of Artificial Intelligence (BAAI). This 7.6-billion-parameter model is built on the Qwen2-7B architecture and fine-tuned on the extensive Infinity-Instruct-3M and Infinity-Instruct-0625 datasets. A key characteristic is that it was developed without reinforcement learning from human feedback (RLHF).
Key Capabilities & Training
The training process involved two stages: first, the model was fine-tuned on the Infinity-Instruct-3M dataset to strengthen foundational abilities, particularly in math and code; it then underwent further fine-tuning on the Infinity-Instruct-0625 dataset to develop into a more capable chat model. Training leveraged techniques from the FlagScale framework to improve efficiency and reduce cost.
Performance Highlights
Evaluations on popular instruction-following benchmarks indicate strong performance:
- MT-Bench: Achieved a score of 8.3.
- AlpacaEval 2.0: Scored 21.9.
These results are competitive with other models in its class, including GPT-3.5 Turbo (0613) and Mixtral 8x7B v0.1, which is notable given that the model was developed without RLHF.
Usage
The model utilizes the same chat template as Qwen2-7B-Instruct, making it straightforward to integrate into existing Qwen2-based conversation pipelines.
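Since the model shares Qwen2-7B-Instruct's chat template, prompts follow the ChatML convention of `<|im_start|>`/`<|im_end|>` role markers. As a minimal sketch (assuming the standard ChatML markers used by Qwen2-Instruct models; in practice `tokenizer.apply_chat_template` handles this for you), the rendered prompt looks like:

```python
def render_chatml(messages, add_generation_prompt=True):
    """Render a conversation in the ChatML format used by Qwen2-Instruct
    models. Mirrors what tokenizer.apply_chat_template(..., tokenize=False)
    would produce; marker strings are assumed from the standard template.
    """
    parts = []
    for msg in messages:
        # Each turn is wrapped in <|im_start|>{role} ... <|im_end|>
        parts.append(f"<|im_start|>{msg['role']}\n{msg['content']}<|im_end|>\n")
    if add_generation_prompt:
        # Open an assistant turn so generation continues from here
        parts.append("<|im_start|>assistant\n")
    return "".join(parts)

prompt = render_chatml([
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Hello!"},
])
print(prompt)
```

Because the template matches Qwen2-7B-Instruct exactly, any pipeline that already serves Qwen2-Instruct checkpoints can load this model without changing its prompt-construction code.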