Overview
Infinity-Instruct-7M-Gen-Llama3_1-70B is a 70 billion parameter instruction-tuned model from the Beijing Academy of Artificial Intelligence (BAAI). It is built upon the Llama 3.1 architecture and fine-tuned using the extensive Infinity-Instruct-7M and Infinity-Instruct-Gen datasets. A key characteristic of this model is its development without reinforcement learning from human feedback (RLHF), relying solely on supervised instruction tuning.
Key Capabilities & Performance
- Instruction Following: The model is specifically designed for instruction following, leveraging an instruction dataset containing millions of samples.
- Foundational Abilities: Initial training on Infinity-Instruct-7M aimed to enhance foundational capabilities, particularly in math and code, before further fine-tuning for chat.
- Competitive Benchmarks: It achieves notable results on various benchmarks:
  - AlpacaEval 2.0: Scores 46.1, outperforming Llama-3.1-70B-Instruct (38.1) and GPT-4-0613 (30.2).
  - Arena-hard: Scores 66.0, surpassing Llama-3.1-70B-Instruct (55.7) and GPT-4-0613 (37.9).
  - MT-Bench: Achieves 8.9.
- Training Efficiency: Training used techniques from FlagScale to concatenate multiple training samples into single sequences and apply acceleration, reducing training costs.
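The sample-concatenation idea mentioned above can be illustrated with a minimal greedy-packing sketch. The function name, inputs, and maximum length here are illustrative assumptions, not FlagScale's actual API; the point is only how packing several short tokenized samples into one sequence reduces padding and wasted compute.

```python
def pack_samples(samples, max_len):
    """Greedily concatenate tokenized samples into sequences of at most max_len tokens.

    `samples` is a list of token-id lists; returns a list of packed sequences.
    (A real implementation would also track attention boundaries between samples.)
    """
    packed, current = [], []
    for tokens in samples:
        # Start a new packed sequence when the next sample would overflow it
        if current and len(current) + len(tokens) > max_len:
            packed.append(current)
            current = []
        current.extend(tokens)
    if current:
        packed.append(current)
    return packed

# Four short samples fit into two packed sequences instead of four padded ones
samples = [[1] * 300, [2] * 500, [3] * 400, [4] * 200]
packed = pack_samples(samples, max_len=1024)
print([len(seq) for seq in packed])  # → [800, 600]
```

Packing this way keeps each training sequence close to the context limit, so far fewer tokens are spent on padding.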
Good for
- General Instruction Following: Excels in scenarios requiring precise adherence to instructions.
- Chat Applications: Designed to be a strong chat model, building on its foundational instruction-following capabilities.
- Research on RLHF-free Models: Offers a strong baseline for studying instruction tuning without the complexities of RLHF, making it a useful resource for academic research.
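For the chat use case above, inputs are expected to follow the chat template of the Llama 3.1 base model. A minimal sketch of formatting a conversation by hand is shown below; the special tokens follow the standard Llama 3 prompt format, and in practice the tokenizer's `apply_chat_template` method in Hugging Face transformers handles this automatically.

```python
def build_llama3_prompt(messages):
    """Format a list of {'role', 'content'} dicts into a Llama 3 style prompt string."""
    prompt = "<|begin_of_text|>"
    for m in messages:
        prompt += (
            f"<|start_header_id|>{m['role']}<|end_header_id|>\n\n"
            f"{m['content']}<|eot_id|>"
        )
    # Open an assistant header to cue the model to generate its reply
    prompt += "<|start_header_id|>assistant<|end_header_id|>\n\n"
    return prompt

messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Explain instruction tuning in one sentence."},
]
print(build_llama3_prompt(messages))
```

This is a sketch of the prompt format only; for actual inference, load the model and tokenizer with transformers and let `apply_chat_template` produce the prompt.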