Infinity-Instruct-3M-0613-Llama3-70B Overview
Infinity-Instruct-3M-0613-Llama3-70B is a 70-billion-parameter instruction-tuned model developed by the Beijing Academy of Artificial Intelligence (BAAI). It is built on the Llama3-70B foundation model and fine-tuned on the large-scale Infinity-Instruct-3M and Infinity-Instruct-0613 datasets. Notably, it was developed without reinforcement learning from human feedback (RLHF), yet achieves performance competitive with models that use it.
Key Capabilities & Performance
- Instruction Following: Excels at general instruction-following tasks, as reflected in its results on standard chat benchmarks.
- Benchmark Performance: Scores 31.5 on AlpacaEval 2.0, edging out GPT4-0613 (30.2), and reaches 8.7 on MT-Bench.
- Training Methodology: Training proceeded in two stages: an initial phase of fine-tuning on Infinity-Instruct-3M to strengthen foundational abilities (including math and code), followed by further fine-tuning on Infinity-Instruct-0613 to produce a robust chat model.
- Efficiency: Leverages techniques from FlagScale to reduce training cost and speed up the training process.
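As a quick illustration of how the resulting chat model is typically prompted, the sketch below hand-assembles a Llama 3-style chat prompt. This is a minimal sketch that assumes the model inherits the standard Llama 3 instruct chat template; in practice you would use `tokenizer.apply_chat_template` from the released tokenizer rather than building strings by hand.

```python
def build_llama3_prompt(messages: list[dict]) -> str:
    """Format a list of {"role": ..., "content": ...} messages into a
    Llama 3-style chat prompt string (assumed template, see note above)."""
    prompt = "<|begin_of_text|>"
    for msg in messages:
        prompt += (
            f"<|start_header_id|>{msg['role']}<|end_header_id|>\n\n"
            f"{msg['content']}<|eot_id|>"
        )
    # Open the assistant header to cue the model to generate its reply.
    prompt += "<|start_header_id|>assistant<|end_header_id|>\n\n"
    return prompt


# Example: a single-turn instruction.
prompt = build_llama3_prompt(
    [{"role": "user", "content": "Give a one-sentence definition of RLHF."}]
)
print(prompt)
```

The same message list can instead be passed to `tokenizer.apply_chat_template(messages, add_generation_prompt=True)`, which applies whatever template ships with the model and avoids hand-maintained format strings.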
Ideal Use Cases
- General Instruction Following: Suited to applications that require accurate, nuanced responses to a wide range of instructions.
- Chatbot Development: Its strong performance in chat-oriented benchmarks makes it a good candidate for conversational AI systems.
- Research in Instruction Tuning: Valuable for researchers exploring instruction tuning methods, especially those interested in models developed without RLHF.