Overview
BAAI/Infinity-Instruct-3M-0625-Llama3-8B is an 8-billion-parameter instruction-tuned model from the Beijing Academy of Artificial Intelligence (BAAI). It is built on the Llama3-8B base model and fine-tuned on the Infinity-Instruct-3M and Infinity-Instruct-0625 datasets. Notably, it was developed without reinforcement learning from human feedback (RLHF), relying solely on supervised instruction tuning.
Key Capabilities & Performance
- Instruction Following: Designed for general instruction-following tasks, leveraging an instruction dataset containing millions of examples.
- Benchmark Performance: Achieves competitive results on standard evaluation benchmarks:
  - AlpacaEval 2.0: Scores 27.5, outperforming Llama-3-8B-Instruct (22.9) and Mixtral 8x7B v0.1 (23.7).
  - MT-Bench: Scores 8.2, comparable to Mixtral 8x7B v0.1 (8.3).
- Training Process: The model was first fine-tuned on Infinity-Instruct-3M to strengthen foundational abilities in math and code, then further fine-tuned on Infinity-Instruct-0625 to produce a stronger chat model.
Use Cases
This model is well-suited for applications requiring robust instruction following and conversational capabilities, particularly where a model trained without RLHF is preferred. Its performance on AlpacaEval 2.0 suggests strong alignment with human preferences in instruction-following scenarios.
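To illustrate how a chat request reaches such a model, here is a minimal sketch of assembling a Llama-3-style chat prompt by hand. It assumes the model inherits the Llama 3 chat template from its base model (the `<|start_header_id|>` / `<|eot_id|>` markers); in practice you would prefer the tokenizer's `apply_chat_template` method, which reads the template shipped with the model's files. The function name below is illustrative, not part of any official API.

```python
def build_llama3_prompt(messages):
    """Render a list of {"role": ..., "content": ...} dicts into a single
    prompt string using the Llama 3 chat markers (assumed template)."""
    parts = ["<|begin_of_text|>"]
    for m in messages:
        parts.append(
            f"<|start_header_id|>{m['role']}<|end_header_id|>\n\n"
            f"{m['content']}<|eot_id|>"
        )
    # End with an open assistant header so the model generates the reply next.
    parts.append("<|start_header_id|>assistant<|end_header_id|>\n\n")
    return "".join(parts)

prompt = build_llama3_prompt(
    [{"role": "user", "content": "Give a short introduction to large language models."}]
)
print(prompt)
```

The rendered string would then be tokenized and passed to the model's `generate` method; decoding stops at the next `<|eot_id|>` token.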