BAAI/Infinity-Instruct-3M-0625-Llama3-8B

Status: Warm
Visibility: Public
Parameters: 8B
Quantization: FP8
Context length: 8192 tokens
License: apache-2.0
Model card: Hugging Face
Overview

BAAI/Infinity-Instruct-3M-0625-Llama3-8B is an 8-billion-parameter instruction-tuned model from the Beijing Academy of Artificial Intelligence (BAAI). It is built on the Llama-3-8B base model and fine-tuned on the large Infinity-Instruct-3M and Infinity-Instruct-0625 datasets. Notably, it was developed without reinforcement learning from human feedback (RLHF), relying solely on supervised instruction tuning.
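For reference, below is a minimal inference sketch using the Hugging Face transformers library. It assumes the repository ships a Llama-3-style chat template and that bf16 weights fit on the target device; the prompt and generation settings are illustrative, not prescribed by the model card.

```python
# Minimal inference sketch (assumptions: transformers + a Llama-3-style chat template).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "BAAI/Infinity-Instruct-3M-0625-Llama3-8B"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # assumption: bf16 weights fit on the target GPU
    device_map="auto",
)

messages = [
    {"role": "user", "content": "Explain the difference between SFT and RLHF in two sentences."}
]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

# Llama-3-style models typically also stop on <|eot_id|>; include it if present.
terminators = [tokenizer.eos_token_id]
eot_id = tokenizer.convert_tokens_to_ids("<|eot_id|>")
if eot_id is not None:
    terminators.append(eot_id)

outputs = model.generate(
    input_ids,
    max_new_tokens=256,
    do_sample=True,
    temperature=0.7,
    eos_token_id=terminators,
)
print(tokenizer.decode(outputs[0][input_ids.shape[-1]:], skip_special_tokens=True))
```

The sampling parameters above are placeholders; greedy decoding (do_sample=False) is equally valid when deterministic outputs are preferred.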

Key Capabilities & Performance

  • Instruction Following: Designed for general instruction-following tasks, trained on a multi-million-example instruction dataset.
  • Benchmark Performance: Achieves competitive results on standard evaluation benchmarks:
    • AlpacaEval 2.0: Scores 27.5, outperforming Llama-3-8B-Instruct (22.9) and Mixtral 8x7B v0.1 (23.7).
    • MT-Bench: Scores 8.2, comparable to Mixtral 8x7B v0.1 (8.3).
  • Training Process: The model was first fine-tuned on Infinity-Instruct-3M to strengthen foundational abilities in math and code, then further fine-tuned on Infinity-Instruct-0625 to produce a stronger chat model.

Use Cases

This model is well-suited for applications requiring robust instruction following and conversational capabilities, particularly where a model trained without RLHF is preferred. Its performance on AlpacaEval 2.0 suggests strong alignment with human preferences in instruction-following scenarios.
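To illustrate the conversational use case, here is a sketch of a two-turn exchange. It reuses the model, tokenizer, and terminators objects from the loading example above, and the prompts are purely illustrative.

```python
# Multi-turn sketch: the conversation history is carried in `messages`
# using the standard "user"/"assistant" chat-template roles.
messages = [
    {"role": "user", "content": "Draft a short apology email for a delayed shipment."},
]

def chat(messages):
    # Render the running conversation with the chat template and generate a reply.
    input_ids = tokenizer.apply_chat_template(
        messages, add_generation_prompt=True, return_tensors="pt"
    ).to(model.device)
    outputs = model.generate(input_ids, max_new_tokens=200, eos_token_id=terminators)
    return tokenizer.decode(outputs[0][input_ids.shape[-1]:], skip_special_tokens=True)

first_reply = chat(messages)

# Carry the assistant's reply forward and ask a follow-up in the same conversation.
messages += [
    {"role": "assistant", "content": first_reply},
    {"role": "user", "content": "Make it more formal and keep it under 100 words."},
]
print(chat(messages))
```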