BAAI/Infinity-Instruct-3M-0625-Llama3-70B

TEXT GENERATION
Concurrency Cost: 4 | Model Size: 70B | Quant: FP8 | Ctx Length: 8k | Published: Jul 9, 2024 | License: apache-2.0 | Architecture: Transformer | Open Weights

Infinity-Instruct-3M-0625-Llama3-70B is a 70-billion-parameter instruction-tuned language model developed by the Beijing Academy of Artificial Intelligence (BAAI). The model is fine-tuned on the Infinity-Instruct-3M and Infinity-Instruct-0625 datasets and achieves favorable results on AlpacaEval 2.0 compared to GPT4-0613, without using reinforcement learning from human feedback (RLHF). It is designed for general instruction-following tasks and demonstrates strong performance in areas such as multi-turn dialogue, code, and math.


Overview

Infinity-Instruct-3M-0625-Llama3-70B is a 70-billion-parameter instruction-tuned model from the Beijing Academy of Artificial Intelligence (BAAI). It is developed through supervised instruction tuning alone, without the use of reinforcement learning from human feedback (RLHF). The model is fine-tuned on BAAI's Infinity-Instruct-3M and Infinity-Instruct-0625 datasets, both million-scale instruction collections.

Key Capabilities & Performance

  • Instruction Following: Excels at general instruction-following tasks, with particular strength in multi-turn dialogue, code, and math.
  • Benchmark Results: Achieves a score of 38.0 on AlpacaEval 2.0, surpassing GPT4-0613 (30.2) and the official Llama-3-70B-Instruct (34.4) in this metric. It also scores 8.9 on MT-Bench.
  • Training Methodology: Utilizes a two-stage fine-tuning process: first, applying Infinity-Instruct-3M to enhance foundational abilities (math & code), then further fine-tuning with Infinity-Instruct-0625 for stronger chat capabilities.
  • Efficiency: Training leverages techniques from FlagScale to reduce padding tokens and accelerate the training procedure, optimizing costs.
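The padding-reduction idea mentioned above can be sketched in a few lines. This is a hedged illustration of sequence packing in general, not FlagScale's actual implementation: rather than padding every training sequence to the maximum length, several short sequences are packed into one example so fewer tokens are wasted on padding.

```python
# Sketch of padding reduction via sequence packing (illustrative only;
# FlagScale's real implementation differs). Greedy first-fit: group
# sequence lengths into bins whose total length fits the context window.

def pack_sequences(lengths, max_len):
    """Group sequence lengths into bins of capacity max_len (first-fit decreasing)."""
    bins = []  # each bin is a list of sequence lengths sharing one training example
    for n in sorted(lengths, reverse=True):
        for b in bins:
            if sum(b) + n <= max_len:
                b.append(n)
                break
        else:
            bins.append([n])
    return bins

# Five sequences that would each be padded to 1024 tokens fit into 3 packed examples.
packed = pack_sequences([700, 300, 512, 212, 1024], max_len=1024)
```

With naive padding this batch would cost 5 × 1024 token slots; packing reduces it to 3 × 1024, which is the kind of saving that accelerates training.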

When to Use This Model

  • Instruction Following Applications: Ideal for scenarios requiring robust instruction adherence and conversational abilities.
  • Research & Development: Suitable for academic research, particularly in exploring instruction tuning without RLHF.
  • Cost-Effective Solutions: Offers competitive performance against larger models like GPT-4 on certain benchmarks, potentially providing a more efficient alternative for specific use cases.
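For applications like those above, the model is typically queried through a chat-style prompt. As a minimal sketch, assuming this model inherits the standard Llama-3 chat template (the `<|start_header_id|>` / `<|eot_id|>` markers), the prompt can be rendered as follows; in practice you would let the tokenizer's `apply_chat_template` do this:

```python
# Hedged sketch: hand-render a Llama-3-style chat prompt. Assumption:
# Infinity-Instruct-3M-0625-Llama3-70B uses the Llama-3 chat template,
# since it is fine-tuned from Llama-3-70B.

def build_llama3_prompt(messages):
    """Render a list of {"role", "content"} dicts into a Llama-3 chat prompt."""
    parts = ["<|begin_of_text|>"]
    for m in messages:
        parts.append(
            f"<|start_header_id|>{m['role']}<|end_header_id|>\n\n{m['content']}<|eot_id|>"
        )
    # Trailing header cues the model to generate the assistant turn.
    parts.append("<|start_header_id|>assistant<|end_header_id|>\n\n")
    return "".join(parts)

prompt = build_llama3_prompt(
    [{"role": "user", "content": "Explain instruction tuning in one sentence."}]
)
```

The resulting string would then be tokenized and passed to the model (e.g. via `transformers.AutoModelForCausalLM` with the `BAAI/Infinity-Instruct-3M-0625-Llama3-70B` checkpoint, or any serving endpoint that hosts it).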