nvidia/Llama-3.1-Nemotron-70B-Instruct-HF

Warm
Public
70B
FP8
32768
Oct 12, 2024
License: llama3.1
Hugging Face
Overview

Model Overview

NVIDIA's Llama-3.1-Nemotron-70B-Instruct-HF is a 70 billion parameter instruction-tuned large language model built upon the Llama 3.1 architecture, featuring a 32768 token context window. This model is specifically customized by NVIDIA to significantly improve the helpfulness and quality of LLM-generated responses to user queries. It was trained using REINFORCE, a Reinforcement Learning from Human Feedback (RLHF) method, leveraging the Llama-3.1-Nemotron-70B-Reward model and the HelpSteer2-Preference dataset.

Key Capabilities & Performance

  • Enhanced Helpfulness: Customized to provide more helpful, factually correct, coherent, and customizable responses.
  • Leading Alignment Benchmarks: As of October 1, 2024, it ranks #1 on several automatic alignment benchmarks, including Arena Hard (85.0), AlpacaEval 2 LC (57.6), and GPT-4-Turbo MT-Bench (8.98), outperforming models like GPT-4o and Claude 3.5 Sonnet.
  • Robust Instruction Following: Demonstrates strong general-domain instruction following, capable of accurately answering complex questions without specialized prompting.

Use Cases & Considerations

  • General-Domain Instruction Following: Ideal for applications requiring highly helpful and accurate responses across a broad range of topics.
  • Research and Development: Useful for exploring advanced RLHF techniques and model alignment strategies.
  • Hardware Requirements: Requires significant computational resources, specifically 2 or more 80GB NVIDIA Ampere (or newer) GPUs and at least 150GB of disk space for deployment with HuggingFace Transformers.

This model is a demonstration of NVIDIA's techniques for improving helpfulness in general-domain instruction following, though it has not been specifically tuned for specialized domains like mathematics.