nvidia/Llama-3.3-Nemotron-70B-Edit

Text Generation · Concurrency Cost: 4 · Model Size: 70B · Quant: FP8 · Ctx Length: 32k · Published: Mar 14, 2025 · License: nvidia-open-model-license · Architecture: Transformer · Open Weights

nvidia/Llama-3.3-Nemotron-70B-Edit is a 70-billion-parameter large language model developed by NVIDIA, built on Meta-Llama-3.3-70B-Instruct. It is fine-tuned with supervised fine-tuning and reinforcement learning specifically to edit LLM-generated responses based on provided feedback. By incorporating user feedback into its revisions, the model improves response helpfulness, making it well suited to general-domain, open-ended tasks that require iterative refinement.


Model Overview

nvidia/Llama-3.3-Nemotron-70B-Edit is a 70-billion-parameter large language model from NVIDIA, built on Meta-Llama-3.3-70B-Instruct. It is fine-tuned with supervised fine-tuning and reinforcement learning to edit LLM-generated responses based on explicit feedback, improving the helpfulness of outputs. The model is a key component of the Feedback-Edit Inference Time Scaling (ITS) system, which has achieved leading performance on the Arena Hard leaderboard, reaching 93.4% when combined with Llama-3.3-Nemotron-Super-49B-v1.

Key Capabilities

  • Feedback-driven Response Editing: Specializes in refining and improving LLM responses by incorporating user feedback.
  • Inference-Time Scaling: Designed to be part of a system that leverages feedback for performance improvement during inference.
  • Commercial Use: Licensed for commercial applications under the NVIDIA Open Model License and Llama 3.3 Community License.
  • General-Domain Tasks: Optimized for open-ended tasks where iterative refinement of responses is beneficial.
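The Feedback-Edit ITS loop behind these capabilities can be sketched as a simple generate-feedback-edit cycle. This is an illustrative sketch, not NVIDIA's implementation: the `generate`, `critique`, and `edit` callables stand in for calls to the generator, feedback, and edit models respectively.

```python
from typing import Callable, List

def feedback_edit_its(
    prompt: str,
    generate: Callable[[str], str],        # base model: prompt -> draft response
    critique: Callable[[str, str], str],   # feedback model: (prompt, response) -> feedback
    edit: Callable[[str, str, str], str],  # edit model: (prompt, response, feedback) -> revision
    rounds: int = 2,
) -> List[str]:
    """Run a generate -> feedback -> edit loop and return all candidate responses."""
    candidates = [generate(prompt)]
    for _ in range(rounds):
        response = candidates[-1]           # edit the latest candidate
        feedback = critique(prompt, response)
        candidates.append(edit(prompt, response, feedback))
    return candidates
```

In the full ITS system a selection model would then score the candidates and return the best one; this sketch simply hands all candidates back to the caller.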

Use Cases

  • Improving LLM Output Quality: Ideal for scenarios where initial LLM responses need to be iteratively improved based on human or automated feedback.
  • Interactive AI Systems: Suitable for applications requiring dynamic adjustment of generated text to better meet user expectations.
  • Research in Feedback Mechanisms: Can be used to explore and develop advanced methods for integrating feedback into LLM workflows.

This model supports a maximum input context of 128k tokens and generates outputs of up to 4k tokens. It was trained on the HelpSteer3 dataset, which contains prompt-response pairs annotated with free-text feedback and edited responses.
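As a minimal sketch of invoking the model through an OpenAI-compatible chat endpoint, the helper below packages a prompt, a draft response, and free-text feedback as chat messages. The message layout (draft as an assistant turn, feedback as a follow-up user turn) is an assumption for illustration, not NVIDIA's documented prompt format.

```python
def build_edit_messages(prompt: str, response: str, feedback: str) -> list:
    """Package a prompt, a draft response, and free-text feedback as chat messages.

    Layout is illustrative: the draft goes in as an assistant turn and the
    feedback as a follow-up user turn asking for a revision.
    """
    return [
        {"role": "user", "content": prompt},
        {"role": "assistant", "content": response},
        {"role": "user",
         "content": f"Feedback: {feedback}\nPlease edit your response accordingly."},
    ]

# The messages would then be posted to a served instance, e.g. with the
# openai client (endpoint and serving details are placeholders):
#   client.chat.completions.create(
#       model="nvidia/Llama-3.3-Nemotron-70B-Edit",
#       messages=build_edit_messages(prompt, draft, feedback),
#       max_tokens=4096,
#   )
```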