gradientai/Llama-3-70B-Instruct-Gradient-1048k
gradientai/Llama-3-70B-Instruct-Gradient-1048k is a 70-billion-parameter instruction-tuned language model developed by Gradient. It extends the context length of the base Meta Llama 3 70B Instruct model from 8,192 tokens to over 1 million tokens, using techniques such as NTK-aware interpolation of the RoPE base (theta) and progressive long-context training. The model is optimized for extremely long contexts, making it suitable for applications such as extensive document analysis or prolonged conversational memory.
Overview
This model, developed by Gradient, is an extended-context version of the Meta Llama 3 70B Instruct model. Its primary innovation is the increase in context window from the original 8,192 tokens to 1,048,576 (roughly 1M) tokens. This was achieved through NTK-aware interpolation to adjust the RoPE theta (base frequency) and progressive training on increasing context lengths, similar to the Large World Model approach.
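The NTK-aware adjustment mentioned above can be sketched as follows. This is a minimal illustration of the general technique, not the checkpoint's exact recipe: the released model's actual `rope_theta` was tuned per training stage and is recorded in its config, so the numbers below (Llama 3's published base theta of 500,000 and head dimension of 128) are used purely for illustration. The idea is that scaling the context by a factor `s` scales theta by `s**(d/(d-2))`, which interpolates the low-frequency RoPE components while leaving the high-frequency ones nearly unchanged.

```python
import math

def ntk_scaled_theta(theta: float, scale: float, head_dim: int) -> float:
    """NTK-aware RoPE base adjustment: theta' = theta * s**(d / (d - 2))."""
    return theta * scale ** (head_dim / (head_dim - 2))

base_theta = 500_000.0      # Llama 3's published RoPE base frequency
head_dim = 128              # Llama 3 70B attention head dimension
scale = 1_048_576 / 8_192   # stretching 8k context to ~1M -> s = 128

theta_prime = ntk_scaled_theta(base_theta, scale, head_dim)
print(f"scaled theta ~= {theta_prime:.3e}")  # on the order of 7e7
```

Note that `theta'` grows slightly faster than `theta * s`: the exponent `d/(d-2)` is just above 1, which is what keeps the highest-frequency rotary components close to their original wavelengths.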
Key Capabilities
- Massive Context Window: Processes and understands information across extremely long sequences, exceeding 1 million tokens.
- Llama 3 70B Foundation: Inherits the strong performance and instruction-following capabilities of the base Llama 3 70B Instruct model.
- Efficient Training: Achieved long-context capabilities with minimal additional training data (less than 0.003% of Llama 3's original pre-training data).
Good For
- Applications requiring deep analysis of very long documents or codebases.
- Complex conversational agents needing extensive memory and context retention.
- Tasks involving summarization, question-answering, or generation over large bodies of text.
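For the long-document use cases above, the practical constraint is usually KV-cache memory rather than model quality. A back-of-envelope estimate, assuming the published Llama 3 70B architecture (80 decoder layers, 8 KV heads under grouped-query attention, head dimension 128) and fp16/bf16 cache storage:

```python
# KV-cache size estimate for a full ~1M-token context on Llama 3 70B.
# Architecture values are the published Llama 3 70B numbers.
LAYERS, KV_HEADS, HEAD_DIM = 80, 8, 128
BYTES_PER_VALUE = 2          # fp16 / bf16
CONTEXT = 1_048_576          # ~1M tokens, per the model name

# K and V each store KV_HEADS * HEAD_DIM values per layer per token.
bytes_per_token = 2 * LAYERS * KV_HEADS * HEAD_DIM * BYTES_PER_VALUE
cache_bytes = bytes_per_token * CONTEXT

print(f"{bytes_per_token} bytes/token -> {cache_bytes / 2**30:.0f} GiB at full context")
# -> 327680 bytes/token -> 320 GiB at full context
```

So filling the entire window needs on the order of 320 GiB for the cache alone (before weights), which is why serving setups typically rely on quantized KV caches, tensor parallelism across several GPUs, or far shorter effective contexts.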