gradientai/Llama-3-70B-Instruct-Gradient-262k
TEXT GENERATION | Concurrency Cost: 4 | Model Size: 70B | Quant: FP8 | Ctx Length: 8k | Published: May 3, 2024 | License: llama3 | Architecture: Transformer | 0.1K Warm

The gradientai/Llama-3-70B-Instruct-Gradient-262k model, developed by Gradient, is an instruction-tuned Llama 3 70B model with an extended context length of over 262,000 tokens, significantly surpassing the base model's 8k context. This extension is achieved through NTK-aware interpolation and progressive training on augmented data, making it highly suitable for applications requiring deep understanding and processing of very long documents and conversations. It demonstrates that state-of-the-art LLMs can adapt to long contexts with minimal additional training.
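As a rough illustration of the NTK-aware interpolation mentioned above, the sketch below applies the commonly cited NTK-aware rule, which enlarges the RoPE base so that the lowest rotary frequency is stretched by the context scale factor while the highest frequency is left untouched. The numbers used here are assumptions for illustration: a RoPE base of 500,000 and a head dimension of 128 for Llama 3 70B, and a scale factor of 262,144 / 8,192 = 32. The function names are hypothetical, and the RoPE theta values Gradient actually trained with are not stated on this page, so this should not be read as their exact recipe.

```python
# Minimal sketch of NTK-aware RoPE base scaling (general formula, not
# Gradient's published configuration).

def ntk_scaled_rope_base(base: float, scale: float, head_dim: int) -> float:
    """Return the enlarged RoPE base for a given context scale factor.

    Uses the commonly cited NTK-aware rule:
        new_base = base * scale ** (head_dim / (head_dim - 2))
    """
    return base * scale ** (head_dim / (head_dim - 2))


def rope_inv_freq(base: float, head_dim: int) -> list[float]:
    """Inverse rotary frequencies, one per pair of head dimensions."""
    return [base ** (-2 * i / head_dim) for i in range(head_dim // 2)]


# Assumed values for illustration only.
ORIGINAL_CTX = 8_192       # base model context length
TARGET_CTX = 262_144       # extended context length
ROPE_BASE = 500_000.0      # assumed Llama 3 rope_theta
HEAD_DIM = 128             # assumed per-head dimension

scale = TARGET_CTX / ORIGINAL_CTX
new_base = ntk_scaled_rope_base(ROPE_BASE, scale, HEAD_DIM)

old_freqs = rope_inv_freq(ROPE_BASE, HEAD_DIM)
new_freqs = rope_inv_freq(new_base, HEAD_DIM)

if __name__ == "__main__":
    print(f"scale factor: {scale:g}")
    print(f"scaled RoPE base: {new_base:,.0f}")
    print(f"lowest-frequency ratio (new/old): {new_freqs[-1] / old_freqs[-1]:.6f}")
```

With this rule, short-range (high-frequency) position information is preserved, while the slowest-rotating dimension is slowed by the full scale factor so that positions up to the target length stay within the range seen at the original length. In practice a scaled base like this serves only as a starting point for further training.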

Popular Sampler Settings

Top 3 parameter combinations used by Featherless users for this model.

temperature
top_p
top_k
frequency_penalty
presence_penalty
repetition_penalty
min_p
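
The parameter names listed above are the kind of sampler settings typically sent alongside a text generation request. The sketch below shows one way to collect them into a request fragment, including only the values that are explicitly set and rejecting obviously invalid ones. The helper name `build_sampler_params` is hypothetical, the range checks are general conventions rather than documented limits for this service, and the example values are placeholders, not the popular settings referred to above (those values are not shown on this page).

```python
from typing import Optional


def build_sampler_params(
    temperature: Optional[float] = None,
    top_p: Optional[float] = None,
    top_k: Optional[int] = None,
    frequency_penalty: Optional[float] = None,
    presence_penalty: Optional[float] = None,
    repetition_penalty: Optional[float] = None,
    min_p: Optional[float] = None,
) -> dict:
    """Collect sampler settings into a dict, skipping anything left unset."""
    candidates = {
        "temperature": temperature,
        "top_p": top_p,
        "top_k": top_k,
        "frequency_penalty": frequency_penalty,
        "presence_penalty": presence_penalty,
        "repetition_penalty": repetition_penalty,
        "min_p": min_p,
    }
    params = {name: value for name, value in candidates.items() if value is not None}

    # Basic sanity checks (assumed conventions, not service-specific limits).
    if "temperature" in params and params["temperature"] < 0:
        raise ValueError("temperature must be non-negative")
    for name in ("top_p", "min_p"):
        if name in params and not 0.0 <= params[name] <= 1.0:
            raise ValueError(f"{name} must be between 0 and 1")
    return params


if __name__ == "__main__":
    # Placeholder values for illustration only.
    print(build_sampler_params(temperature=0.7, top_p=0.9))
```

Leaving a parameter unset keeps it out of the request entirely, so the serving side can apply its own default for that sampler.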