gradientai/Llama-3-70B-Instruct-Gradient-262k
TEXT GENERATIONConcurrency Cost:4Model Size:70BQuant:FP8Ctx Length:8kPublished:May 3, 2024License:llama3Architecture:Transformer0.1K Warm
The gradientai/Llama-3-70B-Instruct-Gradient-262k model, developed by Gradient, is an instruction-tuned Llama 3 70B model with an extended context length of over 262,000 tokens, significantly surpassing the base model's 8k context. This extension is achieved through NTK-aware interpolation and progressive training on augmented data, making it highly suitable for applications requiring deep understanding and processing of very long documents and conversations. It demonstrates that state-of-the-art LLMs can adapt to long contexts with minimal additional training.
Loading preview...
Popular Sampler Settings
Top 3 parameter combinations used by Featherless users for this model. Click a tab to see each config.
temperature
–
top_p
–
top_k
–
frequency_penalty
–
presence_penalty
–
repetition_penalty
–
min_p
–