gradientai/Llama-3-8B-Instruct-Gradient-1048k
Text Generation
Concurrency Cost: 1
Model Size: 8B
Quant: FP8
Ctx Length: 8k
Published: Apr 29, 2024
License: llama3
Architecture: Transformer
0.7K Warm

Llama-3-8B-Instruct-Gradient-1048k is an 8-billion-parameter, instruction-tuned Llama 3 model developed by Gradient. It extends the context window of the base Llama 3 8B from 8K to over 1 million tokens (1048K) through RoPE theta optimization and progressive training. The model is aimed at long-context retrieval and Q&A tasks, where it performs strongly relative to other small-parameter models.
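To illustrate why raising the RoPE theta helps with longer contexts, here is a minimal sketch of the standard rotary-embedding frequency formula. It is not Gradient's training code. The head dimension of 128 and the base theta of 500,000 match the published Llama 3 8B configuration; the larger theta is a purely illustrative value, not the one Gradient used.

```python
import math


def rope_inv_freq(head_dim: int, theta: float) -> list[float]:
    """Rotation frequency for each pair of dimensions: theta ** (-2i / head_dim)."""
    return [theta ** (-2 * i / head_dim) for i in range(head_dim // 2)]


def longest_wavelength(head_dim: int, theta: float) -> float:
    """Positions needed for the slowest-rotating pair to complete one full turn."""
    return 2 * math.pi / rope_inv_freq(head_dim, theta)[-1]


HEAD_DIM = 128                 # Llama 3 8B: hidden size 4096 / 32 attention heads
BASE_THETA = 500_000.0         # rope_theta in the published Llama 3 8B config
ILLUSTRATIVE_THETA = 1e9       # assumption for illustration only

print(f"base theta:   {longest_wavelength(HEAD_DIM, BASE_THETA):,.0f} positions")
print(f"larger theta: {longest_wavelength(HEAD_DIM, ILLUSTRATIVE_THETA):,.0f} positions")
```

A larger theta slows every rotation except the first, so distant positions remain distinguishable over a longer span. Progressive training on longer sequences is still needed for the model to make use of that range.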


Popular Sampler Settings

Top 3 parameter combinations used by Featherless users for this model.

temperature
top_p
top_k
frequency_penalty
presence_penalty
repetition_penalty
min_p
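
As a rough guide to what four of these parameters do, the sketch below applies temperature, top_k, top_p, and min_p to a small list of logits. It is a simplified illustration, not Featherless's or any inference engine's implementation: the function name `filter_probs` is hypothetical, each filter is evaluated against the same temperature-scaled distribution and the results are intersected, and the three penalty parameters (which depend on previously generated tokens) are left out.

```python
import math


def filter_probs(logits, temperature=1.0, top_k=0, top_p=1.0, min_p=0.0):
    """Return {token_index: probability} for the tokens that survive filtering."""
    # Temperature: divide logits before the softmax (lower = sharper distribution).
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    probs = [e / total for e in exps]

    # Token indices sorted from most to least likely.
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    keep = set(order)

    # top_k: keep only the k most likely tokens (0 disables the filter).
    if top_k > 0:
        keep &= set(order[:top_k])

    # top_p: keep the smallest prefix whose cumulative probability reaches top_p.
    if top_p < 1.0:
        cumulative, nucleus = 0.0, set()
        for i in order:
            nucleus.add(i)
            cumulative += probs[i]
            if cumulative >= top_p:
                break
        keep &= nucleus

    # min_p: drop tokens less likely than min_p times the top token's probability.
    if min_p > 0.0:
        floor = min_p * probs[order[0]]
        keep &= {i for i in order if probs[i] >= floor}

    # Renormalize so the surviving probabilities sum to 1.
    kept_total = sum(probs[i] for i in keep)
    return {i: probs[i] / kept_total for i in keep}


# Illustrative values only, not the settings shown on this page.
print(filter_probs([2.0, 1.0, 0.0, -1.0], temperature=0.8, top_k=3, top_p=0.9, min_p=0.05))
```

The most likely token always survives every filter, so the result is never empty as long as top_p is positive and min_p is at most 1.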