astronomer/Llama-3-70B-Special-Tokens-Adjusted
TEXT GENERATION | Concurrency Cost: 4 | Model Size: 70B | Quant: FP8 | Ctx Length: 8k | Published: Jul 23, 2024 | License: llama-3 | Architecture: Transformer | 0.0K Warm

astronomer/Llama-3-70B-Special-Tokens-Adjusted is a 70-billion-parameter Llama 3 model from Astronomer in which the input and output embedding weights of previously untrained special tokens have been adjusted. The change targets fine-tuning instabilities such as gradient explosions and NaN gradients, which makes the model a more stable base for further fine-tuning, particularly when adding custom tokens or using the existing special tokens. The architecture is unchanged from the original Llama 3, and the context length is 8192 tokens.
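The description above does not say how the embedding rows were adjusted. As a rough illustration only, the sketch below assumes that untrained rows are (near) all-zero and that they are replaced with the mean of the remaining rows, which is one common way to re-initialize such rows. The function names, the `eps` threshold, and the toy matrix are all hypothetical, and plain Python lists stand in for real weight tensors.

```python
def find_untrained_rows(embeddings, eps=1e-8):
    """Return indices of rows whose values are all within eps of zero.

    Assumption: an untrained token shows up as a (near) all-zero row.
    """
    return [i for i, row in enumerate(embeddings)
            if all(abs(x) <= eps for x in row)]


def mean_row(embeddings, skip):
    """Column-wise mean over every row whose index is not in `skip`."""
    skip = set(skip)
    kept = [row for i, row in enumerate(embeddings) if i not in skip]
    if not kept:
        raise ValueError("no trained rows to average")
    dim = len(kept[0])
    return [sum(row[j] for row in kept) / len(kept) for j in range(dim)]


def reset_untrained_rows(embeddings, eps=1e-8):
    """Return (new_matrix, untrained_indices); the input is not modified."""
    untrained = find_untrained_rows(embeddings, eps)
    if not untrained:
        return [list(row) for row in embeddings], []
    mean = mean_row(embeddings, untrained)
    marked = set(untrained)
    fixed = [list(mean) if i in marked else list(row)
             for i, row in enumerate(embeddings)]
    return fixed, untrained


# Toy 4-token, 2-dimensional embedding matrix; row 2 plays the untrained token.
toy = [[1.0, 2.0], [3.0, 4.0], [0.0, 0.0], [5.0, 0.0]]
fixed, idx = reset_untrained_rows(toy)
print(idx)       # indices of the rows that were replaced
print(fixed[2])  # the replaced row now holds the mean of the other rows
```

In a real model the same idea would be applied to both the input embedding matrix and the output projection, since the description says both sets of weights were adjusted.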


Popular Sampler Settings

Top 3 parameter combinations used by Featherless users for this model.

temperature
top_p
top_k
frequency_penalty
presence_penalty
repetition_penalty
min_p
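
For readers unfamiliar with these parameters, the following is a minimal sketch of how temperature, top_k, top_p, and min_p are commonly understood to shape a next-token distribution. It is not Featherless's implementation: the function names and the toy probabilities are hypothetical, each filter is applied to the same original distribution for simplicity (real samplers often chain them in a backend-specific order), and the three penalty parameters are not covered.

```python
import math


def softmax(logits, temperature=1.0):
    """Convert logits to probabilities; lower temperature sharpens them."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def filter_probs(probs, top_k=0, top_p=1.0, min_p=0.0):
    """Zero out tokens rejected by any filter, then renormalize.

    Assumes 0 <= min_p <= 1 and 0 < top_p <= 1, so the most likely
    token always survives. top_k=0 disables the top_k filter.
    """
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    keep = set(order)

    if top_k > 0:
        # Keep only the k most likely tokens.
        keep &= set(order[:top_k])

    if top_p < 1.0:
        # Keep the smallest prefix whose cumulative probability reaches top_p.
        nucleus, cumulative = [], 0.0
        for i in order:
            nucleus.append(i)
            cumulative += probs[i]
            if cumulative >= top_p:
                break
        keep &= set(nucleus)

    if min_p > 0.0:
        # Keep tokens at least min_p times as likely as the top token.
        threshold = min_p * probs[order[0]]
        keep &= {i for i in order if probs[i] >= threshold}

    total = sum(probs[i] for i in keep)
    return [probs[i] / total if i in keep else 0.0
            for i in range(len(probs))]


# Toy distribution over four tokens, with placeholder settings.
toy_probs = [0.5, 0.3, 0.15, 0.05]
filtered = filter_probs(toy_probs, top_k=3, top_p=0.7, min_p=0.2)
print(filtered)
```

With these placeholder settings, top_p is the tightest filter, so only the two most likely tokens keep non-zero probability after renormalization.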