trinhvanhung/Meta-Llama-3.1-8B-Instruct-Q4_K_M
TEXT GENERATION
Concurrency Cost: 1 · Model Size: 8B · Quant: FP8 · Ctx Length: 32k · Published: Dec 22, 2024 · License: llama3.1 · Architecture: Transformer · Cold

Meta-Llama-3.1-8B-Instruct-Q4_K_M is an 8-billion-parameter instruction-tuned generative language model from Meta's Llama 3.1 collection. It uses an optimized transformer architecture with Grouped-Query Attention and a 128k context length, and was trained on over 15 trillion tokens with a December 2023 knowledge cutoff. The model is optimized for multilingual dialogue, outperforms many open-source and closed chat models on common industry benchmarks, and supports tool use.


Popular Sampler Settings

Top 3 parameter combinations used by Featherless users for this model.

temperature
top_p
top_k
frequency_penalty
presence_penalty
repetition_penalty
min_p
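The sampler parameters listed above map directly onto fields of an OpenAI-compatible chat-completions request body. Below is a minimal sketch of assembling such a payload; the preset values are illustrative placeholders, not Featherless defaults, and the extended fields (`top_k`, `repetition_penalty`, `min_p`) are common inference-server extensions rather than part of the official OpenAI schema.

```python
import json

# Hypothetical sampler preset — placeholder values, not Featherless defaults.
SAMPLER_PRESET = {
    "temperature": 0.7,
    "top_p": 0.9,
    "top_k": 40,
    "frequency_penalty": 0.0,
    "presence_penalty": 0.0,
    "repetition_penalty": 1.1,  # extension field; not in the official OpenAI API
    "min_p": 0.05,              # extension field; not in the official OpenAI API
}

def build_payload(prompt: str, preset: dict) -> dict:
    """Assemble an OpenAI-compatible chat-completions body with sampler settings."""
    return {
        "model": "trinhvanhung/Meta-Llama-3.1-8B-Instruct-Q4_K_M",
        "messages": [{"role": "user", "content": prompt}],
        **preset,
    }

payload = build_payload("Hello!", SAMPLER_PRESET)
print(json.dumps(payload, indent=2))
```

Sending this payload to the provider's chat-completions endpoint (with an API key) applies the chosen sampler combination to the request.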