The dganochenko/llama-3-8b-chat model is an 8-billion-parameter instruction-tuned generative text model from Meta's Llama 3 family. Optimized for dialogue use cases, it uses a transformer architecture with Grouped-Query Attention (GQA) for improved inference scalability, was trained on over 15 trillion tokens of publicly available data, and supports an 8k-token context length. It is well suited to assistant-like chat applications and outperforms many open-source chat models on common industry benchmarks.
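Because the model is instruction-tuned for dialogue, prompts should follow the Llama 3 instruct chat template. The helper below is a minimal sketch of that template; the function name and message list are illustrative, but the special tokens (`<|begin_of_text|>`, `<|start_header_id|>`, `<|eot_id|>`) are part of the Llama 3 tokenizer.

```python
# Minimal sketch of the Llama 3 instruct prompt format this model expects.
# The roles ("system", "user", "assistant") follow the standard chat schema.

def build_llama3_prompt(messages):
    """Render a list of {"role": ..., "content": ...} dicts into a Llama 3 chat prompt."""
    prompt = "<|begin_of_text|>"
    for msg in messages:
        prompt += (
            f"<|start_header_id|>{msg['role']}<|end_header_id|>\n\n"
            f"{msg['content']}<|eot_id|>"
        )
    # Leave the prompt open at the assistant header so the model generates the reply.
    prompt += "<|start_header_id|>assistant<|end_header_id|>\n\n"
    return prompt

prompt = build_llama3_prompt([
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "What is Grouped-Query Attention?"},
])
print(prompt)
```

In practice, inference libraries such as Hugging Face `transformers` apply this template automatically via the tokenizer's `apply_chat_template` method, so manual formatting is only needed when calling the model through a raw text-completion interface.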