nvidia/Llama3-ChatQA-1.5-8B
TEXT GENERATION
Concurrency Cost: 1
Model Size: 8B
Quant: FP8
Ctx Length: 8k
Published: Apr 28, 2024
License: llama3
Architecture: Transformer
0.6K Warm

NVIDIA's Llama3-ChatQA-1.5-8B is an 8-billion-parameter conversational language model built on Llama-3 and optimized for conversational question answering (QA) and retrieval-augmented generation (RAG). It was developed using an improved training recipe from the ChatQA paper, which enhances its tabular and arithmetic calculation capabilities. The model is particularly strong when context is supplied alongside the question, for example source documents or retrieved passages, and it has a context length of 8192 tokens.
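
Because the model answers from supplied context within an 8192-token window, a RAG caller has to decide how many retrieved passages fit. The following is a minimal sketch of that budgeting step, not the model's official usage. The `build_prompt` and `count_tokens` helpers, the `System:`/`User:`/`Assistant:` layout, the system message, and the 256-token generation reserve are all assumptions made for illustration; the whitespace token count is only a crude stand-in for the model's real tokenizer.

```python
CONTEXT_LENGTH = 8192  # context length stated in the model description


def count_tokens(text):
    """Crude token estimate (whitespace split); an assumption, not the real tokenizer."""
    return len(text.split())


def build_prompt(question, passages, max_new_tokens=256,
                 context_length=CONTEXT_LENGTH,
                 system="Answer the question using only the context."):
    """Pack ranked passages into a context-grounded prompt without exceeding the window.

    The prompt layout here is hypothetical; check the model's own
    documentation for the format it was trained on.
    """
    header = f"System: {system}"
    footer = f"User: {question}\n\nAssistant:"

    # Reserve room for the generated answer, then account for the fixed parts.
    budget = context_length - max_new_tokens
    used = count_tokens(header) + count_tokens(footer)

    kept = []
    for passage in passages:  # passages are assumed to be sorted best-first
        cost = count_tokens(passage)
        if used + cost > budget:
            break  # stop at the first passage that does not fit, preserving rank order
        kept.append(passage)
        used += cost

    parts = [header] + kept + [footer]
    return "\n\n".join(parts), kept


if __name__ == "__main__":
    docs = ["Q1 revenue was 10 and Q2 revenue was 15.", "Unrelated note."]
    prompt, kept = build_prompt("What was total revenue for Q1 and Q2?", docs)
    print(prompt)
```

In practice the whitespace estimate would be replaced with a count from the model's actual tokenizer, since real token counts are usually higher than word counts.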
