# BRAG-Llama-3.1-8b-v0.1: RAG-Optimized SLM
BRAG-Llama-3.1-8b-v0.1 is an 8-billion-parameter Small Language Model (SLM) in the maximalists' BRAG series, built specifically for Retrieval-Augmented Generation (RAG) tasks. It supports an extended context length of up to 128k tokens, making it suitable for processing substantial amounts of retrieved information.
## Key Capabilities
- RAG with Diverse Data: Proficient in RAG tasks involving both structured (tables) and unstructured (text) data.
- Conversational RAG: Optimized for integrating RAG into conversational chat applications.
- Extended Context: Features a 128k token context window, allowing for comprehensive information retrieval and generation.
- English-Centric: Primarily trained and evaluated for English, leveraging the base model's multilingual foundation.
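The conversational-RAG pattern above amounts to packing retrieved context (tables or text) into the prompt alongside the user's question. A minimal sketch of assembling such a prompt is shown below; the helper name and the system-prompt wording are illustrative placeholders, and in practice the exact recommended system prompt from the official model card should be used verbatim.

```python
from typing import Dict, List


def build_rag_messages(context_chunks: List[str], question: str) -> List[Dict[str, str]]:
    """Assemble a chat-style message list that places retrieved context
    (structured tables or unstructured text) in the system turn and the
    user's question in the user turn."""
    # Placeholder instruction; substitute the model card's recommended prompt.
    system_prompt = (
        "You are an assistant that answers strictly from the provided context.\n\n"
        "Context:\n" + "\n\n".join(context_chunks)
    )
    return [
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": question},
    ]


messages = build_rag_messages(
    ["| Year | Revenue |\n| 2023 | $10M |", "The company was founded in 2019."],
    "What was revenue in 2023?",
)
# The resulting messages can then be passed to a chat-template tokenizer
# (e.g. Hugging Face transformers' tokenizer.apply_chat_template).
```

This keeps retrieval and generation decoupled: the retriever decides *what* context to include, while the message builder only decides *where* it goes in the prompt.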
## Performance Highlights
The model demonstrates competitive performance on the ChatRAG-Bench, scoring 52.29. This places it favorably against other SLMs and even some larger LLMs in RAG-specific evaluations. For instance, it outperforms BRAG-Llama-3-8b-v0.1 (51.70) and is close to BRAG-Qwen2-7b-v0.1 (53.23) and GPT-4-Turbo (54.03) on this benchmark.
## Limitations
Although the model exposes a 128k-token context window, it was fine-tuned on comparatively short sequences and may not perform optimally on very long inputs. Using the recommended system prompt is also crucial; deviating from it can lead to underperformance and hallucinations.
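Given the short-context fine-tuning noted above, it can help to cap how much retrieved text is packed into each prompt. The sketch below uses a simple character budget as a stand-in; a real pipeline would count tokens with the model's tokenizer instead, and the budget value here is an arbitrary assumption, not a figure from the model card.

```python
from typing import List


def truncate_chunks(chunks: List[str], char_budget: int = 8000) -> List[str]:
    """Keep whole retrieved chunks, in ranked order, until the budget is
    exhausted; drop the remainder rather than splitting a chunk mid-way."""
    kept: List[str] = []
    used = 0
    for chunk in chunks:
        if used + len(chunk) > char_budget:
            break  # stop at the first chunk that would overflow the budget
        kept.append(chunk)
        used += len(chunk)
    return kept
```

Keeping chunks whole (rather than slicing them) preserves table rows and sentences intact, which matters for a model that answers from structured context.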