Surromind/RetrievalLLM-preview

Text Generation · Concurrency Cost: 1 · Model Size: 14.8B · Quant: FP8 · Ctx Length: 32k · Published: Mar 21, 2025 · License: apache-2.0 · Architecture: Transformer · Open Weights

Surromind/RetrievalLLM-preview is a 14.8 billion parameter Qwen2.5-based model fine-tuned by Surromind for Retrieval Augmented Generation (RAG) tasks. It excels at generating accurate answers with explicit source citations in a structured JSON format, making it ideal for applications requiring grounded responses from provided documents. The model was trained on a specialized dataset including RAG, CoT, and benchmark data, focusing on precise information retrieval and structured output.


Surromind/RetrievalLLM-preview: RAG-Specialized Qwen2.5 Model

Surromind/RetrievalLLM-preview is a 14.8 billion parameter model built upon the Qwen2.5 architecture, specifically fine-tuned for Retrieval Augmented Generation (RAG) tasks. Its core strength lies in providing accurate answers and their corresponding sources from input documents, formatted as a structured JSON output.

Key Capabilities

  • Grounded Responses: Generates answers directly supported by provided documents.
  • Source Citation: Automatically includes the doc_id and the exact quoted passage (source) for verification.
  • Structured Output: Delivers responses in a predefined JSON format, including related_document, source, answer (plain), and grounded_answer (with inline citations).
  • Specialized Training: Fine-tuned using a proprietary dataset combining RAG-specific data, Chain-of-Thought (CoT) examples, and various machine reading comprehension benchmarks (AIhub datasets).
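The JSON fields listed above suggest a response shape like the following. This is a minimal sketch based only on the field names in this card; the exact schema, nesting, and inline-citation markup are assumptions, and the sample response is invented for illustration:

```python
import json

# Fields this card says appear in the model's structured output.
EXPECTED_KEYS = {"related_document", "source", "answer", "grounded_answer"}

def parse_rag_response(raw: str) -> dict:
    """Parse the model's JSON output and check for the documented fields.

    The flat schema assumed here is inferred from the field names in
    this card; the real model output may differ in shape or key names.
    """
    data = json.loads(raw)
    missing = EXPECTED_KEYS - data.keys()
    if missing:
        raise ValueError(f"response missing expected fields: {sorted(missing)}")
    return data

# A hypothetical model response, for illustration only.
example = json.dumps({
    "related_document": "doc_3",
    "source": "Revenue grew 12% year over year.",
    "answer": "Revenue grew 12% year over year.",
    "grounded_answer": "Revenue grew 12% year over year [doc_3].",
})
```

Validating the output this way lets an application reject malformed generations before surfacing citations to users.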

Training Details

The model was trained on eight 80GB H100 GPUs with a tokenizer max length of 4,500 and a learning rate of 5e-06. Training data included AIhub's administrative, news, book, table, numerical, and financial/legal machine reading comprehension datasets, alongside Korean CoT and instruction datasets.

Ideal Use Cases

This model is particularly well-suited for applications requiring high-precision information extraction and verifiable answers from a given corpus, such as enterprise knowledge bases, legal document analysis, or customer support systems where source attribution is critical.
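As a sketch of how such a corpus-grounded workflow might feed the model, the snippet below assembles a prompt from doc_id-tagged documents. The tagging convention and instruction wording are illustrative assumptions, not the model's documented prompt format:

```python
def build_rag_prompt(question: str, documents: dict[str, str]) -> str:
    """Assemble a prompt presenting doc_id-tagged documents plus a question.

    The "[doc_id] text" layout is a hypothetical convention for
    illustration; consult the model's chat template for the real format.
    """
    doc_block = "\n".join(
        f"[{doc_id}] {text}" for doc_id, text in documents.items()
    )
    return (
        "Answer using only the documents below, citing doc_id.\n\n"
        f"{doc_block}\n\nQuestion: {question}"
    )

prompt = build_rag_prompt(
    "What was the revenue growth?",
    {"doc_1": "Revenue grew 12% year over year.",
     "doc_2": "Headcount stayed flat."},
)
```

Keeping the doc_id visible in the prompt is what lets the model echo it back in the related_document and grounded_answer fields.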