Surromind/RetrievalLLM-preview: RAG-Specialized Qwen2.5 Model
Surromind/RetrievalLLM-preview is a 14.8-billion-parameter model built on the Qwen2.5 architecture and fine-tuned specifically for Retrieval-Augmented Generation (RAG) tasks. Its core strength is returning accurate answers, together with their supporting sources from the input documents, as structured JSON output.
Key Capabilities
- Grounded Responses: Generates answers directly supported by provided documents.
- Source Citation: Automatically includes doc_id and exact quoted passages (source) for verification.
- Structured Output: Delivers responses in a predefined JSON format, including related_document, source, answer (plain), and grounded_answer (with inline citations).
- Specialized Training: Fine-tuned on a proprietary dataset combining RAG-specific data, Chain-of-Thought (CoT) examples, and various machine reading comprehension benchmarks (AIhub datasets).
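A minimal sketch of consuming the structured output described above. The exact schema beyond the four documented keys (related_document, source, answer, grounded_answer, and doc_id within source) is an assumption for illustration, not the model's confirmed format.

```python
import json

# Hypothetical example of the model's JSON response; field contents
# and nesting beyond the documented key names are assumptions.
raw_output = """
{
  "related_document": ["doc_1"],
  "source": [{"doc_id": "doc_1", "source": "Quoted passage from the document."}],
  "answer": "A plain answer.",
  "grounded_answer": "A plain answer.[doc_1]"
}
"""

def parse_rag_response(text: str) -> dict:
    """Parse the model's JSON output and verify the documented keys exist."""
    data = json.loads(text)
    for key in ("related_document", "source", "answer", "grounded_answer"):
        if key not in data:
            raise ValueError(f"missing key: {key}")
    return data

response = parse_rag_response(raw_output)
```

Validating the keys up front makes downstream source attribution (matching each cited doc_id back to the retrieved documents) straightforward.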
Training Details
The model was trained on eight H100 GPUs (80 GB each) with a tokenizer maximum length of 4,500 tokens and a learning rate of 5e-06. Training data included AIhub's administrative, news, book, table, numerical, and financial/legal machine reading comprehension datasets, alongside Korean CoT and instruction datasets.
Ideal Use Cases
This model is particularly well-suited for applications requiring high-precision information extraction and verifiable answers from a given corpus, such as enterprise knowledge bases, legal document analysis, or customer support systems where source attribution is critical.
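For corpus-grounded use cases like these, retrieved documents must be packed into the prompt with identifiers the model can cite. The template and doc_id convention below are illustrative assumptions, not the model's documented input format.

```python
def build_rag_prompt(question: str, documents: list[dict]) -> str:
    """Concatenate retrieved documents, each tagged with its doc_id,
    followed by the user question. Hypothetical template for illustration."""
    parts = [f"[{doc['doc_id']}]\n{doc['text']}" for doc in documents]
    context = "\n\n".join(parts)
    return f"Documents:\n{context}\n\nQuestion: {question}"

docs = [
    {"doc_id": "doc_1", "text": "Seoul is the capital of South Korea."},
    {"doc_id": "doc_2", "text": "Busan is a major port city."},
]
prompt = build_rag_prompt("What is the capital of South Korea?", docs)
```

Tagging each passage with its doc_id is what lets the model emit verifiable citations in grounded_answer.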