yaoyueduzhen/RAG-R1-mq-7b

Hugging Face
Text Generation · Model size: 7.6B · Quantization: FP8 · Context length: 32k · Published: Jul 3, 2025 · License: apache-2.0 · Architecture: Transformer · Open weights

RAG-R1-mq-7b is a Retrieval-Augmented Generation (RAG) model developed by Zhiwen Tan, Jiaming Huang, Qintong Wu, Hongxuan Zhang, Chenyi Zhuang, and Jinjie Gu. The underlying RAG-R1 framework enhances LLMs by letting them adaptively draw on both internal and external knowledge, and expands generation and retrieval to multi-query parallelism. It outperforms strong baselines by up to 13.2% on QA benchmarks while reducing inference time by 11.1%.


Overview of RAG-R1-mq-7b

RAG-R1-mq-7b introduces a deepsearch training framework that allows Large Language Models (LLMs) to adaptively integrate both internal and external knowledge during their reasoning processes. Its key innovation is expanding the generation and retrieval processes from a single-query mode to multi-query parallelism, so the model can issue and answer several retrieval queries at once rather than one at a time.
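The retrieval side of multi-query parallelism can be illustrated with a short sketch. This is not the authors' implementation; the toy `retrieve` function and its in-memory corpus are purely hypothetical stand-ins for a real search index or vector store, but the structure shows why issuing all queries concurrently pays retrieval latency once instead of once per query:

```python
from concurrent.futures import ThreadPoolExecutor

def retrieve(query: str) -> list[str]:
    # Hypothetical retriever: look up passages for a single query.
    # A real system would query a search engine or vector store here.
    corpus = {
        "capital of France": ["Paris is the capital of France."],
        "capital of Japan": ["Tokyo is the capital of Japan."],
    }
    return corpus.get(query, [])

def retrieve_parallel(queries: list[str]) -> list[list[str]]:
    # Issue all queries at once instead of one per reasoning step;
    # results come back in the same order as the input queries.
    with ThreadPoolExecutor(max_workers=len(queries)) as pool:
        return list(pool.map(retrieve, queries))

results = retrieve_parallel(["capital of France", "capital of Japan"])
```

In the single-query mode this framework replaces, each retrieval would block the next reasoning step; batching the queries is what yields the reported reduction in inference time.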

Key Capabilities & Differentiators

  • Adaptive Knowledge Use: Lets the model decide when to rely on its internal (parametric) knowledge and when to retrieve relevant external information.
  • Multi-Query Parallelism: Significantly reduces inference time by processing multiple queries in parallel during retrieval and generation.
  • Enhanced Reasoning: Improves the model's ability to reason by providing more comprehensive and timely access to information.
  • Performance Gains: Demonstrates superior performance on question-answering benchmarks, outperforming strong baselines by up to 13.2% in accuracy.
  • Efficiency Improvements: Achieves a notable reduction in inference time, decreasing it by 11.1% compared to previous methods.

Use Cases

This model is particularly well-suited for applications requiring:

  • Question Answering (QA): Excels in scenarios where accurate and contextually rich answers are critical.
  • Information Retrieval: Ideal for systems that need to efficiently search and synthesize information from large knowledge bases.
  • Reasoning Tasks: Benefits tasks that demand complex logical inference and the integration of diverse data points.

Inspired by projects like DeepSeek-R1, RAG-R1-mq-7b offers a robust solution for developers looking to enhance the knowledge-grounded capabilities and efficiency of their LLM applications.
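For developers who want to try the checkpoint, a minimal loading sketch is below. It assumes the model follows standard causal-LM conventions on the Hugging Face Hub (the model card does not specify a custom loading path), and the import is done lazily so the sketch can be read without `transformers` installed:

```python
MODEL_ID = "yaoyueduzhen/RAG-R1-mq-7b"

def load_model():
    # Lazy import: only needed when actually loading the checkpoint.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    # Downloads several GB of weights (7.6B parameters) on first call.
    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(
        MODEL_ID, torch_dtype="auto", device_map="auto"
    )
    return tokenizer, model
```

Note that multi-query retrieval orchestration is part of the surrounding RAG-R1 framework, not of the weights themselves, so serving the model this way gives you the trained policy but you must supply your own retriever.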