Seungyoun/qwen2.5-3b-it_searchR1-like-multiturn is a 3.1-billion-parameter instruction-tuned causal language model based on Qwen 2.5. Developed by Seungyoun Shin, it is a faithful re-implementation of Search-R1, optimized for multi-turn tool calling and question answering through web search integration. It targets complex reasoning tasks that require external knowledge retrieval, and shows improved performance on a range of QA datasets.
Model Overview
The model is built on the Qwen 2.5-3B-Instruct architecture and re-implements the Search-R1 approach, with a specific focus on multi-turn tool-calling capabilities. It was trained with GRPO via the open-source VERL framework on the nq-hotpotqa-train dataset.
Key Capabilities
- Integrated Web Search: The model can autonomously call a `search` function (e.g., backed by DuckDuckGo) to retrieve external information, parsing tool calls and integrating search results into its reasoning process.
- Multi-turn Reasoning: It is designed to handle complex, multi-turn interactions, performing chain-of-thought reasoning within `<think>` blocks before generating answers or making tool calls.
- Enhanced Question Answering: The model demonstrates improved performance on several question-answering datasets, including NQ, TriviaQA, PopQA, HotpotQA, 2Wiki, and Bamboogle, often surpassing the original Search-R1 implementation.
- Structured Output: It follows a consistent reasoning format, emitting JSON-formatted tool calls and providing concise final answers within `<answer>` blocks.
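To make the structured-output format concrete, here is a minimal sketch of how a single model turn could be parsed. The `<think>`/`<answer>` tags are described above, but the `<tool_call>` wrapper and the exact JSON schema (`name`/`arguments` keys) are assumptions inferred from the model card, not a verified specification:

```python
import json
import re


def parse_model_turn(text: str) -> dict:
    """Split a model turn into reasoning, an optional tool call, and an optional answer.

    The <tool_call> tag and its JSON shape are illustrative assumptions,
    not a schema confirmed by the model's authors.
    """
    result = {"think": None, "tool_call": None, "answer": None}

    think = re.search(r"<think>(.*?)</think>", text, re.DOTALL)
    if think:
        result["think"] = think.group(1).strip()

    call = re.search(r"<tool_call>(.*?)</tool_call>", text, re.DOTALL)
    if call:
        result["tool_call"] = json.loads(call.group(1))

    answer = re.search(r"<answer>(.*?)</answer>", text, re.DOTALL)
    if answer:
        result["answer"] = answer.group(1).strip()

    return result


# Example turn in which the model decides to search rather than answer.
turn = (
    "<think>I need the capital of France; let me search.</think>\n"
    '<tool_call>{"name": "search", "arguments": {"query": "capital of France"}}</tool_call>'
)
parsed = parse_model_turn(turn)
```

Keeping the three fields separate makes it easy for a driver loop to decide whether to execute a tool call or stop and return the final answer.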
Ideal Use Cases
This model is particularly well-suited for applications requiring:
- Complex Question Answering: Answering questions that necessitate external knowledge retrieval and multi-step reasoning.
- Tool-Augmented LLM Applications: Scenarios where an LLM needs to interact with external tools or APIs to gather information.
- Research and Development: As a strong baseline or starting point for further research into tool-calling, reasoning, and knowledge-augmented language models, especially within the Qwen 2.5 ecosystem.
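For tool-augmented applications like those above, the model would typically sit inside a driver loop that generates a turn, executes any search call, and feeds the result back until an answer appears. The sketch below uses stub functions in place of real generation and a real DuckDuckGo backend; the `<tool_call>`/`<tool_response>` tags and all function names here are illustrative assumptions, not the model's published API:

```python
import json
import re


def run_agent(model_step, search_fn, question: str, max_turns: int = 4) -> str:
    """Multi-turn tool-calling loop: generate, run any search call,
    append the result to the transcript, and stop at an <answer> block.
    Tag names and the JSON tool-call shape are assumptions for illustration."""
    transcript = question
    for _ in range(max_turns):
        turn = model_step(transcript)
        answer = re.search(r"<answer>(.*?)</answer>", turn, re.DOTALL)
        if answer:
            return answer.group(1).strip()
        call = re.search(r"<tool_call>(.*?)</tool_call>", turn, re.DOTALL)
        if call:
            args = json.loads(call.group(1))["arguments"]
            # Feed the retrieved information back for the next reasoning turn.
            transcript += turn + f"\n<tool_response>{search_fn(args['query'])}</tool_response>\n"
        else:
            transcript += turn  # no tool call; let the model keep reasoning
    return ""


# Stubs standing in for real model generation and a real search backend.
def fake_model(transcript: str) -> str:
    if "<tool_response>" not in transcript:
        return ('<think>I should search for this.</think>'
                '<tool_call>{"name": "search", '
                '"arguments": {"query": "capital of France"}}</tool_call>')
    return "<think>The search result answers it.</think><answer>Paris</answer>"


def fake_search(query: str) -> str:
    return "Paris is the capital of France."


result = run_agent(fake_model, fake_search, "What is the capital of France?")
```

In a real deployment, `model_step` would wrap a `transformers` generation call with the model's chat template, and `search_fn` would query a live search API.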