orbit-ai/searchr1-repro-4b
Text Generation | Concurrency Cost: 1 | Model Size: 4B | Quant: BF16 | Ctx Length: 32k | Published: Aug 27, 2025 | License: apache-2.0 | Architecture: Transformer | Open Weights | Cold

orbit-ai/searchr1-repro-4b is a 4-billion-parameter open search agent developed by orbit-ai and based on Qwen3-4B. It is fine-tuned with GRPO for multi-turn question answering with web search as a tool, and trained on the Natural Questions (NQ) and HotpotQA datasets. The model targets retrieval-augmented reasoning: it answers questions by issuing web searches as needed while it generates.


What is orbit-ai/searchr1-repro-4b?

This model is a 4-billion-parameter open search agent, a reproduction of the Search-R1 model, developed by orbit-ai. It uses Qwen3-4B as its base model and is fine-tuned with the GRPO (Group Relative Policy Optimization) algorithm. Its primary function is to act as a multi-turn question-answering agent that can use web search as a tool.
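As a rough illustration of the group-relative idea behind GRPO (Group Relative Policy Optimization): several answers are sampled for the same question, each gets a scalar reward, and each reward is standardized against the other answers in its group. The sketch below shows only that advantage step. The function name, the `eps` constant, and the use of the population standard deviation are assumptions made for illustration; the reward function and hyperparameters used to train this model are not described here.

```python
import statistics
from typing import List


def group_relative_advantages(rewards: List[float], eps: float = 1e-6) -> List[float]:
    """Standardize each reward against its own sampling group.

    `rewards` holds one scalar reward per sampled answer to the SAME
    question (assumed non-empty). `eps` is an illustrative constant that
    avoids division by zero when every answer gets the same reward.
    """
    mean = statistics.fmean(rewards)
    std = statistics.pstdev(rewards)  # population std; some implementations use sample std
    return [(r - mean) / (std + eps) for r in rewards]


# Hypothetical group of four sampled answers, scored 1.0 if correct else 0.0.
advantages = group_relative_advantages([1.0, 0.0, 0.0, 1.0])
```

Answers that score above their group's mean receive a positive advantage and the rest a negative one, so no separate value model is needed to provide a baseline.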

Key Capabilities & Training:

  • Tool-use: Designed to integrate live web search (via a DDGS-based retriever) for information gathering.
  • Multi-turn QA: Optimized for multi-turn question answering, covering single-hop factoid questions (as in NQ) and 2-hop reasoning questions (as in HotpotQA).
  • Training Data: Trained on a mix of Natural Questions (NQ) and HotpotQA datasets for 200 GRPO steps.
  • Architecture: Built on the Qwen3-4B base, making it a relatively small yet capable model for its specialized task.
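The tool-use loop described above can be sketched as follows. This is a minimal sketch under stated assumptions, not the project's actual inference code: it assumes the model follows the Search-R1 convention of wrapping queries in `<search>` tags, reading results from `<information>` tags, and giving its final answer in `<answer>` tags. The `generate` and `search` callables are hypothetical placeholders for a model call and a retriever (for example, a wrapper around a DDGS client), and `max_turns` is an illustrative limit.

```python
import re
from typing import Callable, List, Optional

# Tag names follow the Search-R1 convention; assumed, not confirmed, for this model.
SEARCH_RE = re.compile(r"<search>(.*?)</search>", re.DOTALL)
ANSWER_RE = re.compile(r"<answer>(.*?)</answer>", re.DOTALL)


def run_search_agent(
    question: str,
    generate: Callable[[str], str],      # hypothetical: transcript -> next model completion
    search: Callable[[str], List[str]],  # hypothetical: query -> list of text snippets
    max_turns: int = 4,                  # illustrative cap on model turns
) -> Optional[str]:
    """Alternate between model turns and retrieval until an answer appears."""
    transcript = f"Question: {question}\n"
    for _ in range(max_turns):
        completion = generate(transcript)
        transcript += completion

        # A final answer ends the episode.
        answer = ANSWER_RE.search(completion)
        if answer:
            return answer.group(1).strip()

        # Otherwise look for a search request and feed the results back.
        query = SEARCH_RE.search(completion)
        if query is None:
            break  # neither a search nor an answer: stop early
        snippets = search(query.group(1).strip())
        transcript += "\n<information>\n" + "\n".join(snippets) + "\n</information>\n"
    return None


# Usage with stand-in callables (no network access, no real model).
def fake_generate(transcript: str) -> str:
    if "<information>" in transcript:
        return "<answer>Paris</answer>"
    return "<search>capital of France</search>"


def fake_search(query: str) -> List[str]:
    return ["Paris is the capital of France."]


result = run_search_agent("What is the capital of France?", fake_generate, fake_search)
```

Passing the retriever in as a callable keeps the loop independent of any particular search backend, so it can be exercised offline with stand-ins like the ones shown.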

When to Use This Model:

  • Research: Ideal for research into multi-turn retrieval-augmented reasoning and RL-based tool-use training.
  • Retrieval-Augmented Generation (RAG): Suitable for applications where dynamic information retrieval is crucial for accurate answers.
  • English-only: Currently supports English language tasks.

Note: For optimal performance, this model requires a live web search backend. Without it, it relies solely on its parametric knowledge, which may degrade accuracy for fine-grained entity questions.