alireza7/GrepSeek-Qwen3.5-9B-GRPO

Vision | Concurrency Cost: 1 | Model Size: 9B | Quant: FP8 | Ctx Length: 32k | Tool Calling: Supported | Published: May 26, 2026 | License: apache-2.0 | Architecture: Transformer | Open Weights

GrepSeek-Qwen3.5-9B-GRPO by alireza7 is a 9-billion-parameter, Qwen3.5-based model with a 32,768-token context length, designed as a Direct Corpus Interaction (DCI) search agent. It answers questions by executing Unix shell commands directly against a raw 21M-passage Wikipedia corpus, interleaving retrieval and reasoning. The model is optimized for precise lexical search and compositional reasoning, performs particularly well on multi-hop question answering, and does not require an embedding index.


GrepSeek-Qwen3.5-9B-GRPO: A Direct Corpus Interaction Search Agent

This model, developed by alireza7, is a 9-billion-parameter, Qwen3.5-based search agent optimized with GRPO. GrepSeek uses a Direct Corpus Interaction (DCI) approach: it answers questions by issuing Unix shell commands (e.g., rg, grep) directly against a raw 21M-passage Wikipedia corpus. Retrieval and reasoning are interleaved, and no traditional index-based retrieval step is involved.
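To make the interleaving of retrieval and reasoning concrete, here is a minimal sketch of a two-hop lookup done purely by lexical scanning. Everything in it is illustrative: the three-line toy corpus, the "title | text" passage layout, and the `grep` helper are assumptions made for this example, not the model's actual corpus format or harness.

```python
import re

# Toy stand-in for the raw corpus: one passage per line.
# The "title | text" layout is an assumption for this sketch.
CORPUS = [
    "Ada Lovelace | Ada Lovelace was an English mathematician who worked with Charles Babbage.",
    "Charles Babbage | Charles Babbage designed the Analytical Engine.",
    "Analytical Engine | The Analytical Engine was a proposed mechanical computer.",
]


def grep(pattern, lines):
    """Return the lines matching a regex, in corpus order (grep-like)."""
    rx = re.compile(pattern)
    return [line for line in lines if rx.search(line)]


# Hop 1: retrieve the passage whose title is exactly "Ada Lovelace".
hop1 = grep(r"^Ada Lovelace \|", CORPUS)

# Reasoning step: pull the bridging entity out of the retrieved text.
bridge = re.search(r"worked with ([A-Z]\w+ [A-Z]\w+)", hop1[0]).group(1)

# Hop 2: use the bridging entity as an exact filter for the next search.
hop2 = grep(r"^" + re.escape(bridge) + r" \|", CORPUS)

print(bridge)
print(hop2[0])
```

Each hop is an exact filter over raw text, and the second pattern is built from what the first hop returned. No index is consulted at any point.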

Key Differentiators & Capabilities

  • Direct Corpus Interaction (DCI): Unlike dense or sparse index-based retrieval, GrepSeek operates directly on the raw corpus, preserving lexical precision and isolating exact entity names.
  • No Embedding Index Required: It eliminates the need for a pre-computed embedding index, relying only on the ~14 GB raw corpus.
  • Enhanced Controllability: The agent can enforce exact filters and iteratively refine search results using shell commands.
  • Compositional Reasoning: GrepSeek composes multi-stage retrieval programs for complex, multi-hop question answering.
  • Performance: Achieves a micro-average token-F1 of 0.5691 and EM of 0.4948 on the Search-R1 suite, outperforming baselines, especially on multi-hop tasks like HotpotQA, 2Wiki, and MuSiQue.
  • Accelerated Search: The inference harness includes a semantics-preserving sharded-parallel execution engine that accelerates corpus search by up to 7.6x.
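The idea behind a semantics-preserving sharded search can be sketched as follows: split the corpus into contiguous shards, scan them concurrently, and concatenate the per-shard hits in shard order so the output is identical to a serial scan. This is an illustration under assumptions, not the harness's actual engine; `search_serial`, `search_sharded`, and `num_shards` are hypothetical names. Python threads also will not reproduce the reported speed-up for regex work, so the sketch only demonstrates the ordering guarantee.

```python
import re
from concurrent.futures import ThreadPoolExecutor


def search_serial(pattern, lines):
    """Reference behaviour: (line_index, line) hits in corpus order."""
    rx = re.compile(pattern)
    return [(i, line) for i, line in enumerate(lines) if rx.search(line)]


def search_sharded(pattern, lines, num_shards=4):
    """Scan contiguous shards concurrently, then merge in shard order."""
    rx = re.compile(pattern)
    n = len(lines)
    size = max(1, -(-n // num_shards))  # ceiling division, at least 1
    bounds = [(start, min(start + size, n)) for start in range(0, n, size)]

    def scan(bound):
        start, end = bound
        return [(i, lines[i]) for i in range(start, end) if rx.search(lines[i])]

    with ThreadPoolExecutor(max_workers=num_shards) as pool:
        # Executor.map yields results in submission order, so concatenating
        # the parts reproduces the order of a single serial scan.
        parts = list(pool.map(scan, bounds))
    return [hit for part in parts for hit in part]
```

Because the merge follows shard order rather than completion order, any command that depends on match order (for example, taking the first few hits) sees the same result as it would without sharding.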

When to Use This Model

  • Complex Question Answering: Ideal for tasks requiring precise lexical matching, exact entity disambiguation, and iterative evidence aggregation, particularly multi-hop questions.
  • Tool-Using Agent Scenarios: The model is a tool-using agent that emits shell commands in <tool_call> turns. A harness must execute each command against the corpus and return the output in a <tool_response> turn. It is not a standalone chatbot.
  • Lexical Precision: When fine-grained entity and lexical distinctions are critical, and semantic smoothing is undesirable.
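Since the model depends on a harness to run its commands, a minimal sketch of that loop may help. It rests on several assumptions: the payload between the <tool_call> tags is treated as a raw command string (the real payload may be structured differently), the conversation is kept as a plain string rather than a chat template, and `generate`, `execute`, `run_agent`, and `max_turns` are hypothetical names for the caller's model and a sandboxed command runner.

```python
import re

TOOL_CALL_RE = re.compile(r"<tool_call>(.*?)</tool_call>", re.DOTALL)


def extract_tool_call(model_output):
    """Return the text inside the first <tool_call> block, or None."""
    match = TOOL_CALL_RE.search(model_output)
    return match.group(1).strip() if match else None


def wrap_tool_response(text, max_chars=4000):
    """Truncate command output and wrap it as a <tool_response> turn."""
    return "<tool_response>\n" + text[:max_chars] + "\n</tool_response>"


def run_agent(generate, execute, question, max_turns=8):
    """Alternate model turns and command execution until a final answer.

    generate(transcript) -> str   # hypothetical: calls the model
    execute(command) -> str       # hypothetical: runs the command in a sandbox
    """
    transcript = question
    for _ in range(max_turns):
        output = generate(transcript)
        transcript += "\n" + output
        command = extract_tool_call(output)
        if command is None:
            return output  # no tool call: treat the turn as the final answer
        transcript += "\n" + wrap_tool_response(execute(command))
    return None  # turn budget exhausted without a final answer
```

Keeping `execute` injectable makes it easy to run commands in a restricted sandbox, which matters because the model emits arbitrary shell text.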

Limitations

  • Surface-Form Variation: Weaker on queries with significant surface-form variation or long-tail characteristics (e.g., diacritics, name variants) due to its purely lexical retrieval.
  • No Semantic Relevance Ranking: grep does not offer semantic relevance ranking, meaning authoritative passages might be buried behind earlier file-order matches.
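The surface-form limitation is easy to reproduce. The sketch below shows an exact substring search missing a name written without diacritics, and one possible harness-side mitigation: folding both query and passage to an accent-free, case-insensitive form. The `fold` helper is a hypothetical illustration and not something the model or its harness is known to do.

```python
import unicodedata


def fold(text):
    """Strip combining accents and case so lexical matching is more lenient."""
    decomposed = unicodedata.normalize("NFKD", text)
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return stripped.casefold()


passage = "Antonín Dvořák composed the New World Symphony."
query = "Antonin Dvorak"

exact_hit = query in passage               # plain lexical match
folded_hit = fold(query) in fold(passage)  # match after folding

print(exact_hit, folded_hit)
```

Folding only helps with characters that decompose into a base letter plus combining marks. Letters such as "ø" have no such decomposition, and genuine name variants still need separate query reformulation.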