alireza7/GrepSeek-Qwen3.5-9B-SFT

VISIONConcurrency Cost:1Model Size:9BQuant:FP8Ctx Length:32kTool Calling:SupportedPublished:May 26, 2026License:apache-2.0Architecture:Transformer0.0K Open Weights Cold

GrepSeek-Qwen3.5-9B-SFT by alireza7 is a 9 billion parameter Qwen3.5-based language model, supervised-fine-tuned for Direct Corpus Interaction (DCI) as a search agent. It is designed to answer questions by issuing Unix shell commands like 'rg' and 'grep' directly over a 21M-passage Wikipedia corpus. This model serves as a cold-start initialization for an RL stage, establishing low-level retrieval primitives for shell-based search behavior.

Loading preview...

GrepSeek-Qwen3.5-9B-SFT: A Direct Corpus Interaction Search Agent

This model, developed by alireza7, is a 9 billion parameter Qwen3.5-based language model specifically supervised-fine-tuned (SFT) for Direct Corpus Interaction (DCI). It functions as a search agent, designed to answer questions by executing Unix shell commands (e.g., rg, grep, head) directly against a 21 million-passage Wikipedia corpus, rather than relying on traditional dense or sparse indexing.

Key Capabilities & Purpose

  • Shell Command Generation: Emits <tool_call> shell commands for direct corpus interaction.
  • Cold-Start SFT Policy: Serves as the initial SFT stage for the GrepSeek project, instilling concise, causally-grounded shell-search behaviors and low-level retrieval primitives (e.g., fixed-string matching, truncation, cascaded filtering).
  • Foundation for RL: This SFT-only checkpoint is the initialization for a subsequent Reinforcement Learning (RL) stage, which further refines its search capabilities.
  • Improved Performance: Even in its SFT-only form, this model significantly outperforms the untuned base model on micro-averaged F1 and EM scores across 7 QA benchmarks (0.4249 F1, 0.3569 EM vs. 0.3314 F1, 0.2836 EM for base).

Important Considerations

  • Tool-Using Agent: This is not a standalone chatbot. It requires an external setup to function, including the PeterJinGo/wiki-18-corpus, a tool-calling vLLM server, and the GrepSeek inference harness (available in the code repo).
  • Context Length: Supports a context length of 32,768 tokens.
  • Successor Model: For enhanced multi-hop reasoning, consider its RL-optimized successor: alireza7/GrepSeek-Qwen3.5-9B-GRPO.