microsoft/FastContext-1.0-4B-RL

Text Generation | Concurrency Cost: 1 | Model Size: 4B | Quant: BF16 | Ctx Length: 32k | Tool Calling: Supported | Published: Jun 14, 2026 | License: MIT | Architecture: Transformer | Open Weights

microsoft/FastContext-1.0-4B-RL is a 4-billion-parameter repository-exploration subagent for LLM coding agents, built on the Qwen3-4B-Instruct backbone. It locates relevant code by issuing parallel read-only tool calls (READ, GLOB, GREP) and returns compact file paths and line ranges. Offloading repository exploration to it reduces the main agent's token consumption and improves end-to-end resolution rates.

FastContext-1.0-4B-RL: An Efficient Repository Explorer for Coding Agents

FastContext-1.0-4B-RL is a 4 billion parameter subagent developed by Microsoft, specifically engineered to optimize repository exploration for larger LLM coding agents. It acts as a dedicated explorer, invoked on demand by a main agent, to perform parallel read-only operations (READ, GLOB, GREP) and return precise file and line citations. This approach addresses a major bottleneck in coding agents, where repository exploration can consume a significant portion of token budgets and pollute the main agent's context.
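
To illustrate what "precise file and line citations" could look like on the main-agent side, here is a minimal sketch. It assumes citations arrive as `path:start-end` strings; that format, the `Citation` class, and the helper names are hypothetical and are not taken from the model's documented output schema.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Citation:
    """A file path plus a 1-based, inclusive line range (assumed format)."""
    path: str
    start: int
    end: int

    def __str__(self) -> str:
        return f"{self.path}:{self.start}-{self.end}"


def parse_citation(text: str) -> Citation:
    """Parse 'path:start-end' or 'path:line' into a Citation."""
    # Split on the last colon so paths that contain colons still parse.
    path, _, span = text.rpartition(":")
    start, _, end = span.partition("-")
    return Citation(path, int(start), int(end or start))


def merge_citations(cites: list[Citation]) -> list[Citation]:
    """Merge overlapping or adjacent ranges within the same file."""
    merged: list[Citation] = []
    for c in sorted(cites, key=lambda c: (c.path, c.start, c.end)):
        if merged and merged[-1].path == c.path and c.start <= merged[-1].end + 1:
            last = merged.pop()
            merged.append(Citation(c.path, last.start, max(last.end, c.end)))
        else:
            merged.append(c)
    return merged
```

Merging ranges before handing them to the main agent is one way to keep the returned evidence compact; whether the model already deduplicates its citations is not stated in the source.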

Key Capabilities & Features

  • Dedicated Repository Exploration: Separates the task of code exploration from the main coding agent, allowing the main agent to focus on problem-solving.
  • Parallel Tool Calling: Can issue multiple READ, GLOB, and GREP calls simultaneously to efficiently cover search hypotheses.
  • Context Optimization: Returns compact file paths and line ranges, providing clean, grounded evidence to the main agent and reducing its token consumption.
  • Performance Gains: When paired with main agents such as GPT-5.4, GLM-5.1, and Kimi-K2.6, it improves end-to-end resolution rates by up to 5.5% and reduces main-agent token consumption by up to 60.3% (e.g., GPT-5.4 on SWE-QA).
  • Reinforcement Learning (RL) Refinement: Bootstrapped from strong reference-model trajectories via supervised fine-tuning (SFT), then refined with task-grounded RL for broad first-turn search, multi-turn evidence gathering, and precise citation generation.
  • Lightweight & Efficient: The 4B-RL variant can outperform the larger 30B-SFT explorer in certain scenarios while using fewer tokens.
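
The parallel tool calling described in the list above can be sketched as a small host-side executor. This is a minimal illustration, not the model's actual tool interface: the call schema (`{"tool": ..., "args": ...}`), the function names, and the argument names are all assumptions made for this example.

```python
import re
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path


def tool_glob(root: Path, pattern: str) -> list[str]:
    """GLOB: list files under root that match a glob pattern."""
    return sorted(
        p.relative_to(root).as_posix() for p in root.glob(pattern) if p.is_file()
    )


def tool_grep(root: Path, regex: str, pattern: str = "**/*") -> list[tuple[str, int]]:
    """GREP: return (path, line_number) for every line matching regex."""
    rx = re.compile(regex)
    hits = []
    for rel in tool_glob(root, pattern):
        text = (root / rel).read_text(encoding="utf-8", errors="ignore")
        for n, line in enumerate(text.splitlines(), start=1):
            if rx.search(line):
                hits.append((rel, n))
    return hits


def tool_read(root: Path, path: str, start: int, end: int) -> list[str]:
    """READ: return lines start..end (1-based, inclusive) of a file."""
    lines = (root / path).read_text(encoding="utf-8", errors="ignore").splitlines()
    return lines[start - 1:end]


# All three tools only read from disk, so running them concurrently is safe.
TOOLS = {"GLOB": tool_glob, "GREP": tool_grep, "READ": tool_read}


def run_parallel(root: Path, calls: list[dict]) -> list:
    """Run one turn's batch of tool calls concurrently, preserving call order."""
    with ThreadPoolExecutor(max_workers=max(1, len(calls))) as pool:
        futures = [pool.submit(TOOLS[c["tool"]], root, **c["args"]) for c in calls]
        return [f.result() for f in futures]
```

A batch such as `[{"tool": "GLOB", "args": {"pattern": "**/*.py"}}, {"tool": "GREP", "args": {"regex": "def main"}}]` would then be answered in a single round trip, which is the point of issuing several search hypotheses at once.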

Good For

  • Enhancing LLM Coding Agents: Suited to developers who want to improve the efficiency and performance of LLM-based coding assistants.
  • Reducing Token Usage: Cuts the token budget that main agents spend on repository exploration (by up to 60.3% in the reported results).
  • Improving Code Resolution Rates: Raises the rate at which coding agents resolve tasks by supplying more focused, relevant context.
  • Complex Codebase Navigation: Designed for navigating large code repositories and extracting specific information from them.