microsoft/FastContext-1.0-4B-SFT
microsoft/FastContext-1.0-4B-SFT is a 4-billion-parameter, supervised fine-tuned repository-exploration subagent built on the Qwen3-4B-Instruct backbone. Designed to be invoked by larger LLM coding agents, it locates relevant code by issuing parallel read-only tool calls (READ, GLOB, GREP) and returning compact file paths and line ranges. Offloading repository exploration to this model reduces the main agent's token consumption and improves end-to-end resolution rates.
FastContext-1.0-4B-SFT: Efficient Repository Explorer
FastContext-1.0-4B-SFT is a specialized 4-billion-parameter subagent developed by Microsoft to handle the repository exploration phase for larger LLM coding agents. It acts as a dedicated explorer: it returns clean, grounded evidence rather than raw search results, so the main coding agent can focus on problem-solving.
Key Capabilities & Features
- Dedicated Repository Exploration: Separates the task of finding relevant code from the main agent's problem-solving, reducing context pollution and token usage.
- Parallel Tool Calling: Executes READ, GLOB, and GREP commands in parallel to efficiently cover search hypotheses.
- Compact Citations: Returns precise file paths and line ranges as focused context, minimizing the information load on the main agent.
- Performance Improvement: Integrates with main agents like GPT-5.4, GLM-5.1, and Kimi-K2.6, improving end-to-end resolution rates by up to 5.5% and reducing main-agent token consumption by up to 60%.
- Lightweight & Efficient: The 4B-SFT variant offers strong performance, sometimes outperforming larger 30B explorers, making it suitable for efficient deployment.
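To make the parallel tool calling and compact citation ideas above concrete, here is a minimal sketch of how a host application might serve a batch of read-only tool calls and collapse search hits into path and line-range citations. It is not FastContext's actual tool interface: the function names (`tool_glob`, `tool_grep`, `tool_read`, `run_parallel`, `to_citations`), their arguments, and the `path:start-end` citation format are all assumptions made for illustration.

```python
import re
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path


def tool_glob(root, pattern):
    """GLOB (hypothetical): list files under root whose name matches pattern."""
    base = Path(root)
    return sorted(p.relative_to(base).as_posix() for p in base.rglob(pattern) if p.is_file())


def tool_grep(root, regex, pattern="*"):
    """GREP (hypothetical): return (path, line_number, text) for matching lines."""
    compiled = re.compile(regex)
    hits = []
    for rel in tool_glob(root, pattern):
        try:
            lines = (Path(root) / rel).read_text(encoding="utf-8").splitlines()
        except (UnicodeDecodeError, OSError):
            continue  # skip binary or unreadable files
        for number, text in enumerate(lines, start=1):
            if compiled.search(text):
                hits.append((rel, number, text))
    return hits


def tool_read(root, rel_path, start, end):
    """READ (hypothetical): return lines start..end, 1-indexed and inclusive."""
    lines = (Path(root) / rel_path).read_text(encoding="utf-8").splitlines()
    return lines[start - 1:end]


TOOLS = {"GLOB": tool_glob, "GREP": tool_grep, "READ": tool_read}


def run_parallel(root, calls):
    """Run a batch of (tool_name, kwargs) calls concurrently.

    Results are returned in the same order as the input calls.
    """
    with ThreadPoolExecutor(max_workers=max(1, len(calls))) as pool:
        futures = [pool.submit(TOOLS[name], root, **kwargs) for name, kwargs in calls]
        return [future.result() for future in futures]


def to_citations(grep_hits, context=2):
    """Collapse grep hits into 'path:start-end' strings, merging nearby ranges.

    Assumes hits for each file arrive in ascending line order, as tool_grep emits them.
    """
    ranges = {}
    for path, number, _ in grep_hits:
        start, end = max(1, number - context), number + context
        spans = ranges.setdefault(path, [])
        if spans and start <= spans[-1][1] + 1:
            spans[-1][1] = max(spans[-1][1], end)  # overlaps or touches the previous span
        else:
            spans.append([start, end])
    return [f"{path}:{s}-{e}" for path, spans in ranges.items() for s, e in spans]
```

A batch such as `run_parallel(repo_root, [("GLOB", {"pattern": "*.py"}), ("GREP", {"regex": "def parse_config"})])` covers two search hypotheses in one round trip. Threads suit this sketch because the tools are read-only and I/O-bound, so they need no locking.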
Training & Architecture
FastContext is built on the Qwen3-4B-Instruct backbone and trained in two stages. The first stage is Supervised Fine-Tuning (SFT) on exploration traces that cover parallel tool calls, multi-turn trajectories, and line-range generation. The second stage is Reinforcement Learning (RL) using GRPO, with a reward based on file-level and line-level F1 and on parallel exploration.
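The file-level and line-level F1 terms in the reward can be illustrated with a small sketch. It assumes citations are `(path, start, end)` tuples with inclusive line ranges and that the two F1 scores are combined with equal weights. The tuple format, the function names, and the weights are hypothetical, not the published reward definition, and the parallel-exploration term is left out because the source does not describe it.

```python
def f1(predicted, gold):
    """Standard F1 between two sets; 0.0 if either set is empty or they are disjoint."""
    predicted, gold = set(predicted), set(gold)
    true_positives = len(predicted & gold)
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(predicted)
    recall = true_positives / len(gold)
    return 2 * precision * recall / (precision + recall)


def expand(citations):
    """Turn (path, start, end) citations into a set of (path, line) pairs."""
    return {(path, line) for path, start, end in citations for line in range(start, end + 1)}


def file_f1(predicted, gold):
    """F1 over the set of cited file paths."""
    return f1({path for path, _, _ in predicted}, {path for path, _, _ in gold})


def line_f1(predicted, gold):
    """F1 over individual cited lines."""
    return f1(expand(predicted), expand(gold))


def localization_reward(predicted, gold, w_file=0.5, w_line=0.5):
    """Weighted sum of file-level and line-level F1 (weights are an assumption)."""
    return w_file * file_f1(predicted, gold) + w_line * line_f1(predicted, gold)
```

Scoring lines as well as files rewards tight ranges: a citation that names the right file but spans far more lines than the gold range keeps its file-level score while its line-level precision falls.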
Ideal Use Cases
- Enhancing LLM Coding Agents: Perfect for developers looking to improve the efficiency and accuracy of their LLM-powered coding assistants.
- Reducing Token Consumption: For scenarios where minimizing the main agent's token usage during code exploration is critical.
- Streamlining Code Search: When precise and relevant code snippets are needed quickly from large repositories.