Intelligent-Internet/II-Search-CIR-4B is a 4-billion-parameter language model developed by Intelligent Internet, based on the Qwen3-4B architecture. It introduces Code-Integrated Reasoning (CIR), an approach that lets the model generate and execute code blocks for tool interaction and programmatic reasoning. The model is fine-tuned for information-seeking tasks, excelling at web search and data processing by leveraging external resources through code.
II-Search-CIR-4B: Code-Integrated Reasoning for Enhanced Search
II-Search-CIR-4B is a 4-billion-parameter model from Intelligent Internet, built on the Qwen3-4B architecture. It advances tool interaction through its Code-Integrated Reasoning (CIR) methodology: unlike traditional tool-calling paradigms, CIR has the model generate and execute Python code blocks, allowing programmatic interaction with external resources.
Key Capabilities:
- Code-Integrated Reasoning (CIR): The model generates code blocks (e.g., calling `web_search` or `web_visit`) to interact with external tools, process information, and reason programmatically.
- Enhanced Information Seeking: Specifically fine-tuned to excel in tasks requiring external information retrieval and processing.
- Improved Performance: Outperforms its base model (Qwen3-4B) and other small-sized search-specialized models like Jan-4B and WebSailor-3B on benchmarks such as Google Frames and Seal_0.
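The CIR loop described above can be sketched in miniature. Everything here is an illustrative assumption rather than the model's actual runtime: `web_search` and `web_visit` are stand-in stubs for the real backends, and `run_cir_step` is a hypothetical harness that extracts a fenced Python block from a model turn and executes it with the tools in scope.

```python
import re

# Hypothetical tool stubs -- the deployed model calls real web_search /
# web_visit backends; these only illustrate the calling contract.
def web_search(query, top_k=3):
    """Return (title, url) result stubs for `query`."""
    return [(f"Result {i} for {query!r}", f"https://example.com/{i}")
            for i in range(top_k)]

def web_visit(url):
    """Return the page text at `url` (stubbed)."""
    return f"Contents of {url}"

FENCE = "`" * 3  # avoids writing a literal code fence inside this example

def run_cir_step(model_output):
    """Extract the first fenced Python block from a model turn and execute
    it with the tool functions in scope; return its `result` variable."""
    match = re.search(FENCE + r"python\n(.*?)" + FENCE, model_output, re.DOTALL)
    if match is None:
        return None  # plain-text turn, nothing to execute
    namespace = {"web_search": web_search, "web_visit": web_visit}
    exec(match.group(1), namespace)
    return namespace.get("result")

# A model turn that searches, then visits the top hit:
turn = f"""I will look this up.
{FENCE}python
hits = web_search("Code-Integrated Reasoning")
result = web_visit(hits[0][1])
{FENCE}"""
print(run_cir_step(turn))
```

In a real deployment the executed block's output would be appended to the conversation so the model can reason over the retrieved content in its next turn.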
Training Methodology:
The model underwent a two-stage training process:
- Supervised Fine-Tuning (SFT): Initial fine-tuning on a curated dataset to teach the model to reliably produce the required code format.
- DAPO Optimization: Further optimized with DAPO (Decoupled Clip and Dynamic Sampling Policy Optimization) on a hard-reasoning dataset to boost performance.
Good for:
- Applications requiring advanced web search and information synthesis.
- Developing agents that need to programmatically interact with external APIs or data sources.
- Tasks where reasoning over retrieved information is critical, beyond simple retrieval.
For more details on the training methodology and datasets, refer to the II-Search-4B blog post and the released datasets: II-Search-CIR-SFT and II-Search-RL.