jupyter-agent/jupyter-agent-qwen3-4b-thinking

Text Generation · Model size: 4B · Quantization: BF16 · Context length: 32k · Published: Sep 8, 2025 · License: apache-2.0 · Architecture: Transformer · Open weights

Jupyter Agent Qwen3-4B Thinking is a causal language model fine-tuned from Qwen3-4B-Thinking-2507 and optimized for data science agentic tasks in Jupyter notebook environments. It executes Python code, analyzes datasets, and provides step-by-step reasoning with intermediate computations to solve complex data analysis problems. The model supports tool calling for structured code execution and final-answer generation, achieves state-of-the-art performance among small models on realistic data analysis tasks, and has a context length of 32,768 tokens.


Overview

Jupyter Agent Qwen3-4B Thinking integrates directly into notebook environments, executing code with popular data science libraries such as pandas, numpy, and matplotlib. A core feature is its step-by-step reasoning with intermediate computations and thinking traces, which makes complex data analysis workflows easier to follow and audit.
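A minimal sketch of loading the model with the transformers library follows. The repo id is taken from this page's title and the prompt and generation settings are illustrative assumptions; check the model card on the Hub for recommended sampling parameters.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Repo id assumed from this page's title; verify on the Hugging Face Hub.
model_id = "jupyter-agent/jupyter-agent-qwen3-4b-thinking"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype="bfloat16",  # matches the BF16 precision listed above
    device_map="auto",
)

messages = [
    {"role": "user", "content": "Load data.csv with pandas and report the mean of the 'price' column."}
]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

# The model emits a thinking trace before its answer, so leave generous headroom.
outputs = model.generate(inputs, max_new_tokens=2048)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```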

Key Capabilities

  • Jupyter-native agent: Operates directly within notebook environments.
  • Code execution: Capable of running Python code for data analysis.
  • Step-by-step reasoning: Generates detailed thought processes and intermediate results.
  • Dataset-grounded analysis: Trained on real Kaggle notebook workflows for practical application.
  • Tool calling: Supports structured code execution and final-answer generation (see the sketch below).
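
The sketch below shows how the two tools named above might be exposed through the chat template's tools argument, which recent transformers versions support for Qwen-family templates. The tool names and schemas here are illustrative assumptions, not the model's documented interface.

```python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("jupyter-agent/jupyter-agent-qwen3-4b-thinking")

# Illustrative tool schemas (names and fields are assumptions, not a documented spec):
# one tool to run code in the notebook, one to return the final answer.
tools = [
    {
        "type": "function",
        "function": {
            "name": "execute_code",
            "description": "Run a Python snippet in the Jupyter kernel and return its output.",
            "parameters": {
                "type": "object",
                "properties": {"code": {"type": "string"}},
                "required": ["code"],
            },
        },
    },
    {
        "type": "function",
        "function": {
            "name": "final_answer",
            "description": "Report the final answer once the analysis is complete.",
            "parameters": {
                "type": "object",
                "properties": {"answer": {"type": "string"}},
                "required": ["answer"],
            },
        },
    },
]

messages = [{"role": "user", "content": "What is the median house price in data.csv?"}]
prompt = tokenizer.apply_chat_template(
    messages, tools=tools, add_generation_prompt=True, tokenize=False
)
print(prompt)  # inspect how the tool definitions are injected into the prompt
```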

Performance & Training

This model shows significant improvement over its base model on the DABStep benchmark for data science tasks, achieving 70.8% on easy tasks compared to the base model's 44.0%. It was fine-tuned using full-parameter training on the Jupyter Agent Dataset, comprising 51,389 synthetic notebooks with dataset-grounded QA pairs and executable reasoning traces. The training utilized a context length of 32,768 tokens.
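For reference, the training data can presumably be pulled from the Hub with the datasets library. The repo id below is a guess inferred from the dataset's name, so confirm it on the Hub before use.

```python
from datasets import load_dataset

# Hypothetical repo id inferred from the dataset's name; verify on the Hub.
ds = load_dataset("jupyter-agent/jupyter-agent-dataset", split="train")
print(len(ds), ds.column_names)
```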

Use Cases

This model is ideal for developers and data scientists needing an intelligent agent to automate and assist with data analysis, code generation, and problem-solving directly within Jupyter notebooks. It's particularly effective for tasks requiring detailed reasoning and code execution in a sandboxed environment.
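
To illustrate the agentic loop such an assistant implies, here is a minimal sketch: generate a response, parse a tool call, execute the code, and feed the output back. The Qwen-style <tool_call> parsing and the exec-based "sandbox" are deliberately simplified assumptions; a real deployment should use a properly isolated Jupyter kernel.

```python
import contextlib
import io
import json
import re


def run_in_sandbox(code: str) -> str:
    """Naive stand-in for a real sandbox: exec in a scratch namespace, capture stdout."""
    buf = io.StringIO()
    try:
        with contextlib.redirect_stdout(buf):
            exec(code, {"__name__": "__sandbox__"})  # use a jailed kernel in production
    except Exception as exc:
        return f"Error: {exc}"
    return buf.getvalue()


def agent_step(model, tokenizer, messages):
    """One generate/execute round; the <tool_call> JSON format is an assumption."""
    inputs = tokenizer.apply_chat_template(
        messages, add_generation_prompt=True, return_tensors="pt"
    ).to(model.device)
    out = model.generate(inputs, max_new_tokens=2048)
    text = tokenizer.decode(out[0][inputs.shape[-1]:], skip_special_tokens=True)

    match = re.search(r"<tool_call>\s*(\{.*?\})\s*</tool_call>", text, re.DOTALL)
    if match:
        call = json.loads(match.group(1))
        if call.get("name") == "execute_code":
            result = run_in_sandbox(call["arguments"]["code"])
            messages.append({"role": "assistant", "content": text})
            messages.append({"role": "tool", "content": result})
            return None  # not done yet: keep looping with the tool output appended
        if call.get("name") == "final_answer":
            return call["arguments"]["answer"]
    return text  # plain response with no tool call; treat as final
```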