Spreadsheet-RL/Spreadsheet-RL-4B

TEXT GENERATIONConcurrency Cost:1Model Size:4BQuant:BF16Ctx Length:32kTool Calling:SupportedPublished:May 23, 2026License:apache-2.0Architecture:Transformer0.0K Open Weights Cold

Spreadsheet-RL-4B is a 4 billion parameter language model developed by Banghao Chi et al., based on Qwen3-4B-Thinking-2507. This model is post-trained using reinforcement learning (GRPO) within Spreadsheet Gym, a Microsoft Excel environment, to excel at realistic spreadsheet tasks. It integrates spreadsheet-native tools, sandboxed code execution, and Excel-based recalculation rewards. Spreadsheet-RL-4B is specifically designed to function as an RL-trained agent for complex spreadsheet automation and problem-solving.

Loading preview...

Overview

Spreadsheet-RL-4B is a 4 billion parameter model, originating from Qwen/Qwen3-4B-Thinking-2507, that has been significantly enhanced through reinforcement learning (RL). Developed by Banghao Chi et al., this model is specifically designed to operate as an agent within a realistic Microsoft Excel environment called Spreadsheet Gym. Its training leverages outcome-based RL (GRPO) on a dataset of 5,928 filtered ExcelForum tasks, enabling it to interact with spreadsheet-native tools and execute sandboxed code.

Key Capabilities

  • Reinforcement Learning Agent: Functions as an RL-trained agent for complex spreadsheet tasks.
  • Spreadsheet-Native Tool Integration: Utilizes a comprehensive set of tools within a Microsoft Excel 365 environment.
  • Sandboxed Code Execution: Supports secure execution of code within the spreadsheet context.
  • Performance Improvement: Demonstrates improved performance on spreadsheet benchmarks, achieving 23.4 Pass@1 on SpreadsheetBench and 17.2 Pass@1 on Domain-Spreadsheet, significantly outperforming its base model and intermediate configurations.

When to Use This Model

  • Automating Complex Spreadsheet Tasks: Ideal for scenarios requiring an AI agent to solve multi-turn problems in Microsoft Excel.
  • Research in RL Agents for Structured Data: Useful for researchers exploring reinforcement learning applications in environments with structured data and tool use.
  • Developing Spreadsheet-Based AI Solutions: Best utilized with the full Spreadsheet-RL agent harness and Spreadsheet Gym environment to reproduce its intended capabilities and performance.