allenai/tmax-9b

Vision · Concurrency Cost: 1 · Model Size: 9B · Quant: FP8 · Ctx Length: 32k · Tool Calling: Supported · Published: Jun 17, 2026 · License: apache-2.0 · Architecture: Transformer · Open Weights

allenai/tmax-9b is a 9 billion parameter language model developed by Ai2, fine-tuned from Qwen 3.5 9B using DPPO. It is optimized as a terminal agent and scores approximately 27% on Terminal Bench 2.0 after 200 steps of RL training. The model is built to execute commands and interact with a terminal environment, which makes it suitable for automated system administration and development tasks.


TMax 9B: A Specialized Terminal Agent

TMax 9B, developed by Ai2, is a 9 billion parameter model fine-tuned from Qwen 3.5 9B. What sets it apart is its optimization as a terminal agent, achieved through 200 steps of DPPO (Distributed Proximal Policy Optimization) training on the TMax-15k dataset.

Key Capabilities & Performance

  • Terminal Interaction: Designed to operate effectively within a terminal environment, capable of executing commands and responding to system outputs.
  • Enhanced Agent Performance: Achieves approximately 27% on Terminal Bench 2.0, about 6 points higher than its base model, Qwen 3.5 9B, on terminal-based tasks.
  • DPPO Training: Trained with DPPO reinforcement learning that targets agentic capabilities rather than general language generation.
  • Part of a Model Collection: TMax 9B is one of several terminal agents released by allenai, which are available in various sizes for different computational needs.
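
To make the terminal interaction pattern concrete, here is a minimal sketch of the kind of loop a terminal agent runs: a policy proposes a shell command, the command is executed, and its output is fed back for the next step. This is an illustration under stated assumptions, not the TMax harness. The names `run_command`, `agent_loop`, and `scripted_policy` are hypothetical, and `scripted_policy` is a stand-in that replays fixed commands where a real setup would query the model (for example through its tool-calling interface).

```python
import subprocess
from typing import Callable, List, Optional, Tuple

# One (command, output) pair per completed step.
History = List[Tuple[str, str]]


def run_command(command: str, timeout_s: float = 10.0) -> str:
    """Run one shell command; return stdout and stderr followed by the exit code."""
    try:
        result = subprocess.run(
            command, shell=True, capture_output=True, text=True, timeout=timeout_s
        )
    except subprocess.TimeoutExpired:
        return f"[timed out after {timeout_s}s]"
    return f"{result.stdout}{result.stderr}[exit code {result.returncode}]"


def agent_loop(
    policy: Callable[[str, History], Optional[str]],
    task: str,
    max_steps: int = 8,
) -> History:
    """Alternate between asking the policy for a command and executing it."""
    history: History = []
    for _ in range(max_steps):
        command = policy(task, history)
        if command is None:  # the policy signals that the task is finished
            break
        history.append((command, run_command(command)))
    return history


def scripted_policy(task: str, history: History) -> Optional[str]:
    """Hypothetical stand-in for the model: replays a fixed list of commands."""
    script = ["echo hello", "echo done"]
    return script[len(history)] if len(history) < len(script) else None


if __name__ == "__main__":
    for command, output in agent_loop(scripted_policy, "print a greeting"):
        print(f"$ {command}\n{output}")
```

The step cap and per-command timeout keep a misbehaving policy from running indefinitely. Commands from a real model should be executed in a sandbox or container rather than directly on the host.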

Use Cases & Recommendations

  • Automated System Administration: Ideal for tasks requiring automated command execution and interaction with operating system shells.
  • Development & Testing: Can be employed in automated testing environments or for scripting complex development workflows.
  • Research in Agentic LLMs: Provides a strong baseline for further research into language models acting as autonomous agents in technical environments.

For detailed evaluation methodology and further insights, refer to the TMax paper and the codebase.