Name: allenai/tmax-2b API
Brand: Featherless.ai
Price: 10.00 USD
Availability: InStock
Author: allenai

TMax 2B: A Specialized Terminal Agent

TMax 2B, developed by AllenAI, is a 2.3-billion-parameter model fine-tuned from Qwen 3.5 2B with Deep Proximal Policy Optimization (DPPO) to act as a terminal agent. It is designed to issue and execute commands in a terminal environment, which suits it to automated tasks and agentic applications.

Key Capabilities & Performance

Terminal Agent Specialization: TMax 2B is explicitly trained for terminal-based interactions, demonstrating improved performance on relevant benchmarks.
Enhanced Benchmark Scores: It significantly outperforms its base model, Qwen 3.5 2B, on the Terminal Bench (TB) Lite, TB 2.1, and TB 2.0 (daytona) evaluations. For instance, TMax 2B achieves 11.8 +/- 1.4 on TB Lite compared to Qwen 3.5 2B's 5.71 +/- 1.6.
DPPO Fine-tuning: The model is trained with DPPO reinforcement learning. The main checkpoint is taken after 100 steps of RL training, the point that gave the best performance on TB Lite.
Context Length: Supports up to 65,536 tokens overall, with at most 16,384 tokens per turn.
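
The two context limits above interact: as a session grows, the room left for the next turn can fall below the per-turn cap. The sketch below shows one way a caller might compute that allowance. The function name `turn_allowance` is hypothetical, and the sketch assumes the per-turn limit caps the tokens added in a single turn, which the listing does not spell out.

```python
MAX_TOTAL_TOKENS = 65536  # overall token limit stated in the listing
MAX_TURN_TOKENS = 16384   # per-turn token limit stated in the listing


def turn_allowance(tokens_used: int) -> int:
    """Return how many tokens the next turn may use.

    Assumption: a turn may add at most MAX_TURN_TOKENS, and the session
    as a whole may never exceed MAX_TOTAL_TOKENS.
    """
    if tokens_used < 0:
        raise ValueError("tokens_used must be non-negative")
    remaining = MAX_TOTAL_TOKENS - tokens_used
    # Clamp to zero once the overall budget is exhausted.
    return max(0, min(MAX_TURN_TOKENS, remaining))
```

Under this assumption, a fresh session gets the full per-turn cap, while a session that has already consumed 60,000 tokens has only 5,536 left for its next turn.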

Use Cases & Considerations

Automated Scripting: Ideal for scenarios requiring an AI to understand and execute terminal commands.
Agentic Workflows: Can be integrated into systems that need an agent to interact with operating system shells or command-line interfaces.
Research & Development: Useful for researchers exploring reinforcement learning for agent control in terminal environments. The model's training details, including hyperparameters and dataset (TMax-15k), are openly provided.
No Vision Head: The vision head was removed during training, so it is intended for language-model-only use.
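
To make the agentic workflow described above concrete, here is a minimal sketch of a loop that asks a policy for the next shell command, runs it, and feeds the output back. It is not TMax's actual harness: `next_command` is a hypothetical callable standing in for whatever code queries the model, and the prompt format, stopping rule, and step limit are all assumptions.

```python
import subprocess
from typing import Callable, List, Optional, Tuple

History = List[Tuple[str, str]]  # (command, output) pairs


def run_command(command: str, timeout_s: float = 10.0) -> str:
    """Run one shell command and return its stdout followed by stderr."""
    try:
        result = subprocess.run(
            command, shell=True, capture_output=True, text=True, timeout=timeout_s
        )
    except subprocess.TimeoutExpired:
        return f"[timed out after {timeout_s}s]"
    return result.stdout + result.stderr


def agent_loop(
    task: str,
    next_command: Callable[[str, History], Optional[str]],  # hypothetical model wrapper
    max_steps: int = 5,
) -> History:
    """Alternate between asking for a command and executing it.

    The loop stops when next_command returns None or max_steps is reached.
    """
    history: History = []
    for _ in range(max_steps):
        command = next_command(task, history)
        if command is None:
            break
        history.append((command, run_command(command)))
    return history
```

In practice `next_command` would wrap a call to the hosted model and parse a command out of its reply. Because the loop executes whatever it is given, it is best run inside a container or other sandbox rather than directly on a host shell.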

For more in-depth technical details, refer to the TMax paper.
