allenai/tmax-27b
allenai/tmax-27b is a 27-billion-parameter language model developed by Ai2, fine-tuned with DPPO from a Qwen 3.6 27B base for use as a terminal agent. It specializes in executing commands and interacting with terminal environments, scoring approximately 43% on Terminal Bench 2.0. The model is tuned for automated task execution in command-line interfaces and outperforms its base model on terminal-specific benchmarks.
TMax 27B: A Specialized Terminal Agent
TMax 27B, developed by Ai2, is a 27-billion-parameter model fine-tuned for terminal agent applications. It is built on the Qwen 3.6 27B base model and trained with DPPO, a reinforcement learning (RL) method, to work in command-line environments.
Key Capabilities & Performance
- Terminal Task Execution: TMax 27B is optimized for interacting with terminal environments, making it suitable for automating system administration, development workflows, and other command-line tasks.
- RL Fine-tuning: The model underwent 160 steps of RL training using DPPO, which improved its performance on terminal-specific benchmarks.
- Benchmark Improvement: It achieves approximately 44.9% on Terminal Bench 2.1 and 42.7% on Terminal Bench 2.0, outperforming its Qwen 3.6 27B base model on both benchmarks.
- Context Length: Supports a maximum overall context length of 65536 tokens, allowing for complex multi-turn interactions within the terminal.
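Because the overall context is capped at 65536 tokens, a long terminal session eventually has to drop older turns. The following is a minimal sketch of one way to do that, not the method used by TMax itself: it keeps the first turn (the task description) and as many of the most recent turns as fit. The names `estimate_tokens`, `trim_history`, and `reserve_for_output` are hypothetical, and the four-characters-per-token estimate is an assumption that stands in for the model's real tokenizer.

```python
from collections import deque

MAX_CONTEXT_TOKENS = 65536  # overall limit stated in the model description


def estimate_tokens(text: str) -> int:
    """Hypothetical estimator (assumption): about 4 characters per token."""
    return max(1, len(text) // 4)


def trim_history(turns, reserve_for_output=4096,
                 limit=MAX_CONTEXT_TOKENS, count=estimate_tokens):
    """Keep the first turn plus the most recent turns that fit the budget.

    `reserve_for_output` is an illustrative value that leaves room for the
    model's next response; it is not taken from the model description.
    """
    if not turns:
        return []
    # The first turn (task description) is always kept.
    budget = limit - reserve_for_output - count(turns[0])
    kept = deque()
    # Walk backwards from the newest turn and stop at the first one
    # that no longer fits.
    for turn in reversed(turns[1:]):
        cost = count(turn)
        if cost > budget:
            break
        kept.appendleft(turn)
        budget -= cost
    return [turns[0], *kept]
```

In practice, passing a `count` function backed by the model's actual tokenizer would give exact numbers; the character-based estimate only keeps the sketch self-contained.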
When to Use TMax 27B
- Automated Scripting: Ideal for scenarios requiring an AI to execute commands, navigate file systems, and perform operations within a terminal.
- Developer Tools: Can be integrated into development environments for automated testing, deployment, or code management tasks.
- Research in Agentic AI: A strong candidate for research into autonomous agents operating in command-line interfaces.
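The use cases above share a common shape: the agent proposes a command, the command runs, and the output is fed back as the next observation. The sketch below illustrates that loop with Python's standard `subprocess` module. It is an illustration under assumptions, not the TMax harness: `run_command`, `agent_loop`, and `scripted_policy` are hypothetical names, and `scripted_policy` is a fixed stand-in for the place where a real integration would query the model.

```python
import subprocess


def run_command(command: str, timeout: float = 10.0, max_chars: int = 2000) -> str:
    """Run one shell command and return a truncated observation string."""
    try:
        done = subprocess.run(
            command, shell=True, capture_output=True, text=True, timeout=timeout
        )
    except subprocess.TimeoutExpired:
        return f"[timed out after {timeout}s]"
    # Truncate so that a noisy command cannot flood the context window.
    output = (done.stdout + done.stderr)[:max_chars]
    return f"[exit {done.returncode}]\n{output}"


def agent_loop(policy, task: str, max_steps: int = 5):
    """Alternate between asking the policy for a command and running it."""
    transcript = [("task", task)]
    for _ in range(max_steps):
        command = policy(transcript)
        if command is None:  # the policy signals that it is finished
            break
        transcript.append(("command", command))
        transcript.append(("observation", run_command(command)))
    return transcript


def scripted_policy(transcript):
    """Stand-in for the model (assumption): one fixed command, then stop."""
    already_ran = any(role == "command" for role, _ in transcript)
    return None if already_ran else "echo hello"


if __name__ == "__main__":
    for role, text in agent_loop(scripted_policy, "say hello"):
        print(f"{role}: {text}")
```

Since the loop executes whatever the policy returns, it is best run inside a container or other sandbox when the policy is an actual model.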
This model is part of a larger collection of TMax terminal agents and is licensed under Apache 2.0. For detailed evaluation methodology and training specifics, refer to the TMax paper and the codebase.