Trinity-Large-Thinking: An Agentic MoE Model

Trinity-Large-Thinking is Arcee AI's 398 billion parameter sparse Mixture-of-Experts (MoE) model, featuring approximately 13 billion active parameters per token. It is a reasoning-optimized variant of the Trinity-Large family, post-trained with extended chain-of-thought reasoning and agentic Reinforcement Learning (RL).

Key Capabilities & Features

Agentic-first design: Specifically engineered for tool calling, multi-step planning, and complex agent workflows.
Native Reasoning Traces: Generates explicit chain-of-thought within <think>...</think> blocks, which are crucial for maintaining context in multi-turn conversations and agentic loops.
High Agentic Performance: Achieves 94.7% on τ²-Bench, 91.9% on PinchBench, and 98.2% on LiveCodeBench, demonstrating strong capabilities in agentic tasks.
Extensive Context Window: Features a 512k extended context window to accommodate long reasoning chains across many agentic steps.
Compatibility: Works out-of-the-box with major agent frameworks like OpenClaw and Hermes Agent.

Usage Considerations

For optimal performance, especially in multi-turn conversations and agentic loops, it is critical to preserve the model's reasoning_content (the content within <think>...</think> blocks) in the message history. Omitting this can degrade multi-step performance. The model is available via vLLM, Transformers, and OpenRouter.

Architecture

Built on a sparse MoE architecture with 256 experts (4 active), it was pretrained on 17 trillion tokens and post-trained with instruction tuning and agentic RL.

Overview

Trinity-Large-Thinking: An Agentic MoE Model

Key Capabilities & Features

Usage Considerations

Architecture

Full Model Card (README)