arcee-ai/Trinity-Large-Thinking
Text Generation · Concurrency Cost: 4 · Model Size: 399B · Quant: FP8 · Ctx Length: 32k · Published: Apr 1, 2026 · License: apache-2.0 · Architecture: Transformer

Trinity-Large-Thinking by Arcee AI is a 398B-parameter sparse Mixture-of-Experts (MoE) model with 13B active parameters per token, optimized for reasoning and agentic workflows. It is post-trained with extended chain-of-thought reasoning and agentic RL, achieving strong results on agentic benchmarks such as τ²-Bench (94.7%), PinchBench (91.9%), and LiveCodeBench (98.2%). The model generates explicit reasoning traces within `<think>...</think>` blocks, which must be preserved in context for multi-turn conversations and agentic loops, and it supports a 512k context length after extension.


Trinity-Large-Thinking: Reasoning-Optimized Agentic MoE

Trinity-Large-Thinking is a 398-billion parameter sparse Mixture-of-Experts (MoE) model from Arcee AI, featuring approximately 13 billion active parameters per token. It is a specialized variant of the Trinity-Large family, post-trained with extended chain-of-thought reasoning and agentic Reinforcement Learning (RL) to excel in complex, multi-step tasks.

Key Capabilities

  • Agentic-first design: Purpose-built for tool calling, multi-step planning, and sophisticated agent workflows.
  • Native Reasoning Traces: Generates explicit reasoning within <think>...</think> blocks, which must be preserved in context for effective multi-turn and agentic operations.
  • High Agentic Performance: Achieves 94.7% on τ²-Bench, 91.9% on PinchBench, and 98.2% on LiveCodeBench, demonstrating state-of-the-art results in agentic benchmarks.
  • Broad Compatibility: Works seamlessly with major agent frameworks like OpenClaw and Hermes Agent.
  • Extended Context Window: Supports a 512k context length, accommodating long reasoning chains across many agentic steps.
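Because the card states that `<think>...</think>` traces must stay in context across turns, a client needs to keep the full assistant reply in its message history while showing the user only the visible answer. A minimal sketch, assuming an OpenAI-style message list; the helper names here are ours, not part of any official SDK:

```python
import re

THINK_RE = re.compile(r"<think>.*?</think>", re.DOTALL)

def split_reasoning(reply: str) -> tuple[str, str]:
    """Separate the <think>...</think> trace from the user-visible answer."""
    match = THINK_RE.search(reply)
    trace = match.group(0) if match else ""
    answer = THINK_RE.sub("", reply).strip()
    return trace, answer

def append_turn(history: list[dict], reply: str) -> list[dict]:
    """Append the FULL reply (trace included) to the context window,
    per the model card's guidance for multi-turn and agentic use."""
    history.append({"role": "assistant", "content": reply})
    return history

reply = "<think>User asked for 2+2; trivial arithmetic.</think>The answer is 4."
trace, answer = split_reasoning(reply)          # answer shown to the user
history = append_turn(
    [{"role": "user", "content": "What is 2+2?"}], reply
)                                               # trace preserved for turn 2
```

With a 512k context window, retaining traces across many agentic steps is feasible, though long sessions may still warrant trimming the oldest turns.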

Good for

  • Developing AI agents requiring robust tool-calling and multi-step planning capabilities.
  • Applications that benefit from explicit, interpretable reasoning traces.
  • Complex, multi-turn conversational systems where maintaining chain-of-thought is critical.
  • Integration into existing agent frameworks like OpenClaw and Hermes Agent for enhanced performance.
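For the tool-calling use cases above, a request would typically follow the standard OpenAI-compatible chat schema. A hedged sketch of payload construction only (no endpoint is called); the model id comes from this card, but the endpoint behavior and the `get_weather` tool are illustrative assumptions:

```python
import json

# Hypothetical chat-completions payload for an OpenAI-compatible server.
# The tool definition is an example, not a built-in of the model.
payload = {
    "model": "arcee-ai/Trinity-Large-Thinking",
    "messages": [
        {"role": "system", "content": "You are a tool-using agent."},
        {"role": "user", "content": "What's the weather in Paris?"},
    ],
    "tools": [
        {
            "type": "function",
            "function": {
                "name": "get_weather",
                "description": "Look up current weather for a city.",
                "parameters": {
                    "type": "object",
                    "properties": {"city": {"type": "string"}},
                    "required": ["city"],
                },
            },
        }
    ],
}

body = json.dumps(payload)  # ready to POST to a serving endpoint
```

Frameworks such as OpenClaw and Hermes Agent would normally assemble this payload for you; the sketch just shows what crosses the wire.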