arcee-ai/Trinity-Large-Thinking
Trinity-Large-Thinking is a 398 billion parameter sparse Mixture-of-Experts (MoE) model by Arcee AI, with approximately 13 billion active parameters per token. This reasoning-optimized variant is post-trained with extended chain-of-thought and agentic RL, generating explicit reasoning traces in ... blocks. It delivers state-of-the-art performance on agentic benchmarks and is purpose-built for tool calling, multi-step planning, and agent workflows.
Loading preview...
Trinity-Large-Thinking: An Agentic MoE Model
Trinity-Large-Thinking is Arcee AI's 398 billion parameter sparse Mixture-of-Experts (MoE) model, featuring approximately 13 billion active parameters per token. It is a reasoning-optimized variant of the Trinity-Large family, post-trained with extended chain-of-thought reasoning and agentic Reinforcement Learning (RL).
Key Capabilities & Features
- Agentic-first design: Specifically engineered for tool calling, multi-step planning, and complex agent workflows.
- Native Reasoning Traces: Generates explicit chain-of-thought within
<think>...</think>blocks, which are crucial for maintaining context in multi-turn conversations and agentic loops. - High Agentic Performance: Achieves 94.7% on τ²-Bench, 91.9% on PinchBench, and 98.2% on LiveCodeBench, demonstrating strong capabilities in agentic tasks.
- Extensive Context Window: Features a 512k extended context window to accommodate long reasoning chains across many agentic steps.
- Compatibility: Works out-of-the-box with major agent frameworks like OpenClaw and Hermes Agent.
Usage Considerations
For optimal performance, especially in multi-turn conversations and agentic loops, it is critical to preserve the model's reasoning_content (the content within <think>...</think> blocks) in the message history. Omitting this can degrade multi-step performance. The model is available via vLLM, Transformers, and OpenRouter.
Architecture
Built on a sparse MoE architecture with 256 experts (4 active), it was pretrained on 17 trillion tokens and post-trained with instruction tuning and agentic RL.
Top 3 parameter combinations used by Featherless users for this model. Click a tab to see each config.