nvidia/Nemotron-Orchestrator-8B

TEXT GENERATION | Concurrency Cost: 1 | Model Size: 8B | Quant: FP8 | Ctx Length: 32k | Tool Calling: Supported | Published: Nov 25, 2025 | Architecture: Transformer | 0.6K | Cold

NVIDIA and the University of Hong Kong developed Nemotron-Orchestrator-8B, an 8-billion-parameter orchestration model built on Qwen3-8B. The model is designed to solve complex, multi-turn agentic tasks by coordinating diverse expert models and tools. It scores 37.1% on the Humanity's Last Exam benchmark, outperforming GPT-5 while being significantly cheaper and faster to run. Nemotron-Orchestrator-8B also generalizes robustly to unseen tools and configurations.

Nemotron-Orchestrator-8B: Intelligent Tool Orchestration

Nemotron-Orchestrator-8B, developed by NVIDIA and the University of Hong Kong, is an 8-billion-parameter orchestration model based on Qwen3-8B. It is engineered to manage and coordinate heterogeneous toolsets to tackle complex, multi-turn agentic tasks. These toolsets include basic tools such as search and code execution, as well as other specialized and generalist LLMs.
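
To make the idea of multi-turn tool coordination concrete, the following is a minimal sketch of a dispatch loop, not the model's actual interface. Every name in it (run_orchestration, choose_action, the "final_answer" action, the "search" tool) is hypothetical, and the policy is a scripted stand-in for where the orchestrator model's tool-call output would go.

```python
from typing import Callable, Dict, List, Optional, Tuple

# Hypothetical types: a tool maps a text argument to a text observation,
# and a policy picks the next (tool_name, argument) given the task and history.
Tool = Callable[[str], str]
Step = Tuple[str, str, str]  # (tool_name, argument, observation)
Policy = Callable[[str, List[Step]], Tuple[str, str]]


def run_orchestration(task: str, choose_action: Policy, tools: Dict[str, Tool],
                      max_turns: int = 8) -> Tuple[Optional[str], List[Step]]:
    """Call tools turn by turn until the policy emits a final answer."""
    history: List[Step] = []
    for _ in range(max_turns):
        name, arg = choose_action(task, history)
        if name == "final_answer":  # assumed sentinel action, not a real API
            return arg, history
        if name in tools:
            observation = tools[name](arg)
        else:
            # Feed the error back so the policy can recover on the next turn.
            observation = f"error: unknown tool {name!r}"
        history.append((name, arg, observation))
    return None, history  # turn budget exhausted without an answer


def scripted_policy(task: str, history: List[Step]) -> Tuple[str, str]:
    """Stand-in for the orchestrator model: search once, then answer."""
    if not history:
        return ("search", task)
    return ("final_answer", history[-1][2])


tools = {"search": lambda query: f"results for {query}"}
answer, history = run_orchestration("capital of France", scripted_policy, tools)
print(answer)
```

In a real deployment, the entries in the tool table could themselves be other LLMs, which is what distinguishes orchestration from plain tool calling.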

Key Capabilities

  • Intelligent Orchestration: Manages diverse toolsets and LLMs to solve intricate problems.
  • Efficiency: Achieves higher accuracy at significantly lower computational cost than monolithic models. On the HLE benchmark it runs about 2.5x faster than GPT-5 at roughly 30% of GPT-5's monetary cost.
  • Multi-Objective RL Training: Utilizes Group Relative Policy Optimization (GRPO) with a novel reward function that optimizes for accuracy, latency/cost, and adherence to user preferences.
  • Robust Generalization: Shows strong ability to adapt to unseen tools and pricing configurations.
  • Benchmark Performance: Scores 37.1% on the Humanity's Last Exam (HLE) benchmark, surpassing GPT-5 (35.1%), and consistently outperforms strong monolithic systems on FRAMES and τ²-Bench.
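
The multi-objective GRPO training mentioned above can be illustrated with a small sketch. It assumes a simple weighted-sum reward; the model's actual reward function is not specified here, so combined_reward and its weights are hypothetical. The group-relative step is the standard GRPO normalization: each rollout's reward is compared with the mean and standard deviation of the group of rollouts sampled for the same task.

```python
from statistics import mean, pstdev
from typing import List


def combined_reward(correct: bool, cost_usd: float, latency_s: float,
                    preference_score: float, w_cost: float = 0.1,
                    w_latency: float = 0.01, w_pref: float = 0.5) -> float:
    """Hypothetical scalar reward: accuracy minus cost and latency penalties,
    plus a bonus for following user preferences. Weights are illustrative."""
    return (float(correct) - w_cost * cost_usd - w_latency * latency_s
            + w_pref * preference_score)


def group_relative_advantages(rewards: List[float], eps: float = 1e-8) -> List[float]:
    """GRPO-style advantages: normalize each reward within its own group."""
    mu = mean(rewards)
    sigma = pstdev(rewards)
    return [(r - mu) / (sigma + eps) for r in rewards]


# Three rollouts for one task: (correct, cost in USD, latency in s, preference score).
rollouts = [
    (True, 0.5, 10.0, 1.0),   # correct and cheap
    (True, 2.0, 30.0, 1.0),   # correct but expensive and slow
    (False, 0.2, 5.0, 0.0),   # cheap but wrong
]
rewards = [combined_reward(*r) for r in rollouts]
advantages = group_relative_advantages(rewards)
for rollout, adv in zip(rollouts, advantages):
    print(rollout, round(adv, 3))
```

Under these assumed weights, the cheap correct rollout receives the largest advantage, which is the behavior a cost-aware reward is meant to encourage.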

Good For

  • Research and Development: Intended for research and development work on agentic AI systems.
  • Complex Agentic Workflows: Ideal for applications requiring the coordination of multiple tools and models to achieve multi-step objectives.
  • Cost-Efficient Solutions: Suitable for scenarios where high performance is needed without the high computational and monetary costs associated with larger, monolithic frontier models.