nvidia/Nemotron-Orchestrator-8B is an 8-billion-parameter orchestration model developed by NVIDIA and the University of Hong Kong. It is designed to manage and coordinate diverse expert models and tools for complex, multi-turn agentic tasks. Trained with multi-objective reinforcement learning using GRPO (Group Relative Policy Optimization), it is optimized jointly for accuracy, latency, and cost. The model achieves 37.1% on the Humanity's Last Exam (HLE) benchmark, outperforming GPT-5 while being approximately 2.5x more efficient, making it well suited to robust tool orchestration and versatile reasoning.
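A minimal usage sketch follows, assuming the checkpoint is published on Hugging Face under the identifier in the model name and exposes a standard causal-LM chat template via the `transformers` library. The system prompt, tool registry, and message format below are illustrative assumptions, not the model's documented interface.

```python
# Sketch: prompting the orchestrator to route a task across hypothetical tools.
# Assumes a standard transformers chat interface; verify against the model card.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "nvidia/Nemotron-Orchestrator-8B"  # assumed Hugging Face identifier
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto"
)

# Hypothetical registry of expert models / tools the orchestrator may delegate to.
tools = (
    "- web_search: retrieve up-to-date information from the web\n"
    "- math_expert: delegate symbolic or numeric computation to a specialist model"
)

messages = [
    {
        "role": "system",
        "content": (
            "You are an orchestrator. Decompose the task, route sub-tasks to the "
            f"available tools, and balance accuracy, latency, and cost.\nTools:\n{tools}"
        ),
    },
    {
        "role": "user",
        "content": "Verify a claim about 2023 GDP figures and compute year-over-year growth.",
    },
]

# Build the prompt with the model's chat template and generate an orchestration plan.
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)
output = model.generate(inputs, max_new_tokens=512)
print(tokenizer.decode(output[0][inputs.shape[-1]:], skip_special_tokens=True))
```

In a full agentic loop, the generated plan would be parsed, each delegated sub-task executed by the corresponding tool or expert model, and the results fed back in subsequent turns; the sketch above shows only the single-turn prompting step.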