OrionLLM/GRM-2.6-Opus
OrionLLM/GRM-2.6-Opus is a 27 billion parameter general-purpose AI model developed by OrionLLM, optimized for difficult, high-complexity tasks with a 32768 token context length. This model merges OrionLLM/GRM-2.6-Plus with rico03/Qwen3.6-27B-Claude-Opus-Reasoning-Distilled, adopting an Opus-style reasoning format for structured, organized, and deliberate problem-solving. It excels in terminal agents, coding workflows, and complex STEM evaluation, offering stronger performance for its size in structured reasoning and agentic capabilities.
Loading preview...
OrionLLM/GRM-2.6-Opus: Advanced Reasoning for Complex Tasks
GRM-2.6-Opus is a 27 billion parameter general-purpose AI model developed by OrionLLM, designed for difficult, high-complexity tasks. It is a merge of OrionLLM/GRM-2.6-Plus and rico03/Qwen3.6-27B-Claude-Opus-Reasoning-Distilled, incorporating an "Opus-style" reasoning format for more structured and deliberate problem-solving.
Key Capabilities
- Opus-Style Structured Reasoning: Produces clearer and more reliable solutions for complex tasks through an organized reasoning format.
- Improved Terminal Agent Ability: Enhanced for terminal-based agents, tool-style workflows, debugging, code execution planning, and multi-step technical tasks.
- Stronger Coding Performance: Improves code reasoning, implementation planning, and handling of difficult programming tasks.
- Enhanced General-Purpose Intelligence: Remains effective across research, STEM, chat, coding, local agents, and advanced problem-solving.
- Improved Over GRM-2.6-Plus: Builds upon its predecessor with stronger structured reasoning behavior.
Performance Highlights
GRM-2.6-Opus aims to be a highly capable local AI model for complex reasoning, coding, and agentic workflows, delivering better performance for its size. Its practical intelligence is demonstrated through structured reasoning, strong task understanding, improved coding behavior, and stable responses across multiple domains. For instance, in the GPQA Diamond benchmark, GRM-2.6-Opus achieves 89.2, outperforming GRM-2.6-Plus (88.3), Qwen3.6-27B (87.8), google/gemma-4-31B-it (84.3), and Claude-4.5-Haiku (73.0), while being comparable to GPT-5.4-Mini (88.0).