OrionLLM/GRM-7b

Text Generation · Concurrency cost: 1 · Model size: 7.6B · Quantization: FP8 · Context length: 32K · Published: Mar 13, 2026 · License: apache-2.0 · Architecture: Transformer · Open weights

GRM-7b by OrionLLM is a 7-billion-parameter general-purpose language model fine-tuned for multi-domain reasoning spanning math, logic, coding, and broad problem-solving. It is designed to serve as a practical daily driver for general reasoning tasks and as a strong base for further fine-tuning, exhibiting dedicated stepwise reasoning behavior and improved consistency across complex tasks.


OrionLLM/GRM-7b: A Reasoning-Focused 7B Model

GRM-7b is a 7-billion-parameter model developed by OrionLLM, engineered with a primary focus on multi-domain reasoning. It performs well across mathematics, logic, coding, and general problem-solving, making it a versatile tool for complex analytical tasks.
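
For a quick smoke test, here is a minimal inference sketch using the Hugging Face transformers library. It assumes the checkpoint is published on the Hub under the id OrionLLM/GRM-7b and ships with a chat template; the prompt and sampling settings are only illustrative.

```python
# Minimal inference sketch; model id, prompt, and sampling settings are
# illustrative assumptions, not settings published by OrionLLM.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "OrionLLM/GRM-7b"  # assumed Hub id
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # the FP8 checkpoint may need a different loader
    device_map="auto",
)

messages = [
    {"role": "user", "content": "A train travels 120 km in 90 minutes. "
                                "What is its average speed in km/h? Think step by step."},
]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=512, temperature=0.6, do_sample=True)
# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```

For the published FP8 quantization, a serving stack with native FP8 support (e.g. vLLM) may be a better fit than this eager-mode example.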

Key Capabilities

  • Dedicated Reasoning Behavior: Optimized for tasks that require stepwise problem-solving, with improved consistency in outputs.
  • Strong 7B-Scale Performance: Offers practical performance suitable for local inference and experimentation, balancing capability with accessibility.
  • Multi-Domain Mixture: Trained on a diverse dataset incorporating reasoning, code, math, and medical reasoning data, broadening its applicability.
  • Fine-Tune Friendly: Designed as an ideal starting point for custom Supervised Fine-Tuning (SFT), Group Relative Policy Optimization (GRPO), or Direct Preference Optimization (DPO) pipelines; see the sketch after this list.
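
As a concrete starting point for the fine-tuning use case, the following is a minimal LoRA-based SFT sketch using the TRL and PEFT libraries; the Hub id, example dataset, and hyperparameters are illustrative assumptions rather than settings published by OrionLLM.

```python
# Minimal LoRA SFT sketch with TRL + PEFT. The model id, dataset, and
# hyperparameters below are illustrative assumptions, not OrionLLM settings.
from datasets import load_dataset
from peft import LoraConfig
from trl import SFTConfig, SFTTrainer

# Any chat-formatted dataset works; this public example set is a placeholder.
dataset = load_dataset("trl-lib/Capybara", split="train")

trainer = SFTTrainer(
    model="OrionLLM/GRM-7b",  # assumed Hub id
    train_dataset=dataset,
    args=SFTConfig(
        output_dir="grm-7b-sft",
        per_device_train_batch_size=2,
        gradient_accumulation_steps=8,
        learning_rate=2e-5,
        num_train_epochs=1,
    ),
    # LoRA keeps the fine-tune cheap; adapt all linear projections.
    peft_config=LoraConfig(r=16, lora_alpha=32, target_modules="all-linear"),
)
trainer.train()
```

TRL also provides DPOTrainer and GRPOTrainer for the DPO and GRPO stages mentioned above, following the same general pattern.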

Benchmarks

GRM-7b posts strong results across reasoning and coding benchmarks, often outperforming other models in its class. It scores 69.0 on AIME24, 53.3 on AIME25, 93.5 on AMC23, and 90.0 on MATH500, underscoring its strength on competition-style math. It is also competitive on coding benchmarks such as CodeElo and CodeForces, and performs well on GPQA-D and JEEBench.

Good For

  • Developers needing a reliable 7B model for general reasoning tasks.
  • Researchers and practitioners looking for a solid base model for further fine-tuning on specific reasoning-intensive applications.
  • Use cases requiring multi-domain problem-solving, including mathematical, logical, and coding challenges.