OpenRubrics/RubricARM-8B-Rubric

Task: Text Generation | Concurrency Cost: 1 | Model Size: 8B | Quant: FP8 | Ctx Length: 32k | Published: Feb 1, 2026 | Architecture: Transformer

OpenRubrics/RubricARM-8B-Rubric is an 8-billion-parameter language model, fine-tuned from Qwen3/Qwen3-8B, that extracts rubric-style evaluation criteria from user requests. It generates universal, principle-based criteria and labels each one as either a hard rule or an abstract principle. Its primary application is building comprehensive yet concise rubrics for evaluating LLM responses, with criteria that are distinct and free of topic-specific references.


Model Overview

OpenRubrics/RubricARM-8B-Rubric is an 8-billion-parameter model fine-tuned from Qwen3/Qwen3-8B. Its core function is to generate structured, rubric-style evaluation criteria from user prompts, as described in its accompanying research paper.

Key Capabilities

  • Rubric Extraction: Specializes in identifying and formulating evaluation rubrics from natural language requests.
  • Categorization: Distinguishes between two types of rubric items:
    • [Hard Rule]: Derived from explicit requirements (e.g., format, length, forbidden elements).
    • [Principle]: Abstracted, domain-agnostic quality criteria (e.g., clarity, correctness, sound reasoning).
  • Universality: Phrases every rubric item in general terms, free of topic-specific references such as names, places, or numbers.
  • Comprehensiveness: Aims to cover all critical aspects implied by a request, including both explicit and implicit quality standards.
  • Conciseness & Uniqueness: Merges redundant criteria and ensures each rubric item captures a distinct evaluation point with precise wording.
  • Structured Output: Formats rubrics as a numbered list in which each item begins with "The response" and ends with a [Hard Rule] or [Principle] tag.
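
The structured output format described above lends itself to simple post-processing. The following is a minimal sketch, not part of the model's tooling: it assumes each rubric item sits on its own line as "N. The response ... [Hard Rule]" or "N. The response ... [Principle]". The names `RubricItem`, `parse_rubric`, and `has_digits` are hypothetical helpers, and the sample text is hand-written for illustration rather than actual model output.

```python
import re
from dataclasses import dataclass
from typing import List

# Assumed line shape: "<number>. The response ... [Hard Rule]" or "... [Principle]".
ITEM_RE = re.compile(
    r"^\s*(\d+)[.)]\s*(The response\b.*?)\s*\[(Hard Rule|Principle)\]\s*$"
)


@dataclass
class RubricItem:
    index: int      # position in the numbered list
    text: str       # criterion text, starting with "The response"
    category: str   # "Hard Rule" or "Principle"


def parse_rubric(raw: str) -> List[RubricItem]:
    """Collect every line that matches the assumed rubric format; skip the rest."""
    items = []
    for line in raw.splitlines():
        match = ITEM_RE.match(line)
        if match:
            items.append(
                RubricItem(int(match.group(1)), match.group(2), match.group(3))
            )
    return items


def has_digits(item: RubricItem) -> bool:
    """Rough heuristic: a digit in the text may signal a topic-specific number."""
    return any(ch.isdigit() for ch in item.text)


# Hand-written illustration of the format, not real model output.
sample = """1. The response is written as a single paragraph. [Hard Rule]
2. The response presents its reasoning in a clear, logical order. [Principle]
Some stray text that is not a rubric item."""

for item in parse_rubric(sample):
    print(item.index, item.category, "|", item.text)
```

Lines that do not match the pattern are silently skipped, which keeps the parser tolerant of any extra text around the list; a stricter pipeline might instead reject such output.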

Ideal Use Cases

This model is particularly well-suited for developers and researchers focused on:

  • Automating the creation of evaluation rubrics for LLM outputs.
  • Establishing consistent and objective criteria for assessing response quality.
  • Developing systems that require structured feedback mechanisms based on user instructions.
  • Applications where abstracting specific requirements into universal principles is crucial for evaluation.