occ-ai/OCC-RAG-1.7B

Text Generation | Concurrency Cost: 1 | Model Size: 2B | Quant: BF16 | Ctx Length: 32k | Tool Calling: Supported | Published: May 29, 2026 | License: mit | Architecture: Transformer | Open Weights

OCC-RAG-1.7B is a 1.7 billion parameter small language model developed by occ-ai, specialized for faithful, context-grounded question answering. Mid-trained from Qwen/Qwen3-1.7B-Base, it produces structured reasoning traces with explicit source citations and can abstain when context does not support an answer. This model matches or exceeds general-purpose models 2-6x larger on multi-hop reasoning, faithfulness, and refusal benchmarks, achieving the best faithfulness across all evaluated scales up to 32B parameters. It is designed for applications requiring transparent, verifiable answers directly from provided sources.

OCC-RAG-1.7B: Specialized for Faithful Question Answering

OCC-RAG-1.7B is a 1.7 billion parameter model from the Optimal Cognitive Core (OCC) family, specifically designed for faithful, context-grounded question answering. It is mid-trained from Qwen/Qwen3-1.7B-Base on a synthetic corpus of ~3.25M QA pairs, emphasizing multi-hop and multi-context reasoning.

Key Capabilities & Differentiators

  • Faithful by Design: Answers exclusively from provided context, achieving the lowest memorization ratio (5.0 on ConFiQA) across all evaluated models, including those up to 32B parameters.
  • Calibrated Abstention: Outputs "Not enough information" when the context does not support an answer, instead of producing an unsupported one.
  • Structured, Citable Reasoning: Generates transparent reasoning traces (query analysis → source analysis → reasoning → status → answer) with explicit source citations.
  • Compact & Efficient: Despite its small size, it rivals or surpasses models 2-6x larger on multi-hop reasoning, faithfulness, and refusal benchmarks, offering chain-of-thought-level transparency at a lower computational cost.
  • Performance: Closes the gap with Qwen3-4B on multi-hop reasoning and achieves superior faithfulness and refusal accuracy compared to many larger models.
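
The capabilities above suggest a simple calling pattern: number the sources in the prompt, then read the answer, its citations, and any abstention out of the response. The sketch below is illustrative only. The prompt layout, the "Answer:" label, and the [n] citation style are assumptions, not the model's documented format, and build_prompt and parse_response are hypothetical helper names. Only the abstention string is taken from this page.

```python
import re

# Abstention string quoted in the model card.
ABSTAIN = "Not enough information"


def build_prompt(question, sources):
    """Number each source so a response can cite it as [1], [2], ...

    Assumption: this layout is illustrative, not the model's real template.
    """
    numbered = "\n".join(f"[{i}] {text}" for i, text in enumerate(sources, start=1))
    return f"Sources:\n{numbered}\n\nQuestion: {question}"


def parse_response(text):
    """Pull the final answer, cited source numbers, and an abstention flag."""
    # Assumption: the final answer follows a line-initial "Answer:" label.
    match = re.search(r"^Answer:\s*(.*)", text, flags=re.MULTILINE | re.DOTALL)
    answer = match.group(1).strip() if match else text.strip()
    # Assumption: citations look like [1]; collect them from the whole trace.
    citations = sorted({int(n) for n in re.findall(r"\[(\d+)\]", text)})
    # Treat the answer as an abstention if it is the abstain string,
    # ignoring case and a trailing period.
    abstained = answer.rstrip(".").strip().lower() == ABSTAIN.lower()
    return {"answer": answer, "citations": citations, "abstained": abstained}
```

A caller would pass the output of build_prompt to whatever inference stack it uses and hand the generated text to parse_response. Checking the abstained flag before showing an answer lets the application handle unsupported questions explicitly.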

Ideal Use Cases

  • Retrieval-Augmented Generation (RAG) Systems: Suited to applications where answers must be strictly grounded in provided documents.
  • Verifiable QA: Suitable for scenarios requiring transparent, auditable answers with source attribution.
  • Resource-Constrained Environments: Its compact size allows for deployment in constrained infrastructure, including desktop systems.
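
As a rough check on the resource-constrained claim, the weight footprint can be estimated from the parameter count and the BF16 format, which stores each parameter in 2 bytes. This is a back-of-the-envelope sketch: it uses the nominal 1.7 billion figure (the exact count may differ), covers weights only, and ignores the KV cache, activations, and runtime overhead. weight_memory_gb is a hypothetical helper name.

```python
def weight_memory_gb(n_params, bytes_per_param=2):
    """Approximate size of the model weights alone, in decimal gigabytes.

    bytes_per_param defaults to 2 because BF16 uses 16 bits per value.
    """
    return n_params * bytes_per_param / 1e9


# Nominal parameter count from the model card (assumed, not exact).
print(f"{weight_memory_gb(1.7e9):.1f} GB")  # prints "3.4 GB"
```

Actual memory use at inference time will be higher and grows with the length of the supplied context.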

Limitations

  • Context-Grounding Only: The model is trained to answer solely from supplied sources and does not leverage parametric knowledge.
  • Reasoning Depth: Optimized for up to three-hop reasoning; longer chains may be out of distribution.