Orionfold/patent-strategist-v3-nemo

Hugging Face
TEXT GENERATIONConcurrency Cost:1Model Size:8BQuant:FP8Ctx Length:32kPublished:May 22, 2026License:apache-2.0Architecture:Transformer Open Weights Warm

The Orionfold/patent-strategist-v3-nemo model is a fine-tuned version of deepseek-ai/DeepSeek-R1-0528-Qwen3-8B, specifically optimized for offline patent prosecution reasoning. Trained with NeMo Framework on a 5,000-row synthetic patent-reasoning corpus, it provides chain-of-thought reasoning for complex legal tasks. This BF16 merged weights model is designed for deployment on Spark-class hardware (e.g., NVIDIA DGX Spark with 128 GB unified memory) for secure, offline workflows.

Loading preview...

Patent Strategist v3 - NeMo Framework Lane

This model, patent-strategist-v3-nemo, is a fine-tuned variant of deepseek-ai/DeepSeek-R1-0528-Qwen3-8B, developed by Orionfold LLC. It leverages NeMo Framework for training and provides BF16 merged weights, optimized for production-grade inference via Triton/TensorRT-LLM or further fine-tuning within NeMo's PEFT recipe stack.

Key Capabilities

This model excels at offline patent-prosecution reasoning, enabling secure workflows for privileged client text without relying on hosted frontier APIs. It distills DeepSeek-R1's chain-of-thought reasoning onto a specialized corpus, allowing for full IRAC-shaped reasoning chains on Spark-class hardware. Specific use cases include:

  • Claim construction: Handling Markush groups and doctrine of equivalents.
  • Office-action responses: Drafting MPEP-grounded arguments.
  • Prior-art analysis: Assessing relevance and non-obviousness reasoning chains.
  • Patent-licensing: Analyzing scenarios like most-favored-licensee and FTO.

Performance and Training

During bakeoff, this NeMo Framework lane achieved a training wall time of 5 hours 38 minutes, outperforming an Unsloth baseline. It demonstrated a probe think rate of 0.80 and a mean chain length of 1,320 tokens, indicating a 44% increase in reasoning depth over the baseline. The model is intended for patent attorneys, prosecution-team engineers, and IP-strategy teams requiring offline processing on hardware like NVIDIA DGX Spark (GB10, 128 GB unified memory).

Known Limitations

Observed bounded limitations include minor terminology drifts (e.g., "metes-and-times" instead of "metes and bounds") and a fabricated MPEP citation (§2163.05(s)), both identified as corpus-generator artifacts rather than model-wide hallucination patterns.