inference-net/Schematron-3B

Warm
Public
Parameters: 3.2B
Precision: BF16
Context length: 32768
Aug 21, 2025
License: llama3.2

Model Overview

Schematron-3B, developed by Inference.net, is a specialized 3-billion-parameter model that converts noisy HTML into strictly schema-conformant JSON. It belongs to the Schematron series, which targets long-context extraction for web scraping, data ingestion, and turning unstructured web content into structured records.

Key Capabilities

  • Schema-first extraction: Guarantees 100% schema-conformant JSON outputs, ensuring data integrity and usability.
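  • Long context: Handles lengthy, noisy HTML pages with a context window of up to 128K tokens, making it suitable for complex web documents. The 32768 shown in this listing's header is presumably the context limit of this hosted deployment rather than of the model itself.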
  • Cost-efficient: Schematron-3B delivers quality close to that of the larger 8B variant at roughly half the cost, which makes it the recommended default for most use cases.
  • Web-Augmented Factuality: When integrated into a pipeline with web search, Schematron significantly improves LLM factuality by providing structured data for answer synthesis. For instance, it improved GPT-5 Nano's accuracy from 8.54% to 82.87% on SimpleQA.
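
Even with schema-first extraction, a downstream pipeline will often re-check each record before storing it. The following is a minimal sketch of such a check, not part of Schematron itself. It uses only the Python standard library and covers a small subset of JSON Schema (flat objects with typed, required fields). `PRODUCT_SCHEMA`, `conforms`, and `parse_and_check` are hypothetical names chosen for illustration, and rejecting unknown keys is a stricter choice than JSON Schema's default.

```python
import json

# Hypothetical target schema: a flat object with typed fields.
PRODUCT_SCHEMA = {
    "type": "object",
    "properties": {
        "name": {"type": "string"},
        "price": {"type": "number"},
        "in_stock": {"type": "boolean"},
    },
    "required": ["name", "price"],
}

# Mapping from JSON Schema type names to Python types.
_PY_TYPES = {
    "string": str,
    "number": (int, float),
    "integer": int,
    "boolean": bool,
    "array": list,
    "object": dict,
}


def conforms(record, schema):
    """Check a flat record against a small subset of JSON Schema."""
    if not isinstance(record, dict):
        return False
    props = schema.get("properties", {})
    if any(key not in record for key in schema.get("required", [])):
        return False
    for key, value in record.items():
        if key not in props:
            return False  # assumption: unknown keys count as non-conformant
        type_name = props[key]["type"]
        # bool is a subclass of int in Python, so reject it for numeric fields.
        if isinstance(value, bool) and type_name in ("number", "integer"):
            return False
        if not isinstance(value, _PY_TYPES[type_name]):
            return False
    return True


def parse_and_check(raw_output, schema):
    """Parse raw model output; return the record, or None if it is invalid."""
    try:
        record = json.loads(raw_output)
    except json.JSONDecodeError:
        return None
    return record if conforms(record, schema) else None
```

In a real pipeline a full validator would replace `conforms`. The point of the sketch is only that the schema sent to the model can also serve as the acceptance test for its output.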

Good for

  • Web Scraping: Efficiently extracts structured data from websites, even with complex or noisy HTML.
  • Data Ingestion: Transforms arbitrary web pages into clean, typed JSON records for database storage or further processing.
  • Improving LLM Factuality: Can be used in conjunction with other LLMs and web search to provide structured, factual data, enhancing the accuracy of generated responses.
  • Deterministic Output: Ideal for applications requiring strictly valid and predictable JSON outputs, such as API integrations or automated data pipelines.
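
For the scraping and ingestion uses above, a common preprocessing step is to strip obvious noise from a page before handing it to the model, which also helps long pages fit the context window. The sketch below is one possible approach using only the Python standard library. `NoiseStripper`, `clean_html`, and `build_prompt` are hypothetical helpers, and the prompt layout is an assumption for illustration, not Schematron's documented input format. Dropping all attributes is likewise a simplification that would lose values such as link targets.

```python
import html
import json
from html.parser import HTMLParser


class NoiseStripper(HTMLParser):
    """Rebuild HTML while dropping <script>/<style> content, comments, and attributes."""

    SKIP = {"script", "style"}

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.parts = []
        self.skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self.skip_depth += 1
        elif self.skip_depth == 0:
            self.parts.append(f"<{tag}>")  # attributes dropped to save tokens

    def handle_endtag(self, tag):
        if tag in self.SKIP:
            self.skip_depth = max(0, self.skip_depth - 1)
        elif self.skip_depth == 0:
            self.parts.append(f"</{tag}>")

    def handle_data(self, data):
        text = data.strip()
        if self.skip_depth == 0 and text:
            # Re-escape text because character references were decoded on input.
            self.parts.append(html.escape(text, quote=False))


def clean_html(raw_html):
    """Return a reduced version of raw_html with scripts, styles, and comments removed."""
    stripper = NoiseStripper()
    stripper.feed(raw_html)
    stripper.close()
    return "".join(stripper.parts)


def build_prompt(schema, raw_html):
    """Combine a JSON Schema and cleaned HTML into one prompt string (assumed layout)."""
    return (
        "Extract data from the HTML below as JSON matching this schema.\n"
        f"Schema:\n{json.dumps(schema, indent=2)}\n"
        f"HTML:\n{clean_html(raw_html)}"
    )
```

The string returned by `build_prompt` would then be sent to whichever endpoint serves the model. That call is omitted here because the serving API is not described in this listing.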