inference-net/Schematron-8B

Parameters: 8B · Quantization: FP8 · Context window: 32,768 tokens · Hosted on Hugging Face

Model Overview

inference-net/Schematron-8B is an 8 billion parameter model from the Schematron series, developed by Inference.net. It is a specialized long-context extraction model designed to convert noisy HTML into clean, typed JSON according to a provided schema. This model is particularly suited for web scraping, data ingestion, and transforming unstructured web content into structured records.

Key Capabilities

  • Schema-first extraction: Guarantees 100% schema-conformant JSON outputs, ensuring data integrity and usability.
  • Long context handling: Robustly processes lengthy, noisy HTML inputs within this model's 32K-token context window; larger variants in the Schematron series support up to 128K tokens.
  • HTML-to-JSON conversion: Efficiently transforms raw or cleaned HTML into strictly valid JSON, without additional narration.
  • High extraction quality: Achieves an LLM-as-Judge score of 4.64 for HTML-to-JSON extraction, closely matching GPT-4.1.
  • Enhanced factuality: When paired with web retrieval, Schematron significantly improves LLM factuality, boosting GPT-5 Nano's accuracy on the SimpleQA benchmark from 8.54% to 82.87%.
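As a minimal sketch of schema-first usage, the request typically pairs a JSON schema with the raw HTML in a chat-style prompt. The schema fields, prompt wording, and message layout below are illustrative assumptions, not the model's documented API; consult the official usage instructions for the canonical format.

```python
import json

# Hypothetical schema describing the record we want extracted.
SCHEMA = {
    "type": "object",
    "properties": {
        "title": {"type": "string"},
        "price": {"type": "number"},
        "in_stock": {"type": "boolean"},
    },
    "required": ["title", "price", "in_stock"],
}


def build_extraction_messages(html: str, schema: dict) -> list[dict]:
    """Build a chat-style request pairing the target schema with raw HTML.

    The exact prompt wording expected by Schematron is an assumption here.
    """
    return [
        {
            "role": "system",
            "content": (
                "Extract data from the HTML and reply only with JSON "
                "matching this schema:\n" + json.dumps(schema)
            ),
        },
        {"role": "user", "content": html},
    ]


messages = build_extraction_messages("<html><h1>Widget</h1></html>", SCHEMA)
```

The resulting `messages` list can be sent to any OpenAI-compatible chat endpoint serving the model.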

Good For

  • Web scraping: Ideal for extracting structured data from complex and noisy web pages.
  • Data ingestion: Facilitates the conversion of arbitrary web content into structured records for databases or analytics.
  • Improving LLM factuality: Can be integrated into pipelines to provide structured, factual data from web searches, enhancing the accuracy of primary LLMs.
  • Deterministic JSON output: Ensures reliable and parseable JSON output, critical for automated data processing.

Limitations

  • Processes static HTML only; client-side rendered content must be handled upstream.
  • Very large pages may require truncation.
  • Output quality can depend on the clarity and explicitness of the provided JSON schema.
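For pages that exceed the context window, a simple character-budget cap is one way to truncate upstream. The ~4-characters-per-token heuristic and the budget below are assumptions; trimming with the model's own tokenizer is more precise.

```python
def truncate_html(html: str, max_chars: int = 120_000) -> str:
    """Naively cap page size before sending it to the model.

    Roughly 4 characters per token is a common heuristic, so 120,000
    characters leaves headroom inside a 32K-token context window.
    Truncating mid-document can drop content, so prefer stripping
    boilerplate (scripts, styles, nav) before resorting to this.
    """
    if len(html) <= max_chars:
        return html
    return html[:max_chars]
```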