Model Overview
inference-net/Schematron-8B is an 8-billion-parameter model in the Schematron series, developed by Inference.net. It is a specialized long-context extraction model that converts noisy HTML into clean, typed JSON conforming to a provided schema. It is particularly suited to web scraping, data ingestion, and turning unstructured web content into structured records.
Key Capabilities
- Schema-first extraction: Guarantees 100% schema-conformant JSON outputs, ensuring data integrity and usability.
- Long context handling: Robustly processes lengthy and noisy HTML inputs. The Schematron series supports up to 128K tokens, although this model is listed with a 32K context window.
- HTML-to-JSON conversion: Efficiently transforms raw or cleaned HTML into strictly valid JSON, without additional narration.
- High extraction quality: Achieves an LLM-as-Judge score of 4.64 for HTML-to-JSON extraction, closely matching GPT-4.1.
- Enhanced factuality: When paired with web retrieval, Schematron significantly improves LLM factuality, boosting accuracy from 8.54% to 82.87% for models like GPT-5 Nano on tasks like SimpleQA.
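As a rough illustration of the schema-first workflow from the caller's side, the sketch below pairs a schema with an HTML snippet and runs a shallow conformance check on a returned record. The schema fields, the prompt layout in `build_prompt`, and the `conforms` helper are all hypothetical; this is not the official prompt format, and a full JSON Schema validator would be the better choice in production.

```python
import json

# Hypothetical schema for a product page; field names are illustrative only.
PRODUCT_SCHEMA = {
    "type": "object",
    "properties": {
        "name": {"type": "string"},
        "price": {"type": "number"},
        "in_stock": {"type": "boolean"},
    },
    "required": ["name", "price"],
}

# Maps JSON Schema type names to Python types for a shallow check.
_PY_TYPES = {"string": str, "number": (int, float), "boolean": bool}


def build_prompt(schema: dict, html: str) -> str:
    """Assemble a schema-plus-HTML prompt (assumed layout, not the official one)."""
    return (
        "Extract data from the HTML below as JSON matching this schema.\n"
        f"Schema:\n{json.dumps(schema, indent=2)}\n"
        f"HTML:\n{html}"
    )


def conforms(record: object, schema: dict) -> bool:
    """Shallow check: required keys present, no unknown keys, top-level types match."""
    if not isinstance(record, dict):
        return False
    if any(key not in record for key in schema.get("required", [])):
        return False
    for key, value in record.items():
        spec = schema["properties"].get(key)
        if spec is None:
            return False
        # bool is a subclass of int in Python, so reject it for "number".
        if spec["type"] == "number" and isinstance(value, bool):
            return False
        if not isinstance(value, _PY_TYPES[spec["type"]]):
            return False
    return True
```

A caller would send `build_prompt(PRODUCT_SCHEMA, html)` to the model through whatever client it uses, then pass `json.loads(output)` to `conforms` before trusting the record.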
Good For
- Web scraping: Ideal for extracting structured data from complex and noisy web pages.
- Data ingestion: Facilitates the conversion of arbitrary web content into structured records for databases or analytics.
- Improving LLM factuality: Can be integrated into pipelines to provide structured, factual data from web searches, enhancing the accuracy of primary LLMs.
- Deterministic JSON output: Ensures reliable and parseable JSON output, critical for automated data processing.
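The data ingestion use case can be sketched as follows, assuming the model's output is a single JSON object with `name` and `price` fields. The `products` table, its columns, and the `ingest` function are hypothetical and only illustrate how narration-free JSON can go straight into a database.

```python
import json
import sqlite3


def ingest(raw_output: str, conn: sqlite3.Connection) -> int:
    """Parse one JSON object and store it as a row; returns the new row id."""
    # json.loads raises json.JSONDecodeError if the output is not valid JSON.
    record = json.loads(raw_output)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS products "
        "(name TEXT NOT NULL, price REAL NOT NULL)"
    )
    # Parameterized insert keeps scraped values out of the SQL text.
    cur = conn.execute(
        "INSERT INTO products (name, price) VALUES (?, ?)",
        (record["name"], record["price"]),
    )
    conn.commit()
    return cur.lastrowid
```

Because the output carries no extra narration, no regex or post-hoc cleanup step is needed between the model and `json.loads`.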
Limitations
- Processes static HTML only; client-side rendered content must be handled upstream.
- Very large pages may require truncation.
- Output quality can depend on the clarity and explicitness of the provided JSON schema.
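One way to handle oversized pages upstream is to drop content that rarely carries extractable data and then cut to a size budget. The sketch below uses Python's standard `html.parser`; the `clean_html` and `truncate` helpers are hypothetical, and the character budget is only a crude stand-in for a token limit, since the real count depends on the tokenizer.

```python
import html
from html.parser import HTMLParser


class _Cleaner(HTMLParser):
    """Re-emits HTML while dropping <script>/<style> elements, comments, and doctypes."""

    SKIP = {"script", "style"}

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.parts = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self._skip_depth += 1
        elif self._skip_depth == 0:
            self.parts.append(self.get_starttag_text())

    def handle_startendtag(self, tag, attrs):
        # Self-closing tags such as <br/> are passed through unchanged.
        if self._skip_depth == 0:
            self.parts.append(self.get_starttag_text())

    def handle_endtag(self, tag):
        if tag in self.SKIP:
            self._skip_depth = max(0, self._skip_depth - 1)
        elif self._skip_depth == 0:
            self.parts.append(f"</{tag}>")

    def handle_data(self, data):
        if self._skip_depth == 0:
            # Text arrives unescaped, so escape it again on the way out.
            self.parts.append(html.escape(data, quote=False))


def clean_html(raw: str) -> str:
    """Return the HTML with scripts, styles, comments, and doctypes removed."""
    parser = _Cleaner()
    parser.feed(raw)
    parser.close()  # flush any text still buffered by the parser
    return "".join(parser.parts)


def truncate(text: str, max_chars: int) -> str:
    """Cut to at most max_chars, preferring the last tag boundary before the limit."""
    if len(text) <= max_chars:
        return text
    cut = text.rfind(">", 0, max_chars)
    return text[: cut + 1] if cut != -1 else text[:max_chars]
```

Truncation of this kind can leave unclosed tags and may discard the fields the schema asks for, so cleaning first and truncating only as a last resort is usually the safer order.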