Model Overview
inference-net/Schematron-3B is a specialized 3.2 billion parameter model developed by Inference.net, designed for robust HTML-to-JSON extraction. It is part of the Schematron series, which focuses on converting noisy web content into strictly valid, schema-conformant JSON.
Key Capabilities
- Schema-First Extraction: Guarantees 100% schema-conformant JSON outputs, making it ideal for structured data extraction tasks.
- Long Context Handling: Processes lengthy and noisy HTML inputs effectively, supporting a context window of up to 128K tokens.
- Optimized for Cost-Efficiency: The 3B variant delivers quality close to that of the larger 8B model at approximately half the cost, making it the recommended default for most use cases.
- Input/Output: Takes cleaned HTML and a JSON Schema describing the target structure as input, and outputs strictly valid JSON with no additional narration.
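As a rough illustration of the schema-first contract described above, the sketch below parses a model response and checks it against a target schema before it is used downstream. Everything here is an assumption made for illustration: `PRODUCT_SCHEMA`, `conforms`, `parse_model_output`, and the sample output string are hypothetical, and the checker covers only a small subset of JSON Schema (flat objects, `required`, and a few primitive types). It is not Schematron's own validation logic.

```python
import json

# Hypothetical target schema, written as a plain JSON Schema dict.
PRODUCT_SCHEMA = {
    "type": "object",
    "properties": {
        "name": {"type": "string"},
        "price": {"type": "number"},
        "in_stock": {"type": "boolean"},
    },
    "required": ["name", "price"],
}

# Maps JSON Schema type names to Python types (subset only).
_TYPES = {
    "string": str,
    "number": (int, float),
    "boolean": bool,
    "object": dict,
    "array": list,
}


def conforms(record, schema):
    """Check a flat object against a small subset of JSON Schema."""
    if not isinstance(record, dict):
        return False
    # Every required key must be present.
    for key in schema.get("required", []):
        if key not in record:
            return False
    # Every known key must carry a value of the declared type.
    for key, value in record.items():
        spec = schema.get("properties", {}).get(key)
        if spec is None:
            continue  # this sketch tolerates undeclared keys
        # bool is a subclass of int in Python, so reject it for "number".
        if spec["type"] == "number" and isinstance(value, bool):
            return False
        if not isinstance(value, _TYPES[spec["type"]]):
            return False
    return True


def parse_model_output(raw_text, schema):
    """Parse the model's JSON text and fail loudly on a schema mismatch."""
    record = json.loads(raw_text)
    if not conforms(record, schema):
        raise ValueError("model output does not match the target schema")
    return record


# Stand-in for the text an extraction model might return.
sample_output = '{"name": "Desk Lamp", "price": 24.99, "in_stock": true}'
record = parse_model_output(sample_output, PRODUCT_SCHEMA)
```

In a real pipeline a full JSON Schema validator would normally replace the hand-written `conforms` helper. The point of the sketch is only that output with no surrounding narration can go straight into `json.loads` and a schema check.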
Performance Highlights
In evaluations judged by Gemini 2.5 Pro, Schematron-3B scores 4.41 for HTML-to-JSON extraction quality, closely trailing Schematron-8B (4.64) and GPT-4.1 (4.74). In web-augmented factuality tests on SimpleQA, pairing Schematron with web retrieval significantly improved LLM accuracy, which demonstrates that supplying structured data extracted from web pages helps factual retrieval. Passing this structured output to the LLM also required notably fewer tokens than passing raw HTML.
Good for
- Web Scraping: Efficiently extracts structured data from web pages.
- Data Ingestion: Transforms arbitrary web content into structured records for databases or analytics.
- Improving LLM Factuality: Can be integrated into pipelines to provide structured, factual data to general-purpose LLMs, significantly boosting their accuracy on knowledge-intensive tasks.
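The factuality use case can be pictured as a small pipeline step: records already extracted from web pages are serialized compactly and placed in the prompt of a general-purpose LLM in place of raw HTML. The sketch below is a minimal illustration under stated assumptions. `build_context`, `build_prompt`, the prompt wording, the character budget, and the sample data are all hypothetical, and character counts stand in only as a crude proxy for token counts.

```python
import json


def build_context(records, max_chars=2000):
    """Serialize records one compact JSON object per line, within a size budget."""
    lines = []
    used = 0
    for rec in records:
        line = json.dumps(rec, separators=(",", ":"), sort_keys=True)
        # Stop before exceeding the budget (+1 accounts for the newline).
        if used + len(line) + 1 > max_chars:
            break
        lines.append(line)
        used += len(line) + 1
    return "\n".join(lines)


def build_prompt(question, records):
    """Assemble a prompt that grounds the question in the extracted records."""
    context = build_context(records)
    return (
        "Answer using only the facts below.\n\n"
        f"Facts:\n{context}\n\n"
        f"Question: {question}\n"
    )


# Stand-in for a retrieved page and for the records extracted from it.
raw_html = (
    "<html><body><nav>Home | About</nav>"
    "<div class='card'><h1>Desk Lamp</h1><span class='p'>$24.99</span></div>"
    "<footer>Contact us</footer></body></html>"
)
extracted = [{"name": "Desk Lamp", "price": 24.99}]

prompt = build_prompt("How much does the desk lamp cost?", extracted)
```

Because navigation, markup, and footer text never reach the prompt, the context stays small even when the source page is long, which is the effect the evaluation above attributes to structured extraction.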