Name: numind/NuExtract API
Brand: Featherless.ai
Author: numind

NuExtract: Specialized Information Extraction

NuExtract, developed by NuMind, is a roughly 4 billion parameter model (3.8B) fine-tuned from Microsoft's Phi-3-mini-4k-instruct. It is trained on a high-quality synthetic dataset to perform structured information extraction from text.

Key Capabilities

Purely Extractive: Trained to return only text that appears verbatim in the original input, which limits hallucinated values.
JSON Template-Driven: Users provide a JSON template to define the desired output structure, enabling precise and customizable data extraction.
Example-Based Guidance: Supports providing output formatting examples to further refine extraction accuracy for complex tasks.
Context Length: Processes inputs up to 4096 tokens. The template, any examples, and the source text share this budget, so longer documents need to be split into chunks before extraction.
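
To make the template-driven workflow concrete, here is a minimal sketch of how a prompt might be assembled from a JSON template, optional examples, and the input text. The marker strings (`<|input|>`, `### Template:`, `### Example:`, `### Text:`, `<|output|>`) and the convention that empty strings mark the fields to fill are assumptions based on the prompt layout published with the model, and should be checked against the current model card. The function name `build_nuextract_prompt` and the sample data are hypothetical.

```python
import json


def build_nuextract_prompt(text, template, examples=()):
    """Assemble a prompt: template first, optional examples, then the text.

    The marker strings below are an assumed layout, not a verified spec.
    """
    # Serialize the template with a fixed indentation so it is valid JSON.
    prompt = "<|input|>\n### Template:\n" + json.dumps(template, indent=4) + "\n"
    # Each example is a filled-in copy of the template showing the desired format.
    for example in examples:
        prompt += "### Example:\n" + json.dumps(example, indent=4) + "\n"
    # The source text comes last, followed by the marker that starts generation.
    prompt += "### Text:\n" + text + "\n<|output|>\n"
    return prompt


# Hypothetical template: empty strings mark the fields to be extracted.
template = {"Company": "", "Person": {"Name": "", "Role": ""}}
text = "Acme Corp hired Jane Doe as CTO in 2021."

prompt = build_nuextract_prompt(text, template)
print(prompt)
```

The resulting string is what would be tokenized and sent to the model; its token count, not just the length of the source text, has to stay within the 4096-token limit.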

Good For

Structured Data Extraction: Ideal for converting unstructured text into structured JSON formats.
Automated Data Processing: Useful for tasks requiring the precise retrieval of specific entities or facts from documents.
Customizable Extraction: Adapts to diverse extraction needs through user-defined JSON schemas.
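
In an automated pipeline it can be worth verifying the extractive property rather than relying on it. The sketch below walks a JSON output and reports any non-empty string value that does not occur verbatim in the source text. The helper names `iter_string_leaves` and `non_extractive_values` and the sample strings are hypothetical, and the check assumes the model output has already been isolated as a JSON string.

```python
import json


def iter_string_leaves(node):
    """Yield every string value found in a nested dict/list structure."""
    if isinstance(node, str):
        yield node
    elif isinstance(node, dict):
        for value in node.values():
            yield from iter_string_leaves(value)
    elif isinstance(node, list):
        for item in node:
            yield from iter_string_leaves(item)


def non_extractive_values(source_text, output_json):
    """Return extracted strings that are not verbatim substrings of the source."""
    extracted = json.loads(output_json)
    # Empty strings mean "nothing found" and are skipped.
    return [v for v in iter_string_leaves(extracted) if v and v not in source_text]


source = "Acme Corp hired Jane Doe as CTO in 2021."
good = '{"Company": "Acme Corp", "Person": {"Name": "Jane Doe", "Role": "CTO"}}'
bad = '{"Company": "Acme Corporation", "Person": {"Name": "Jane Doe", "Role": ""}}'

print(non_extractive_values(source, good))  # []
print(non_extractive_values(source, bad))   # ['Acme Corporation']
```

A non-empty result flags a record for review or retry. The comparison is exact and case-sensitive, so any whitespace normalization applied to the input before extraction should also be applied to `source_text` here.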

NuMind also offers a smaller 0.5B version, NuExtract-tiny, and a larger 7B version, NuExtract-large, for different compute and accuracy requirements.