Name: carsonarkova/nessie-v5-llama-3.1-8b API
Brand: Featherless.ai
Price: 10.00 USD
Availability: InStock
Author: carsonarkova

Overview

Nessie v5 is Arkova's specialized credential metadata extraction model, built upon the Meta Llama 3.1 8B Instruct base model. It is fine-tuned for structured data extraction from PII-stripped document text, leveraging a 32,768 token context length.

Key Capabilities

Specialized Extraction: Designed to extract structured metadata from various credential types, including DEGREE, LICENSE, CERTIFICATE, FINANCIAL, LEGAL, and more.
Domain-Specific Adapters: Incorporates LoRA adapters trained on extensive corpora for SEC filings (45K examples), Academic documents (45K examples), Legal texts (13K examples), and Regulatory documents (13K examples).
Performance: Achieves a Weighted F1 score of 87.2% and a Macro F1 of 75.7% on its validation set for metadata extraction.
PII-Stripped Processing: Intended for use with pre-processed text where personally identifiable information has been removed.

Good For

Automated Credential Processing: Ideal for applications requiring the extraction of specific metadata fields from a wide range of credential documents.
Legal and Financial Document Analysis: Particularly strong in domains like SEC filings, legal documents, and academic records due to its specialized training.
Structured Data Generation: Useful for converting unstructured credential text into structured, queryable data formats.

Important Note

This model requires the use of its trained condensed prompt (~1.5K characters); using the full extraction prompt (58K characters) will result in 0% F1 due to a prompt template mismatch.