Name: fairdataihub/Llama-3.1-8B-Poster-Extraction API
Brand: Featherless.ai
Price: 10.00 USD
Availability: InStock
Author: fairdataihub

Model Overview

fairdataihub/Llama-3.1-8B-Poster-Extraction is an 8-billion-parameter model built on the Llama 3.1 architecture and developed by the FAIR Data Innovations Hub. It transforms raw text from scientific conference posters into structured JSON metadata. The model is the core component of the poster2json Python library and powers the posters.science platform, which aims to make scientific posters Findable, Accessible, Interoperable, and Reusable (FAIR).

Key Capabilities

Structured Metadata Extraction: Converts poster content into a detailed JSON format based on the poster-json-schema, an extension of the DataCite Metadata Schema.
Comprehensive Data Fields: Extracts critical information such as creators (authors, affiliations), titles, publicationYear, subjects, descriptions (abstracts), conference details, content.sections, imageCaptions, and tableCaptions.
High Performance: Achieves a 100% pass rate on a validation set of 10 manually annotated scientific posters, with high scores in Word Capture (0.96), ROUGE-L (0.89), and Number Capture (0.93).
Integration: Designed to be used via the poster2json Python library for easy integration into data processing pipelines.
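
To make the field list above concrete, here is a minimal Python sketch of what one extracted record might look like and how a pipeline could check it before use. The field names come from the capabilities list. The nesting, the value shapes, the sample values, and the helper missing_fields are assumptions made for illustration; they are not taken from the poster-json-schema or from the poster2json API.

```python
import json

# Field names are taken from the capabilities list; how they nest is an assumption.
EXPECTED_TOP_LEVEL = [
    "creators", "titles", "publicationYear", "subjects", "descriptions",
    "conference", "content", "imageCaptions", "tableCaptions",
]

def missing_fields(record: dict) -> list[str]:
    """Return the expected fields that are absent from a parsed record."""
    missing = [key for key in EXPECTED_TOP_LEVEL if key not in record]
    content = record.get("content")
    # Only report content.sections separately when "content" itself is present.
    if "content" in record and not (isinstance(content, dict) and "sections" in content):
        missing.append("content.sections")
    return missing

# Hypothetical model output; every value here is invented for the example.
sample_output = """
{
  "creators": [{"name": "Doe, Jane", "affiliation": ["Example University"]}],
  "titles": [{"title": "An Example Poster"}],
  "publicationYear": 2024,
  "subjects": ["metadata"],
  "descriptions": [{"description": "A short abstract."}],
  "conference": {"name": "Example Conference"},
  "content": {"sections": [{"heading": "Methods", "text": "..."}]},
  "imageCaptions": ["Figure 1. Workflow."],
  "tableCaptions": []
}
"""

record = json.loads(sample_output)
print(missing_fields(record))
```

A check like this only confirms that the keys are present. Validating against the actual poster-json-schema would require the schema file itself.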

Use Cases

This model is ideal for researchers and developers who need to:

Automate the extraction of structured metadata from scientific posters.
Populate databases or platforms (like posters.science) with FAIR-compliant poster information.
Facilitate searchability and discoverability of scientific poster content.
Process large volumes of poster data for analysis or archiving.
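
As an illustration of the database and searchability use cases above, the following sketch stores extracted records in an in-memory SQLite table and searches them by title. The table layout, the helper names (create_store, add_record, search_titles), and the sample records are hypothetical. The sketch assumes each record carries a DataCite-style titles list and a publicationYear field.

```python
import json
import sqlite3

def create_store(conn: sqlite3.Connection) -> None:
    """Create a small table holding one row per extracted poster record."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS posters ("
        "id INTEGER PRIMARY KEY, title TEXT, year INTEGER, metadata TEXT)"
    )

def add_record(conn: sqlite3.Connection, record: dict) -> int:
    """Insert one record; the full JSON is kept alongside the indexed columns."""
    titles = record.get("titles") or [{}]
    title = titles[0].get("title", "")
    cur = conn.execute(
        "INSERT INTO posters (title, year, metadata) VALUES (?, ?, ?)",
        (title, record.get("publicationYear"), json.dumps(record)),
    )
    return cur.lastrowid

def search_titles(conn: sqlite3.Connection, term: str) -> list[dict]:
    """Return the stored records whose title contains the given term."""
    rows = conn.execute(
        "SELECT metadata FROM posters WHERE title LIKE ? ORDER BY id",
        (f"%{term}%",),
    )
    return [json.loads(row[0]) for row in rows]

# Hypothetical records standing in for model output.
conn = sqlite3.connect(":memory:")
create_store(conn)
add_record(conn, {"titles": [{"title": "Soil Microbiome Survey"}], "publicationYear": 2023})
add_record(conn, {"titles": [{"title": "Coral Reef Imaging"}], "publicationYear": 2024})

matches = search_titles(conn, "reef")
print([m["titles"][0]["title"] for m in matches])
```

Keeping the full JSON in one column preserves every extracted field, while the title and year columns give simple filters without parsing each record.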
