fairdataihub/Llama-3.1-8B-Poster-Extraction

Text Generation | Concurrency Cost: 1 | Model Size: 8B | Quant: FP8 | Ctx Length: 32k | Tool Calling: Supported | Published: Jan 7, 2026 | License: llama3.1 | Architecture: Transformer

fairdataihub/Llama-3.1-8B-Poster-Extraction is an 8-billion-parameter model built on the Llama 3.1 architecture and developed by the FAIR Data Innovations Hub. It is fine-tuned to extract structured JSON metadata from scientific conference posters, conforming to an extended DataCite-based schema. Given raw poster text, it produces metadata covering author information, titles, conference details, content sections, and captions. It supports a 32K-token context length.


Model Overview

fairdataihub/Llama-3.1-8B-Poster-Extraction is an 8-billion-parameter model built on the Llama 3.1 architecture and developed by the FAIR Data Innovations Hub. Its primary function is to transform raw text from scientific conference posters into structured JSON metadata. The model is the core component of the poster2json Python library and powers the posters.science platform, which aims to make scientific posters Findable, Accessible, Interoperable, and Reusable (FAIR).

Key Capabilities

  • Structured Metadata Extraction: Converts poster content into a detailed JSON format based on the poster-json-schema, an extension of the DataCite Metadata Schema.
  • Comprehensive Data Fields: Extracts critical information such as creators (authors, affiliations), titles, publicationYear, subjects, descriptions (abstracts), conference details, content.sections, imageCaptions, and tableCaptions.
  • High Performance: Achieves a 100% pass rate on a validation set of 10 manually annotated scientific posters, with high scores in Word Capture (0.96), ROUGE-L (0.89), and Number Capture (0.93).
  • Integration: Designed to be used via the poster2json Python library for easy integration into data processing pipelines.
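
To make the field list above concrete, here is a minimal sketch of checking a model response for the named top-level fields. It uses only the Python standard library and does not call poster2json. The sample record, its nested shapes, and the "conference" key name are assumptions for illustration; the real poster-json-schema may spell or nest these differently.

```python
import json

# Top-level fields named in the capabilities list. The exact key spelling in
# the real poster-json-schema may differ; "conference" in particular is a guess.
EXPECTED_FIELDS = [
    "creators", "titles", "publicationYear", "subjects", "descriptions",
    "conference", "content", "imageCaptions", "tableCaptions",
]


def missing_fields(record: dict) -> list[str]:
    """Return the expected top-level fields that are absent from the record."""
    missing = [field for field in EXPECTED_FIELDS if field not in record]
    # content.sections is nested, so check it separately when content exists.
    content = record.get("content")
    if isinstance(content, dict) and "sections" not in content:
        missing.append("content.sections")
    return missing


# Hypothetical model output, invented purely for illustration.
raw_output = """
{
  "creators": [{"name": "Doe, Jane", "affiliation": ["Example University"]}],
  "titles": [{"title": "An Example Poster"}],
  "publicationYear": 2025,
  "subjects": [{"subject": "example"}],
  "descriptions": [{"description": "A short abstract."}],
  "content": {"sections": [{"sectionTitle": "Methods", "sectionContent": "..."}]},
  "imageCaptions": [],
  "tableCaptions": []
}
"""

record = json.loads(raw_output)
print(missing_fields(record))  # prints ['conference']
```

A presence check like this is only a first-pass sanity test; validating against the published schema would be the stricter option.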

Use Cases

This model is ideal for researchers and developers who need to:

  • Automate the extraction of structured metadata from scientific posters.
  • Populate databases or platforms (like posters.science) with FAIR-compliant poster information.
  • Facilitate searchability and discoverability of scientific poster content.
  • Process large volumes of poster data for analysis or archiving.
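
As an illustration of the searchability use case, the sketch below builds a simple subject-to-title index from already-extracted records. The helper name `build_subject_index` is hypothetical, and the nested shapes (`titles[0]["title"]`, `subjects[i]["subject"]`) are assumed to follow DataCite-style JSON; adjust them to the actual schema.

```python
from collections import defaultdict


def build_subject_index(records: list[dict]) -> dict[str, list[str]]:
    """Map each lower-cased subject to the titles of posters tagged with it."""
    index = defaultdict(list)
    for rec in records:
        # Assumed DataCite-style shape: "titles" is a list of {"title": ...}.
        titles = rec.get("titles") or []
        title = titles[0]["title"] if titles else "(untitled)"
        # Assumed DataCite-style shape: "subjects" is a list of {"subject": ...}.
        for subj in rec.get("subjects", []):
            index[subj["subject"].lower()].append(title)
    return dict(index)


# Hypothetical extracted records, invented for illustration.
records = [
    {"titles": [{"title": "Poster A"}],
     "subjects": [{"subject": "Genomics"}, {"subject": "FAIR data"}]},
    {"titles": [{"title": "Poster B"}],
     "subjects": [{"subject": "genomics"}]},
]

index = build_subject_index(records)
print(index["genomics"])  # prints ['Poster A', 'Poster B']
```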