expertai/SLIMER-PARALLEL-LLaMA3

8B
FP8
32768
License: llama3.1

SLIMER-PARALLEL-LLaMA3: Zero-Shot NER with Enhanced Parallelism

SLIMER-PARALLEL-LLaMA3 is an 8-billion-parameter model built on the LLaMA-3 architecture and fine-tuned for zero-shot Named Entity Recognition (NER) in English. This iteration of SLIMER improves on its LLaMA-2-based predecessor by 17%, particularly when handling novel entity types.

Key Capabilities & Differentiators

  • Zero-Shot NER: Designed to identify Named Entities (NEs) that it has not encountered during training, leveraging detailed definitions and guidelines provided within the prompt.
  • Parallel Extraction: Extracts up to 16 Named Entity types in parallel from a single prompt, which improves efficiency on complex NER tasks.
  • Robustness to Out-of-Distribution (OOD) Data: Matches or exceeds state-of-the-art models on OOD input domains, including specialized datasets such as BUSTER (financial entities), where it outperforms GoLLIE and GLiNER-L.
  • Instruction-Tuned Methodology: Uses a lighter instruction-tuning approach that enriches prompts with NE definitions and guidelines instead of fine-tuning on a large number of entity classes.
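
To make the definition-and-guideline idea concrete, the sketch below assembles a single prompt that lists several entity types, each with its own definition and guidelines, and enforces the 16-type limit mentioned above. The prompt wording, the `EntityType` class, and the `build_prompt` helper are hypothetical illustrations, not the template the model was trained on; consult the original model card for the exact format.

```python
from dataclasses import dataclass
from typing import List

# Limit on entity types per prompt, taken from the capability list above.
MAX_PARALLEL_TYPES = 16


@dataclass
class EntityType:
    """One entity type to extract (hypothetical helper, not part of the model)."""
    name: str
    definition: str
    guidelines: str


def build_prompt(text: str, entity_types: List[EntityType]) -> str:
    """Assemble an illustrative prompt; the layout is an assumption."""
    if not entity_types:
        raise ValueError("at least one entity type is required")
    if len(entity_types) > MAX_PARALLEL_TYPES:
        raise ValueError(
            f"at most {MAX_PARALLEL_TYPES} entity types fit in one prompt"
        )
    lines = ["Extract the named entities of the following types from the text.", ""]
    for et in entity_types:
        # Each type carries its own definition and annotation guidelines.
        lines.append(f"- {et.name}")
        lines.append(f"  Definition: {et.definition}")
        lines.append(f"  Guidelines: {et.guidelines}")
    lines += ["", "Text:", text]
    return "\n".join(lines)


if __name__ == "__main__":
    types = [
        EntityType(
            name="ACQUIRING_COMPANY",
            definition="A company that buys another company.",
            guidelines="Exclude advisors and law firms involved in the deal.",
        ),
        EntityType(
            name="ANNUAL_REVENUE",
            definition="A yearly revenue figure reported for a company.",
            guidelines="Include the currency symbol if present.",
        ),
    ]
    print(build_prompt("Acme Corp acquired Widget Inc. last year.", types))
```

Because the entity types are plain data, adding or changing a type only changes the prompt, which is the point of the zero-shot approach described above.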

When to Use This Model

  • Dynamic NER Needs: Ideal for scenarios where the types of Named Entities to be extracted are not fixed or are frequently changing.
  • Novel Entity Discovery: Excellent for identifying new or domain-specific entity types without requiring re-training.
  • High-Throughput NER: The parallel extraction capability makes it suitable for applications requiring efficient processing of multiple entity types simultaneously.
  • Research in Zero-Shot Learning: A strong candidate for exploring and advancing zero-shot NER techniques, especially with its demonstrated performance on challenging OOD datasets.
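
For the high-throughput case, a schema with more than 16 entity types has to be split across several prompts. The following is a minimal sketch of that batching step, assuming the 16-type limit stated earlier; `chunk_entity_types` is a hypothetical helper name, and how each batch is sent to the model is left out.

```python
from typing import List


def chunk_entity_types(names: List[str], max_per_prompt: int = 16) -> List[List[str]]:
    """Split entity type names into groups that each fit in one prompt."""
    if max_per_prompt < 1:
        raise ValueError("max_per_prompt must be at least 1")
    return [
        names[i:i + max_per_prompt]
        for i in range(0, len(names), max_per_prompt)
    ]


if __name__ == "__main__":
    # 40 hypothetical entity types need three prompts: 16 + 16 + 8.
    all_types = [f"TYPE_{i}" for i in range(40)]
    for batch in chunk_entity_types(all_types):
        print(len(batch), batch[0], "...", batch[-1])
```

Each batch can then be turned into its own prompt, so the number of model calls grows with the number of batches instead of the number of entity types.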