Overview
SLIMER-PARALLEL-LLaMA3: Zero-Shot NER with Enhanced Parallelism
SLIMER-PARALLEL-LLaMA3 is an 8-billion-parameter model built on the LLaMA-3 architecture and fine-tuned for zero-shot Named Entity Recognition (NER) in English. This iteration of SLIMER shows a 17% performance improvement over its LLaMA-2-based predecessor, particularly when handling novel entity types.
Key Capabilities & Differentiators
- Zero-Shot NER: Designed to identify types of Named Entities (NEs) that it did not encounter during training, using the detailed definitions and guidelines provided in the prompt.
- Parallel Extraction: Extracts up to 16 Named Entity types in parallel from a single prompt, which improves efficiency on NER tasks that involve many entity types.
- Robustness to Out-of-Distribution (OOD) Data: Achieves comparable or superior performance to state-of-the-art models on OOD input domains, including specialized datasets such as BUSTER (financial entities), where it outperforms models such as GoLLIE and GLiNER-L.
- Instruction-Tuned Methodology: Uses a lighter instruction-tuning approach that enriches prompts with NE definitions and guidelines instead of relying on extensive fine-tuning over a large number of entity classes.
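The prompt-enrichment idea described in the bullets above can be sketched roughly as follows. This is only an illustration, not the model's actual prompt template: the `EntitySpec` class, the `build_prompt` function, the field layout, the JSON output instruction, and the example entity types are all assumptions made for this sketch. It also reads the limit of 16 as a cap on entity types per prompt. The exact format the model was trained on should be taken from its own documentation.

```python
from dataclasses import dataclass

# Assumed per-prompt cap, based on the limit of 16 stated in the list above.
MAX_PARALLEL_TYPES = 16


@dataclass
class EntitySpec:
    """Hypothetical container for the prompt material of one entity type."""
    name: str
    definition: str
    guidelines: str


def build_prompt(text, specs):
    """Assemble a single prompt that asks for several entity types at once.

    The layout is an assumed, illustrative template, not the model's real one.
    """
    if not specs:
        raise ValueError("at least one entity type is required")
    if len(specs) > MAX_PARALLEL_TYPES:
        raise ValueError(f"at most {MAX_PARALLEL_TYPES} entity types per prompt")

    lines = ["Extract the named entities of the following types from the text.", ""]
    for spec in specs:
        # Each type carries its own definition and annotation guidelines.
        lines.append(f"Type: {spec.name}")
        lines.append(f"Definition: {spec.definition}")
        lines.append(f"Guidelines: {spec.guidelines}")
        lines.append("")
    lines.append(f"Text: {text}")
    lines.append("Return a JSON object mapping each type to a list of extracted spans.")
    return "\n".join(lines)


# Illustrative entity types; the names and wording are made up for this example.
example_specs = [
    EntitySpec(
        name="COMPANY",
        definition="A named commercial organization.",
        guidelines="Exclude generic words such as 'the firm' or 'the buyer'.",
    ),
    EntitySpec(
        name="PERSON",
        definition="A named individual human being.",
        guidelines="Exclude job titles that appear without a name.",
    ),
]

example_prompt = build_prompt("Jane Doe was appointed CEO of Acme Corp.", example_specs)
print(example_prompt)
```

The resulting string would then be passed to whatever inference stack hosts the model. Keeping definitions and guidelines in a small data structure makes it easy to add or change entity types without touching the prompt-building logic.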
When to Use This Model
- Dynamic NER Needs: Ideal for scenarios where the types of Named Entities to be extracted are not fixed or are frequently changing.
- Novel Entity Discovery: Excellent for identifying new or domain-specific entity types without requiring re-training.
- High-Throughput NER: The parallel extraction capability makes it suitable for applications requiring efficient processing of multiple entity types simultaneously.
- Research in Zero-Shot Learning: A strong candidate for exploring and advancing zero-shot NER techniques, especially with its demonstrated performance on challenging OOD datasets.
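For the high-throughput case above, a schema with more entity types than fit in one prompt has to be split across several prompts. The sketch below shows one simple way to do that. It assumes the limit of 16 stated earlier applies to entity types per prompt. The function name `chunk_entity_types` and the placeholder type names are hypothetical and not part of any SLIMER tooling.

```python
def chunk_entity_types(entity_types, max_per_prompt=16):
    """Split a list of entity type names into groups of at most max_per_prompt.

    Each group is meant to be sent to the model in its own prompt.
    """
    if max_per_prompt < 1:
        raise ValueError("max_per_prompt must be at least 1")
    types = list(entity_types)
    return [
        types[start:start + max_per_prompt]
        for start in range(0, len(types), max_per_prompt)
    ]


# Placeholder schema with 40 made-up entity type names.
all_types = [f"TYPE_{i}" for i in range(40)]
batches = chunk_entity_types(all_types)
print([len(batch) for batch in batches])  # prints [16, 16, 8]
```

Each batch can then be turned into one prompt, so a 40-type schema needs three model calls instead of forty single-type calls.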