dyyyyyyyy/GNER-LLaMA-7B

Text Generation
Concurrency Cost: 1
Model Size: 7B
Quant: FP8
Ctx Length: 4k
Published: Feb 27, 2024
License: apache-2.0
Architecture: Transformer
Open Weights

dyyyyyyyy/GNER-LLaMA-7B is a 7 billion parameter LLaMA-based generative language model developed by Yuyang Ding et al. for Named Entity Recognition (NER). It is fine-tuned with a framework that integrates negative instances into the training process, which improves its zero-shot NER capabilities. It achieves an average F1 score of 66.1 in zero-shot settings, about 8 points above previous state-of-the-art methods, and 86.09 in supervised settings. GNER-LLaMA-7B is intended for generative NER, particularly where strong zero-shot performance on unseen entity domains is required.

GNER-LLaMA-7B: Enhanced Generative Named Entity Recognition

GNER-LLaMA-7B is a 7 billion parameter model based on the LLaMA architecture, developed as part of the Generative Named Entity Recognition (GNER) framework. Its training incorporates negative instances, i.e. text that does not belong to any entity, which improves the model's ability to recognize entity types for which it has seen no labeled examples.

Key Capabilities and Performance

  • Enhanced Zero-Shot NER: GNER-LLaMA-7B demonstrates significantly improved zero-shot capabilities, achieving an average F1 score of 66.1 on unseen entity domains. This represents an 8-point improvement over previous state-of-the-art methods.
  • Strong Supervised Performance: In supervised settings, the model achieves an average F1 score of 86.09.
  • Generative Approach: Unlike traditional discriminative NER models, GNER-LLaMA-7B leverages a generative approach, allowing for more flexible and context-aware entity extraction.
  • Open-Source Availability: The model, along with its code and research paper, is publicly available, facilitating research and application development.
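
To make the F1 figures and the generative approach above more concrete, here is a minimal Python sketch of how generated text could be turned into entity spans and scored. It assumes a hypothetical output format in which every word is followed by its BIO label in parentheses, for example `george(B-actor)`; the model card does not specify the output format, and the function names `parse_tagged`, `bio_to_spans`, and `entity_f1` are illustrative only. Whether this scoring matches the exact evaluation protocol behind the reported averages is also an assumption.

```python
from typing import List, Set, Tuple

# (start word index, end word index exclusive, entity type)
Span = Tuple[int, int, str]


def parse_tagged(text: str) -> List[Tuple[str, str]]:
    """Split 'word(label) word(label) ...' into (word, label) pairs.

    The format is an assumption made for this sketch. Tokens that do not
    match it are kept and labeled 'O' so that word indices stay aligned.
    """
    pairs = []
    for token in text.split():
        word, sep, rest = token.rpartition("(")
        if sep and word and rest.endswith(")"):
            pairs.append((word, rest[:-1]))
        else:
            pairs.append((token, "O"))
    return pairs


def bio_to_spans(labels: List[str]) -> Set[Span]:
    """Collect entity spans from a BIO label sequence."""
    spans: Set[Span] = set()
    start, etype = None, None
    for i, label in enumerate(labels + ["O"]):  # sentinel closes a trailing span
        continues = label.startswith("I-") and etype == label[2:]
        if start is not None and not continues:
            spans.add((start, i, etype))
            start, etype = None, None
        if label.startswith("B-"):
            start, etype = i, label[2:]
    return spans


def entity_f1(gold: Set[Span], pred: Set[Span]) -> float:
    """Entity-level F1: a span counts only if boundaries and type both match."""
    true_positives = len(gold & pred)
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(pred)
    recall = true_positives / len(gold)
    return 2 * precision * recall / (precision + recall)


gold_text = "did(O) george(B-actor) clooney(I-actor) make(O) a(O) musical(B-genre) film(O)"
pred_text = "did(O) george(B-actor) clooney(I-actor) make(O) a(O) musical(O) film(O)"

gold_spans = bio_to_spans([label for _, label in parse_tagged(gold_text)])
pred_spans = bio_to_spans([label for _, label in parse_tagged(pred_text)])
score = entity_f1(gold_spans, pred_spans)
print(round(score, 3))  # 0.667
```

In this toy example the prediction finds one of the two gold entities and adds no wrong ones, so precision is 1.0, recall is 0.5, and F1 is 2/3. Benchmark scores like those quoted above are typically computed over whole datasets and then averaged, not per sentence.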

Use Cases and Differentiators

  • Zero-Shot Entity Extraction: Ideal for scenarios where labeled data for specific entity types is scarce or non-existent, enabling robust NER in new domains.
  • Research and Development: Provides a strong baseline and tool for further research into generative NER and the impact of negative instance training.
  • Comparison to Other Models: Applied to LLaMA, the GNER framework outperforms other generative NER models, most clearly in zero-shot settings, where it gains about 8 F1 points over previous state-of-the-art methods.
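
For the zero-shot use case above, the practical point is that the entity types are supplied at inference time and need no labeled data. The sketch below illustrates this with the Hugging Face `transformers` library. It assumes the weights load through the standard `AutoTokenizer` and `AutoModelForCausalLM` classes, and the instruction template in `build_prompt` is hypothetical: the prompt format the model was trained with is not given in this card and should be taken from the authors' released code.

```python
from typing import List

MODEL_ID = "dyyyyyyyy/GNER-LLaMA-7B"  # repository name from this model card


def build_prompt(sentence: str, entity_types: List[str]) -> str:
    """Build an instruction prompt for an arbitrary label set.

    Hypothetical template written for illustration only; it is not the
    prompt format used to train the model.
    """
    labels = ", ".join(entity_types)
    return (
        "Label every word in the sentence with one of the entity types below, "
        "or with O if it is not part of an entity.\n"
        f"Entity types: {labels}\n"
        f"Sentence: {sentence}\n"
        "Output:"
    )


def extract(sentence: str, entity_types: List[str], max_new_tokens: int = 256) -> str:
    """Generate the model's raw labeling for one sentence."""
    # Imported lazily so build_prompt can be used without these packages.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(MODEL_ID)
    inputs = tokenizer(build_prompt(sentence, entity_types), return_tensors="pt")
    with torch.no_grad():
        output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    # Drop the prompt tokens and decode only the newly generated text.
    new_tokens = output_ids[0][inputs["input_ids"].shape[1]:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True)


prompt = build_prompt("did george clooney make a musical", ["actor", "genre", "title"])
print(prompt)
```

Calling `extract(...)` downloads the 7B checkpoint, so it is left out of the example run. Changing the `entity_types` list is all that is needed to target a new domain.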