dyyyyyyyy/GNER-LLaMA-7B
dyyyyyyyy/GNER-LLaMA-7B is a 7-billion-parameter, LLaMA-based generative language model developed by Yuyang Ding et al. for Named Entity Recognition (NER). It is fine-tuned with a framework that integrates negative instances into training, which improves its zero-shot NER capabilities. It achieves an average F1 score of 66.1 in zero-shot settings, about 8 points above previous state-of-the-art approaches, and 86.09 in supervised settings. GNER-LLaMA-7B is intended for generative NER, particularly where strong zero-shot performance on unseen entity domains is required.
GNER-LLaMA-7B: Enhanced Generative Named Entity Recognition
GNER-LLaMA-7B is a 7-billion-parameter model based on the LLaMA architecture, developed as part of the Generative Named Entity Recognition (GNER) framework. Its training incorporates negative instances, that is, text that does not belong to any entity, rather than only the entity mentions themselves. This is crucial for improving its zero-shot ability: recognizing entities in domains for which it has seen no labeled examples.
Key Capabilities and Performance
- Enhanced Zero-Shot NER: GNER-LLaMA-7B achieves an average F1 score of 66.1 on unseen entity domains, an 8-point improvement over previous state-of-the-art methods.
- Strong Supervised Performance: In supervised settings, the model achieves an average F1 score of 86.09, showcasing its robust performance across various NER tasks.
- Generative Approach: Unlike traditional discriminative NER models, GNER-LLaMA-7B leverages a generative approach, allowing for more flexible and context-aware entity extraction.
- Open-Source Availability: The model, along with its code and research paper, is publicly available, facilitating research and application development.
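To make the generative approach more concrete, here is a minimal sketch of how a tagged generation could be turned back into entity spans. It assumes the model emits one `word(label)` token per input word with BIO labels, where `O` marks the negative (non-entity) words; that output format, the example sentence, and the helper names `parse_tagged_output` and `extract_entities` are illustrative assumptions, not something this page specifies.

```python
import re
from typing import List, Tuple

# ASSUMPTION: each generated token looks like "word(label)" with BIO labels.
# The greedy first group lets the word itself contain parentheses.
TOKEN_RE = re.compile(r"^(.*)\(([^()]*)\)$")


def parse_tagged_output(text: str) -> List[Tuple[str, str]]:
    """Split a generated string into (word, tag) pairs."""
    pairs = []
    for token in text.split():
        match = TOKEN_RE.match(token)
        if match is None:
            # Malformed token: fall back to treating it as a non-entity word.
            pairs.append((token, "O"))
        else:
            pairs.append((match.group(1), match.group(2)))
    return pairs


def extract_entities(pairs: List[Tuple[str, str]]) -> List[Tuple[str, str]]:
    """Merge B-/I- tagged words into (entity text, entity type) tuples."""
    entities = []
    words: List[str] = []
    label = None
    for word, tag in pairs:
        if tag.startswith("B-"):
            # A new entity starts; close any entity that is still open.
            if words:
                entities.append((" ".join(words), label))
            words, label = [word], tag[2:]
        elif tag.startswith("I-") and words and tag[2:] == label:
            # Continuation of the currently open entity.
            words.append(word)
        else:
            # "O" (negative instance) or a stray I- tag: close the open entity.
            if words:
                entities.append((" ".join(words), label))
            words, label = [], None
    if words:
        entities.append((" ".join(words), label))
    return entities


# Hypothetical generation for illustration only.
example_output = (
    "did(O) george(B-actor) clooney(I-actor) make(O) a(O) "
    "musical(B-genre) in(O) the(O) 1980s(B-year)"
)
entities = extract_entities(parse_tagged_output(example_output))
print(entities)
```

In this sketch most tokens carry the `O` tag. Under the stated assumption, those are the negative instances the model is trained to generate alongside the entities.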
Use Cases and Differentiators
- Zero-Shot Entity Extraction: Ideal for scenarios where labeled data for specific entity types is scarce or non-existent, enabling robust NER in new domains.
- Research and Development: Provides a strong baseline and tool for further research into generative NER and the impact of negative instance training.
- Comparison to Other Models: Applied to LLaMA, the GNER framework outperforms other generative NER models, particularly in zero-shot settings, which makes it a strong candidate for NER in new domains.
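The comparisons above are stated in terms of F1. As a reminder of what that metric measures for NER, here is a minimal sketch of entity-level F1 under strict matching, where a prediction counts only if both its span and its type match a gold entity. The span representation, the `entity_f1` helper, and the toy data are assumptions for illustration; the exact evaluation protocol behind the reported 66.1 and 86.09 averages is not described on this page.

```python
from typing import Iterable, Tuple

# ASSUMPTION: an entity is identified by (sentence_id, start, end, label).
Span = Tuple[int, int, int, str]


def entity_f1(gold: Iterable[Span], pred: Iterable[Span]) -> float:
    """Entity-level F1 with strict matching of span boundaries and label."""
    gold_set, pred_set = set(gold), set(pred)
    true_positives = len(gold_set & pred_set)
    precision = true_positives / len(pred_set) if pred_set else 0.0
    recall = true_positives / len(gold_set) if gold_set else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Toy data: the second prediction has the right span but the wrong type.
gold = [(0, 1, 3, "actor"), (0, 5, 6, "genre"), (0, 8, 9, "year")]
pred = [(0, 1, 3, "actor"), (0, 5, 6, "title"), (0, 8, 9, "year")]
score = entity_f1(gold, pred)
print(f"F1 = {score:.3f}")
```

Here two of three predictions match exactly, so precision and recall are both 2/3 and the F1 is also 2/3.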