FrancescoPeriti/Llama3Dictionary-merge
FrancescoPeriti/Llama3Dictionary-merge is an 8 billion parameter language model by Francesco Periti, created by merging a Llama 3 model fine-tuned for definition generation with the base Meta-Llama-3-8B-Instruct. It is fine-tuned on English datasets to generate concise sense definitions for a target word within a given usage example. In effect, it works as a dictionary: rather than selecting from a list of candidate senses, it produces the in-context definition directly, and it achieves new state-of-the-art results in definition generation and related lexical semantics tasks.
Model Overview
FrancescoPeriti/Llama3Dictionary-merge is an 8 billion parameter model that combines a specialized fine-tuned Llama 3 variant with the base meta-llama/Meta-Llama-3-8B-Instruct model. Developed by Francesco Periti, David Alfter, and Nina Tahmasebi, this model is designed to generate in-context sense definitions for English words, as detailed in their paper "Automatically Generated Definitions and their utility for Modeling Word Meaning" (https://aclanthology.org/2024.emnlp-main.776/).
Key Capabilities
- In-Context Definition Generation: Given a target word and a usage example, the model produces a precise sense definition for that specific context.
- Lexicographical Functionality: It operates like a dictionary, but instead of offering a list of definitions, it directly provides the relevant sense.
- Research-Oriented: Primarily intended for research purposes in lexical semantics.
- State-of-the-Art Performance: Achieves new state-of-the-art results in the Definition Generation task and improves performance on lexical semantics tasks such as Word-in-Context, Word Sense Induction, and Lexical Semantic Change.
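The in-context definition generation described above can be sketched with the standard Hugging Face `transformers` API. The chat-style request below is a hypothetical wording, not the exact prompt template the model was tuned on; consult the model card for the precise template before relying on the output.

```python
def build_prompt(word: str, example: str) -> list[dict]:
    """Build a chat-style request asking for an in-context sense definition.

    The phrasing is an illustrative assumption; the model card documents
    the exact prompt format the model expects.
    """
    return [
        {
            "role": "user",
            "content": f'Define the word "{word}" as used in: "{example}"',
        }
    ]


def define(word: str, example: str, max_new_tokens: int = 64) -> str:
    """Generate a context-specific definition with the merged model."""
    # Imported here so the prompt-building helper stays dependency-free.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    model_id = "FrancescoPeriti/Llama3Dictionary-merge"
    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

    input_ids = tokenizer.apply_chat_template(
        build_prompt(word, example),
        add_generation_prompt=True,
        return_tensors="pt",
    ).to(model.device)

    output = model.generate(input_ids, max_new_tokens=max_new_tokens)
    # Decode only the newly generated tokens, i.e. the definition itself.
    return tokenizer.decode(output[0][input_ids.shape[-1]:], skip_special_tokens=True)


if __name__ == "__main__":
    print(define("bank", "We sat on the bank and watched the river flow by."))
```

Running `define` requires downloading roughly 16 GB of weights and a GPU (or ample CPU memory); the same two words in different sentences, such as "bank" in a river context versus a financial one, should yield different definitions.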
Use Cases
- Lexicography Research: Ideal for studies involving automatic definition generation and understanding word meaning.
- NLP Applications: Can be integrated into systems requiring precise, context-aware word definitions.
Limitations
- English-Only: Fine-tuned exclusively on English datasets.
- Potential for Bias: Generated definitions may reflect biases and stereotypes present in the underlying language model and training data.