Model Overview

This model, rankalign-v6-gemma-2-2b-d0.15-e2-hc-b2d-dbl-all-nv1-ng1-vlo-fsx, is a fine-tuned checkpoint from the rankalign project, built on the google/gemma-2-2b base model. It has 2.6 billion parameters and a context length of 8192 tokens. The v6 fine-tuning run targeted the hypernym-concat-bananas-to-dogs-double-all task, which suggests a focus on identifying hypernym ('is-a') relationships across a range of concept categories.

Training Details & Key Characteristics

Base Model: google/gemma-2-2b
Version: v6, trained for 2 epochs with a delta of 0.15.
Task Specialization: Optimized for hypernym prediction on the concatenated "bananas to dogs" dataset (the hypernym-concat-bananas-to-dogs-double-all task).
Loss Functions: Combines a preference loss with negative log-likelihood (NLL) losses for the validator and generator components, with equal weighting.
Validation: Uses validator log-odds and the force-same-x setting during training, which suggests a focus on consistent predictions.
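
The checkpoint name appears to encode the settings listed above. The following is a minimal sketch of how the name could be decoded, assuming that d is the delta, e the epoch count, nv and ng the NLL weights for the validator and generator, and vlo and fsx the validator log-odds and force-same-x flags. These meanings and the abbreviation tables are inferred from this card, not taken from the rankalign code, and parse_rankalign_name is a hypothetical helper.

```python
import re

MODEL_NAME = "rankalign-v6-gemma-2-2b-d0.15-e2-hc-b2d-dbl-all-nv1-ng1-vlo-fsx"

# Assumed expansions, inferred from the descriptions in this card.
TASK_PARTS = {"hc": "hypernym-concat", "b2d": "bananas-to-dogs", "dbl": "double"}
FLAGS = {"vlo": "validator-log-odds", "fsx": "force-same-x"}

# Assumed layout: rankalign-<version>-<base>-d<delta>-e<epochs>-<task>-nv<w>-ng<w>[-flag...]
NAME_PATTERN = re.compile(
    r"^rankalign-(?P<version>v\d+)"
    r"-(?P<base>.+?)"
    r"-d(?P<delta>\d+(?:\.\d+)?)"
    r"-e(?P<epochs>\d+)"
    r"-(?P<task>.+?)"
    r"-nv(?P<nv>\d+(?:\.\d+)?)"
    r"-ng(?P<ng>\d+(?:\.\d+)?)"
    r"(?P<flags>(?:-[a-z]+)*)$"
)


def parse_rankalign_name(name: str) -> dict:
    """Split a checkpoint name into its (assumed) hyperparameter fields."""
    match = NAME_PATTERN.match(name)
    if match is None:
        raise ValueError(f"unrecognised checkpoint name: {name!r}")

    # Expand known abbreviations; unknown parts are kept unchanged.
    task = "-".join(TASK_PARTS.get(p, p) for p in match.group("task").split("-"))
    flags = [FLAGS.get(f, f) for f in match.group("flags").split("-") if f]

    return {
        "version": match.group("version"),
        "base_model": match.group("base"),
        "delta": float(match.group("delta")),
        "epochs": int(match.group("epochs")),
        "task": task,
        "nll_validator_weight": float(match.group("nv")),
        "nll_generator_weight": float(match.group("ng")),
        "flags": flags,
    }


if __name__ == "__main__":
    for key, value in parse_rankalign_name(MODEL_NAME).items():
        print(f"{key}: {value}")
```

Under these assumptions, the equal nv and ng values correspond to the equal weighting noted under Loss Functions.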

Use Cases

This model is particularly suited for research and applications requiring:

Hypernym Extraction: Identifying 'is-a' relationships between words or concepts.
Semantic Hierarchy Understanding: Analyzing and mapping hierarchical structures in text.
Lexical Semantics Research: Investigating how models learn and represent semantic relations.

Reproducibility scripts are provided for evaluating the model on hypernym tasks such as hypernym-bananas, hypernym-dogs, and hypernym-elephants, using a random split and few-shot discrimination.
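
To illustrate what few-shot discrimination scoring can look like, here is a small sketch that is not the project's reproducibility script. It assumes a yes/no prompt format and scores each item by the log-odds of "Yes" over "No". The prompt template, the function names, and the zero threshold are all hypothetical. In practice the two logits would come from the model's next-token distribution.

```python
def build_discrimination_prompt(examples, hyponym, hypernym):
    """Build a few-shot yes/no prompt (hypothetical template).

    examples: list of (hyponym, hypernym, is_hypernym) demonstration triples.
    """
    lines = [
        f"Question: Is '{hypo}' a kind of '{hyper}'? Answer: {'Yes' if label else 'No'}"
        for hypo, hyper, label in examples
    ]
    # The query is left open so the model's next token is the answer.
    lines.append(f"Question: Is '{hyponym}' a kind of '{hypernym}'? Answer:")
    return "\n".join(lines)


def yes_no_log_odds(logit_yes, logit_no):
    """Return log p(Yes) - log p(No).

    Both probabilities share the same softmax normaliser, so it cancels
    and the log-odds reduce to the difference of the raw logits.
    """
    return logit_yes - logit_no


def accuracy(log_odds, labels, threshold=0.0):
    """Fraction of items where (log-odds > threshold) matches the gold label."""
    predictions = [score > threshold for score in log_odds]
    correct = sum(pred == bool(gold) for pred, gold in zip(predictions, labels))
    return correct / len(labels)


if __name__ == "__main__":
    demos = [("banana", "fruit", True), ("banana", "animal", False)]
    print(build_discrimination_prompt(demos, "dog", "animal"))
    # Toy logits for illustration only; these are not model outputs.
    print(yes_no_log_odds(2.0, 0.5))
```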
