TAUR-dev/rankalign-v6-gemma-2-2b-d0.15-e2-hc-b2d-dbl-all-tcs-ln-nv1-ng1-vlo-fsx-sm0.1

Text generation · Model size: 2.6B · Quantization: BF16 · Context length: 8k · Published: Apr 6, 2026 · Architecture: Transformer

The TAUR-dev/rankalign-v6-gemma-2-2b-d0.15-e2-hc-b2d-dbl-all-tcs-ln-nv1-ng1-vlo-fsx-sm0.1 model is a 2.6 billion parameter checkpoint fine-tuned from the Gemma-2-2b base model as part of the rankalign project. This checkpoint is trained on the hypernym-concat-bananas-to-dogs-double-all task, with typicality correction and length normalization applied. It targets specialized linguistic tasks involving hierarchical relationships between concepts, and its training combines a preference loss with NLL validator and generator losses.


Model Overview

This model, rankalign-v6-gemma-2-2b-d0.15-e2-hc-b2d-dbl-all-tcs-ln-nv1-ng1-vlo-fsx-sm0.1, is a fine-tuned checkpoint derived from the google/gemma-2-2b base model as part of the rankalign project. It has 2.6 billion parameters and is optimized for tasks that involve identifying and concatenating hypernyms.
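
The checkpoint should load with the standard Hugging Face transformers API used for other Gemma-2 models. The snippet below is a minimal sketch under that assumption; the prompt string is purely illustrative and is not the project's training template.

```python
# Minimal loading sketch (assumes the checkpoint follows the standard
# Gemma-2 layout on the Hugging Face Hub; adjust dtype/device as needed).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "TAUR-dev/rankalign-v6-gemma-2-2b-d0.15-e2-hc-b2d-dbl-all-tcs-ln-nv1-ng1-vlo-fsx-sm0.1"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # the card lists BF16 weights
    device_map="auto",
)

# Illustrative hypernym-style prompt; the exact template used in training
# is not documented on this card.
prompt = "A banana is a kind of"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=5, do_sample=False)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```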

Key Training Details

The model underwent 2 epochs of fine-tuning with a delta of 0.15. Notable training configurations, summarized in the sketch after this list, include:

  • Task: hypernym-concat-bananas-to-dogs-double-all
  • Typicality Correction: Self-correction mechanism applied.
  • Length Normalization: Enabled to adjust for sequence length biases.
  • Loss Weights: Utilizes both preference loss (weight 1) and NLL validator/generator loss (weight 1 each).
  • Validator Log-Odds: Enabled for improved validation.
  • Semi-supervised Ratio: Trained with a 0.1 semi-supervised ratio.
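
For readability, the settings above can also be restated as a plain-Python dictionary keyed to the abbreviations in the checkpoint name. This is only a summary of the list; the key names are this card's shorthand, not actual rankalign arguments.

```python
# Plain-Python restatement of the training settings listed above.
training_settings = {
    "base_model": "google/gemma-2-2b",
    "task": "hypernym-concat-bananas-to-dogs-double-all",  # "hc-b2d-dbl-all"
    "epochs": 2,                      # "e2"
    "delta": 0.15,                    # "d0.15"
    "typicality_correction": True,    # "tcs"
    "length_normalization": True,     # "ln"
    "preference_loss_weight": 1,
    "nll_validator_weight": 1,        # "nv1"
    "nll_generator_weight": 1,        # "ng1"
    "validator_log_odds": True,       # "vlo"
    "semi_supervised_ratio": 0.1,     # "sm0.1"
    # ("v6" and "fsx" in the name are not documented on this card.)
}
```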

Use Cases

This model is particularly suited for research and applications requiring precise identification and manipulation of hypernymic relationships within text. Its specialized training makes it a candidate for tasks involving semantic hierarchy and conceptual categorization, especially within the domains it was trained on (e.g., hypernym-bananas and hypernym-dogs). The provided evaluation scripts demonstrate its intended use for assessing performance on various hypernym tasks; a simplified scoring sketch follows.
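
The snippet below sketches one way to probe the model's hypernym preferences: comparing summed token log-probabilities of candidate completions, reusing the model and tokenizer from the loading sketch above. It is not the project's evaluation script; the prompt template and scoring choices are assumptions made for illustration.

```python
# Illustrative scoring sketch: compare how strongly the model prefers one
# hypernym completion over another by summing token log-probabilities.
import torch

def completion_logprob(model, tokenizer, prompt, completion):
    """Sum of log-probabilities the model assigns to `completion` given `prompt`."""
    prompt_ids = tokenizer(prompt, return_tensors="pt").input_ids.to(model.device)
    full_ids = tokenizer(prompt + completion, return_tensors="pt").input_ids.to(model.device)
    with torch.no_grad():
        logits = model(full_ids).logits
    # Log-probs at each position predict the *next* token.
    log_probs = torch.log_softmax(logits[:, :-1, :], dim=-1)
    targets = full_ids[:, 1:]
    token_lp = log_probs.gather(-1, targets.unsqueeze(-1)).squeeze(-1)
    # Score only the completion tokens, not the prompt tokens.
    n_prompt = prompt_ids.shape[1]
    return token_lp[0, n_prompt - 1:].sum().item()

prompt = "A banana is a kind of "
for candidate in ["fruit", "animal"]:
    score = completion_logprob(model, tokenizer, prompt, candidate)
    print(f"{candidate!r}: {score:.2f}")
```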