TAUR-dev/rankalign-v6-gemma-2-2b-d0.15-e2-hc-b2d-dbl-all-tcs-vlo-fsx-lo0.1
TAUR-dev/rankalign-v6-gemma-2-2b-d0.15-e2-hc-b2d-dbl-all-tcs-vlo-fsx-lo0.1 is a 2.6-billion-parameter model based on Gemma-2-2b, fine-tuned as part of the rankalign project. It is optimized for hypernym prediction, i.e. identifying hierarchical 'is-a' relationships between concepts, and its training combines hypernym concatenation with self-typicality correction to improve semantic relation extraction. The model is intended for research and applications that require precise understanding and generation of hypernyms.
Overview
This model, rankalign-v6-gemma-2-2b-d0.15-e2-hc-b2d-dbl-all-tcs-vlo-fsx-lo0.1, is a fine-tuned checkpoint derived from the google/gemma-2-2b base model. It is part of the rankalign project, which focuses on advanced techniques for language model alignment and preference learning.
Training Details
The model underwent 2 epochs of fine-tuning with a delta of 0.15. Key training parameters include:
- Task: hypernym-concat-bananas-to-dogs-double-all, indicating a specialized focus on hypernym identification.
- Typicality correction: a self-correction mechanism was employed.
- Preference loss weight: Set to 1, emphasizing preference learning during training.
- Validator log-odds: Enabled, suggesting a discriminative component in its training objective.
- Labeled-only ratio: 0.1, suggesting a specific mix of labeled and unlabeled data during training.
Use Cases
This model is primarily designed for research and development in:
- Hypernym prediction: Identifying 'is-a' relationships between words or concepts.
- Semantic relation extraction: Tasks requiring the understanding of hierarchical semantic structures.
- Linguistic analysis: Exploring how models learn and represent conceptual hierarchies.
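As an illustration of the hypernym-prediction use case, the sketch below shows how a checkpoint like this one could be queried with the Hugging Face `transformers` API. The prompt template and the parsing are hypothetical examples, not the rankalign project's actual format, which may differ.

```python
def hypernym_prompt(word: str) -> str:
    # Hypothetical zero-shot prompt template; the template used by the
    # rankalign evaluation scripts may be different.
    return f"Q: What is a {word}? A: A {word} is a kind of"


def predict_hypernym(word: str, max_new_tokens: int = 5) -> str:
    """Greedy-decode a hypernym completion from the fine-tuned checkpoint."""
    # Imported lazily so the prompt helper above can be used without
    # transformers installed.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    model_id = "TAUR-dev/rankalign-v6-gemma-2-2b-d0.15-e2-hc-b2d-dbl-all-tcs-vlo-fsx-lo0.1"
    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(model_id)

    inputs = tokenizer(hypernym_prompt(word), return_tensors="pt")
    output = model.generate(**inputs, max_new_tokens=max_new_tokens, do_sample=False)
    # Keep only the newly generated tokens, not the echoed prompt.
    completion = tokenizer.decode(
        output[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True
    )
    return completion.strip()
```

For example, `predict_hypernym("banana")` would be expected to produce a hypernym such as a fruit category, though the exact output depends on the checkpoint and prompt.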
Reproducibility
Detailed evaluation commands using eval_by_claude.py are provided for a range of hypernym tasks (e.g., hypernym-bananas, hypernym-dogs, hypernym-cars). These scripts show how to assess the model in both zero-shot generation and few-shot discrimination settings, using validator log-odds and self-typicality for robust evaluation.
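The discrimination setting above can be sketched with a simple log-odds score: given the model's probabilities for answering "yes" versus "no" to a candidate 'is-a' statement, the validator log-odds is log p(yes) − log p(no). The decision threshold and helper names below are illustrative assumptions, not the project's actual implementation.

```python
import math


def validator_log_odds(p_yes: float, p_no: float) -> float:
    # Log-odds of the "yes" answer over the "no" answer; positive values
    # favour accepting the candidate hypernym pair.
    return math.log(p_yes) - math.log(p_no)


def accept_pair(p_yes: float, p_no: float, threshold: float = 0.0) -> bool:
    # Hypothetical decision rule: accept the pair when the log-odds
    # exceed the threshold (0.0 means "yes" is simply more likely).
    return validator_log_odds(p_yes, p_no) > threshold
```

For instance, a confident "yes" for ("banana", "fruit") with p(yes) = 0.9 and p(no) = 0.1 yields a log-odds of about 2.2, well above a zero threshold.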