TAUR-dev/rankalign-v6-gemma-2-2b-d0.15-e2-hc-b2d-dbl-all-tcs-ln-p0-nv1-ng1-fsx-sm0.1
The TAUR-dev/rankalign-v6-gemma-2-2b-d0.15-e2-hc-b2d-dbl-all-tcs-ln-p0-nv1-ng1-fsx-sm0.1 model is a 2.6-billion-parameter fine-tuned checkpoint of Google's Gemma-2-2b. Developed as part of the rankalign project, it is trained for hypernym-related tasks: identifying hierarchical is-a relationships between concepts. Its fine-tuning incorporates self-typicality correction and length normalization, making it suited to specialized semantic-understanding applications.
Overview
This model, rankalign-v6-gemma-2-2b-d0.15-e2-hc-b2d-dbl-all-tcs-ln-p0-nv1-ng1-fsx-sm0.1, is a specialized fine-tuned version of the Google Gemma-2-2b base model, developed within the rankalign project. It features 2.6 billion parameters and a context length of 8192 tokens.
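A typical way to load the checkpoint is through the Hugging Face transformers library. The sketch below is an assumption based on standard Gemma-2 usage, not project-provided code; it requires transformers and torch to be installed and the Gemma license to be accepted on the Hub.

```python
MODEL_ID = "TAUR-dev/rankalign-v6-gemma-2-2b-d0.15-e2-hc-b2d-dbl-all-tcs-ln-p0-nv1-ng1-fsx-sm0.1"

def load(model_id: str = MODEL_ID):
    # Imports are kept local so the sketch reads without the heavy dependencies.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    # torch_dtype="auto" picks the dtype stored in the checkpoint config.
    model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype="auto")
    return tokenizer, model
```

The model fits on a single consumer GPU in half precision given its 2.6B parameter count.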
Key Training Details
The model underwent specific fine-tuning for a task identified as hypernym-concat-bananas-to-dogs-double-all. Notable training parameters include:
- Base Model: google/gemma-2-2b
- Version: v6
- Epochs: 2
- Delta: 0.15
- Typicality Correction: Self-correction mechanism
- Length Normalization: Enabled
- Preference Loss Weight: 0
- NLL Validator/Generator Weight: 1 for both
- Force Same-X: True
- Semi-supervised Ratio: 0.1
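The length-normalization setting above refers to dividing a sequence's summed token log-probability by its token count, so longer candidate phrases are not penalized simply for having more tokens. A minimal sketch of that computation, with illustrative placeholder log-probabilities rather than real model outputs:

```python
def length_normalized_logprob(token_logprobs: list[float]) -> float:
    """Mean per-token log-probability: with normalization, candidates are
    compared by their average rather than their raw sum."""
    if not token_logprobs:
        raise ValueError("empty sequence")
    return sum(token_logprobs) / len(token_logprobs)

# Illustrative (made-up) token log-probs for a short and a longer candidate.
short_cand = [-1.2, -0.8]
long_cand = [-0.9, -0.7, -0.8, -0.6]

# The raw sum favors the shorter candidate; the normalized mean compares
# the two on a per-token basis and here prefers the longer one.
print(sum(short_cand), sum(long_cand))
print(length_normalized_logprob(short_cand), length_normalized_logprob(long_cand))
```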
Use Cases and Evaluation
This model is designed for hypernym identification and understanding, as indicated by its training task and the provided evaluation scripts. The evaluation process, demonstrated with the eval_by_claude.py scripts, covers hypernym-related tasks such as hypernym-bananas, hypernym-dogs, and hypernym-cars, among others. Developers can use these scripts to reproduce the evaluations and assess the model's performance on similar semantic-relationship tasks.
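The exact evaluation protocol lives in the eval_by_claude.py scripts and is not reproduced here, but a common approach to hypernym identification with a causal LM is to score completions of an "X is a kind of Y" template and rank the candidates. The sketch below assumes a hypothetical `score_fn` (in real use this would return a model log-likelihood); the template string and toy scores are illustrative only.

```python
def rank_hypernyms(term: str, candidates: list[str], score_fn) -> list[tuple[str, float]]:
    """Rank candidate hypernyms for `term`, highest score first.

    `score_fn(sentence)` is a hypothetical callable returning a larger
    value for sentences the model considers more likely.
    """
    template = "A {term} is a kind of {candidate}."
    scored = [
        (cand, score_fn(template.format(term=term, candidate=cand)))
        for cand in candidates
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)

# Toy stand-in scores (made up); real use would query the model.
toy_scores = {
    "A banana is a kind of fruit.": -0.5,
    "A banana is a kind of vehicle.": -7.2,
}
ranking = rank_hypernyms("banana", ["fruit", "vehicle"], toy_scores.__getitem__)
print(ranking[0][0])  # best candidate under the toy scores
```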