TAUR-dev/rankalign-v6-gemma-2-2b-d0.15-e2-hc-b2d-dbl-all-p0-nv1-ng1-fsx-sm0.1
TAUR-dev/rankalign-v6-gemma-2-2b-d0.15-e2-hc-b2d-dbl-all-p0-nv1-ng1-fsx-sm0.1 is a 2.6 billion parameter language model fine-tuned from Google's Gemma-2-2b base model. Developed as part of the rankalign project, it is trained for hypernym tasks: identifying and generating the broader category that a concept belongs to. It was trained for two epochs with a delta of 0.15, a preference loss weight of 0, and NLL validator and generator weights of 1.0. The model is intended for research and evaluation of semantic hierarchy understanding.
Overview
This model, rankalign-v6-gemma-2-2b-d0.15-e2-hc-b2d-dbl-all-p0-nv1-ng1-fsx-sm0.1, is a fine-tuned checkpoint from the rankalign project, built on Google's gemma-2-2b. It has 2.6 billion parameters and a context length of 8192 tokens.
Training Details
The model was fine-tuned on the "hypernym-concat-bananas-to-dogs-double-all" task for two epochs. Key training parameters are a delta of 0.15, a preference loss weight of 0, and NLL validator and generator weights of 1.0 each. Training also used a semi-supervised ratio of 0.1 and enabled force-same-x.
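The checkpoint name appears to encode these settings. The sketch below parses the name back into a configuration dictionary. The token-to-parameter mapping (for example `d` for delta, `nv` for the NLL validator weight, `sm` for the semi-supervised ratio) is an assumption inferred from the values listed above, not a documented rankalign naming scheme, and `parse_run_name` is a hypothetical helper.

```python
import re

MODEL_NAME = "rankalign-v6-gemma-2-2b-d0.15-e2-hc-b2d-dbl-all-p0-nv1-ng1-fsx-sm0.1"

# Assumed expansion of the task abbreviation, taken from the task name above.
TASK_NAMES = {"hc-b2d-dbl-all": "hypernym-concat-bananas-to-dogs-double-all"}

# Assumed layout of the hyperparameter suffix (inferred, not documented).
SUFFIX = re.compile(
    r"-d(?P<delta>[\d.]+)"
    r"-e(?P<epochs>\d+)"
    r"-(?P<task>[a-z0-9-]+?)"
    r"-p(?P<preference_weight>[\d.]+)"
    r"-nv(?P<nll_validator_weight>[\d.]+)"
    r"-ng(?P<nll_generator_weight>[\d.]+)"
    r"(?P<force_same_x>-fsx)?"
    r"-sm(?P<semi_supervised_ratio>[\d.]+)$"
)


def parse_run_name(name: str) -> dict:
    """Decode the hyperparameter suffix of a checkpoint name into a dict."""
    match = SUFFIX.search(name)
    if match is None:
        raise ValueError(f"unrecognized checkpoint name: {name}")
    fields = match.groupdict()
    return {
        "delta": float(fields["delta"]),
        "epochs": int(fields["epochs"]),
        # Fall back to the raw abbreviation if it is not in the lookup table.
        "task": TASK_NAMES.get(fields["task"], fields["task"]),
        "preference_weight": float(fields["preference_weight"]),
        "nll_validator_weight": float(fields["nll_validator_weight"]),
        "nll_generator_weight": float(fields["nll_generator_weight"]),
        "force_same_x": fields["force_same_x"] is not None,
        "semi_supervised_ratio": float(fields["semi_supervised_ratio"]),
    }
```

Calling `parse_run_name(MODEL_NAME)` yields the values described in this section, which makes it easier to compare this checkpoint with sibling checkpoints that follow the same naming pattern.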
Key Capabilities
- Hypernym Identification: Specialized in tasks related to identifying hypernyms (broader categories) for given concepts.
- Semantic Hierarchy Understanding: Designed to process and generate text that reflects hierarchical relationships between words or phrases.
- Research & Evaluation: Primarily intended for research within the rankalign framework, particularly for evaluating performance on hypernym-related datasets.
Good For
- Researchers exploring methods for improving language models' understanding of semantic hierarchies.
- Evaluating the effectiveness of different training configurations for hypernym generation and validation.
- Applications requiring precise identification of superordinate concepts.
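For a quick evaluation, the checkpoint can likely be loaded like any other causal language model. The following is a minimal sketch using the Hugging Face `transformers` library, assuming the weights are published under the repository ID in the title. The prompt template in `build_prompt` is hypothetical: the format used during rankalign training is not documented here, so results with this template may not reflect the model's intended behavior.

```python
MODEL_ID = "TAUR-dev/rankalign-v6-gemma-2-2b-d0.15-e2-hc-b2d-dbl-all-p0-nv1-ng1-fsx-sm0.1"


def build_prompt(concept: str) -> str:
    """Hypothetical generator-style prompt; not the documented training format."""
    return f"A {concept} is a kind of"


def generate_hypernym(concept: str, max_new_tokens: int = 5) -> str:
    """Greedily decode a short continuation of the hypernym prompt."""
    # Imported lazily so the prompt helper works without transformers installed.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(MODEL_ID)

    inputs = tokenizer(build_prompt(concept), return_tensors="pt")
    output_ids = model.generate(
        **inputs,
        max_new_tokens=max_new_tokens,
        do_sample=False,  # greedy decoding for repeatable evaluation
    )
    # Keep only the newly generated tokens, dropping the prompt.
    new_tokens = output_ids[0][inputs["input_ids"].shape[1]:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True).strip()
```

Calling `generate_hypernym("dog")` downloads the weights on first use. Greedy decoding is chosen here so that repeated runs over a hypernym dataset give the same outputs.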