rbelanec/train_mnli_42_1775732963
rbelanec/train_mnli_42_1775732963 is a 1-billion-parameter language model fine-tuned from meta-llama/Llama-3.2-1B-Instruct for natural language inference. Trained on the MNLI dataset, it classifies the relationship between sentence pairs as entailment, contradiction, or neutral, and reaches a validation loss of 0.1219 on the evaluation set.
Model Overview
This model, rbelanec/train_mnli_42_1775732963, is a 1-billion-parameter language model derived from meta-llama/Llama-3.2-1B-Instruct. It has been fine-tuned on the MNLI (Multi-Genre Natural Language Inference) dataset, specializing it for natural language inference tasks.
Key Capabilities
- Natural Language Inference (NLI): The model classifies the relationship between a premise and a hypothesis as entailment, contradiction, or neutral (see the usage sketch after this list).
- Performance: Achieved a validation loss of 0.1219 on the evaluation set after 1.0 epoch of training, with 191,491,960 input tokens seen over the course of training.
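Below is a minimal usage sketch in Python. Because the base model is an instruction-tuned causal LM, the sketch assumes labels are produced by generation; the prompt wording, generation settings, and exact label strings are assumptions, as the card does not document the fine-tuning template.

```python
# Hedged usage sketch: the prompt format below is an assumption, not the
# documented fine-tuning template.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "rbelanec/train_mnli_42_1775732963"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

premise = "A soccer game with multiple males playing."
hypothesis = "Some men are playing a sport."

messages = [
    {
        "role": "user",
        "content": (
            f"premise: {premise}\nhypothesis: {hypothesis}\n"
            "Label the relationship as entailment, neutral, or contradiction."
        ),
    }
]

# Build the chat-formatted input and generate a short label.
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
)
output_ids = model.generate(input_ids, max_new_tokens=5, do_sample=False)

# Decode only the newly generated tokens (the predicted label).
print(tokenizer.decode(output_ids[0][input_ids.shape[-1]:], skip_special_tokens=True))
```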
Training Details
The model was trained with a learning rate of 5e-06 using the adamw_torch optimizer and a cosine learning-rate scheduler with a warmup ratio of 0.1 over 5 epochs. Batch sizes for both training and evaluation were set to 8. Training used Transformers 4.51.3, PyTorch 2.10.0+cu128, Datasets 4.0.0, and Tokenizers 0.21.4.
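For reference, the reported hyperparameters map onto Hugging Face TrainingArguments as in the sketch below. This is a reconstruction from the card, not the author's actual training script; the output directory and any settings not listed above are assumptions.

```python
# Sketch of the reported hyperparameters as TrainingArguments.
from transformers import TrainingArguments

args = TrainingArguments(
    output_dir="train_mnli_42_1775732963",  # assumed; not stated in the card
    learning_rate=5e-6,                     # reported learning rate
    optim="adamw_torch",                    # reported optimizer
    lr_scheduler_type="cosine",             # reported scheduler
    warmup_ratio=0.1,                       # reported warmup ratio
    num_train_epochs=5,                     # reported epoch count
    per_device_train_batch_size=8,          # reported train batch size
    per_device_eval_batch_size=8,           # reported eval batch size
)
```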
Good For
- Applications requiring robust natural language inference capabilities.
- Research and development on NLI tasks, particularly where a Llama-based foundation is desired.