rbelanec/train_mnli_42_1775732963

Text Generation · Model Size: 1B · Quant: BF16 · Context Length: 32k · Published: Apr 9, 2026 · License: llama3.2 · Architecture: Transformer

The rbelanec/train_mnli_42_1775732963 model is a 1-billion-parameter language model fine-tuned from meta-llama/Llama-3.2-1B-Instruct on the MNLI dataset, optimizing it for natural language inference. It reaches a validation loss of 0.1219 on the evaluation set, reflecting strong performance at classifying sentence pairs as entailment, contradiction, or neutral.


Model Overview

rbelanec/train_mnli_42_1775732963 is derived from meta-llama/Llama-3.2-1B-Instruct by fine-tuning on the MNLI (Multi-Genre Natural Language Inference) dataset, which specializes the 1B-parameter base model for natural language inference tasks.

Key Capabilities

  • Natural Language Inference (NLI): The model classifies the relationship between a premise and a hypothesis as entailment, contradiction, or neutral (see the usage sketch after this list).
  • Performance: Achieved a validation loss of 0.1219 on the evaluation set after 1.0 epoch of training, with a total of 191,491,960 input tokens seen during the entire training process.
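
Below is a minimal inference sketch using the Hugging Face Transformers API. The card does not document the checkpoint's exact prompt template, so the instruction-style formatting, the example sentence pair, and the generation settings are illustrative assumptions rather than the verified training format.

```python
# Minimal inference sketch. Assumption: the fine-tune accepts a
# Llama-3.2-Instruct-style chat prompt; the NLI question wording below
# is hypothetical and may differ from the actual training template.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "rbelanec/train_mnli_42_1775732963"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.bfloat16)

premise = "A soccer game with multiple males playing."
hypothesis = "Some men are playing a sport."

messages = [{
    "role": "user",
    "content": (
        f"Premise: {premise}\nHypothesis: {hypothesis}\n"
        "Is the relationship entailment, neutral, or contradiction?"
    ),
}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
)
with torch.no_grad():
    output = model.generate(input_ids, max_new_tokens=8)
# Decode only the newly generated tokens (the predicted NLI label).
print(tokenizer.decode(output[0, input_ids.shape[-1]:], skip_special_tokens=True))
```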

Training Details

The model was trained with a learning rate of 5e-06 using the adamw_torch optimizer and a cosine learning-rate scheduler with a warmup ratio of 0.1, scheduled over 5 epochs. Batch sizes for both training and evaluation were set to 8. The training run used Transformers 4.51.3, PyTorch 2.10.0+cu128, Datasets 4.0.0, and Tokenizers 0.21.4.
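
For reference, these reported hyperparameters map onto Hugging Face TrainingArguments roughly as sketched below. The original training script is not published, so the output directory and the bf16 flag are assumptions; only the values commented as "reported" come from this card.

```python
# Hyperparameter sketch reconstructed from the card; anything not
# commented as "reported" is an assumption or a library default.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="train_mnli_42_1775732963",  # assumed name, not from the card
    learning_rate=5e-06,                    # reported learning rate
    per_device_train_batch_size=8,          # reported train batch size
    per_device_eval_batch_size=8,           # reported eval batch size
    num_train_epochs=5,                     # reported scheduler horizon
    lr_scheduler_type="cosine",             # reported cosine schedule
    warmup_ratio=0.1,                       # reported warmup ratio
    optim="adamw_torch",                    # reported optimizer
    bf16=True,                              # assumed from the BF16 quant note
)
```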

Good For

  • Applications requiring robust natural language inference capabilities.
  • Research and development in NLI tasks, particularly with a Llama-based foundation.