rbelanec/train_mnli_42_1779207271

Text Generation | Concurrency Cost: 1 | Model Size: 1B | Quant: BF16 | Ctx Length: 32k | Published: May 19, 2026 | License: llama3.2 | Architecture: Transformer | Warm

rbelanec/train_mnli_42_1779207271 is a 1-billion-parameter language model fine-tuned by rbelanec from Meta's Llama-3.2-1B-Instruct. It was fine-tuned on the MNLI (Multi-Genre Natural Language Inference) dataset and reaches a validation loss of 0.1060. It targets natural language inference: deciding whether a hypothesis is entailed by, contradicts, or is neutral with respect to a premise.


Overview

This model, rbelanec/train_mnli_42_1779207271, is a 1-billion-parameter language model fine-tuned from the meta-llama/Llama-3.2-1B-Instruct model. It was fine-tuned on the MNLI (Multi-Genre Natural Language Inference) dataset to improve its performance on natural language inference tasks.

Key Capabilities

  • Natural Language Inference: Optimized for determining the relationship (entailment, contradiction, or neutrality) between a premise and a hypothesis.
  • Fine-tuned Performance: Achieved a validation loss of 0.1060 on the evaluation set. This is a loss value, not an accuracy score, so it shows a good fit to the MNLI data without directly stating how often predictions are correct.
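
To make the premise/hypothesis task above concrete, here is a minimal sketch of how an input might be phrased and how a generated answer might be mapped back to a label. The prompt wording and the helper names `build_nli_prompt` and `parse_label` are assumptions for illustration; the card does not state the template used during fine-tuning, so the real format may differ.

```python
from typing import Optional

# The three MNLI classes.
LABELS = ("entailment", "neutral", "contradiction")


def build_nli_prompt(premise: str, hypothesis: str) -> str:
    """Format a premise/hypothesis pair as one prompt string.

    The wording is hypothetical, not the template used in training.
    """
    return (
        f"Premise: {premise.strip()}\n"
        f"Hypothesis: {hypothesis.strip()}\n"
        "Answer with one word: entailment, neutral, or contradiction."
    )


def parse_label(generated: str) -> Optional[str]:
    """Return the label the generated text starts with, or None."""
    text = generated.strip().lower()
    for label in LABELS:
        if text.startswith(label):
            return label
    return None


if __name__ == "__main__":
    prompt = build_nli_prompt(
        "A man is playing a guitar on stage.",
        "A person is performing music.",
    )
    print(prompt)
    # Stand-in for the text a model might generate for this prompt.
    print(parse_label(" Entailment."))
```

The prompt string would be passed to whatever text-generation interface serves the model. Returning `None` for unrecognized output keeps malformed generations from being counted as one of the three classes.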

Training Details

The model was trained for 5 epochs with a learning rate of 2e-06 and a batch size of 8. It used the AdamW optimizer with a cosine learning rate scheduler and a warmup ratio of 0.1. Training processed over 191 million input tokens.
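
The learning rate schedule described above can be sketched in a few lines. This is an illustration under stated assumptions, not the author's training code: it assumes the warmup is linear from zero and that the cosine phase decays to zero, and the total step count used in the demo is hypothetical because the card does not report it.

```python
import math


def lr_at_step(step: int, total_steps: int,
               peak_lr: float = 2e-6, warmup_ratio: float = 0.1) -> float:
    """Learning rate at a 0-indexed step: linear warmup, then cosine decay.

    peak_lr and warmup_ratio default to the values stated in the card.
    Linear warmup from 0 and decay to 0 are assumptions.
    """
    warmup_steps = int(total_steps * warmup_ratio)
    if step < warmup_steps:
        # Linear ramp from 0 up to peak_lr over the warmup phase.
        return peak_lr * step / max(1, warmup_steps)
    # Fraction of the post-warmup phase completed, in [0, 1].
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    # Half-cosine decay from peak_lr down to 0.
    return peak_lr * 0.5 * (1.0 + math.cos(math.pi * progress))


if __name__ == "__main__":
    total = 1000  # hypothetical step count, not from the card
    for s in (0, 50, 100, 550, 1000):
        print(s, lr_at_step(s, total))
```

With a warmup ratio of 0.1, the first tenth of the steps ramps the rate up to 2e-06, and the remaining nine tenths follow the cosine curve back down.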

Good For

  • Applications requiring robust natural language inference capabilities.
  • Research and development in understanding semantic relationships between sentences.
  • Tasks where identifying entailment, contradiction, or neutrality is crucial.