rbelanec/train_mnli_42_1779286678

Text generation | Concurrency cost: 1 | Model size: 1B | Quant: BF16 | Context length: 32k | Published: May 20, 2026 | License: llama3.2 | Architecture: Transformer | Warm

rbelanec/train_mnli_42_1779286678 is a 1 billion parameter language model fine-tuned from Meta's Llama-3.2-1B-Instruct on the MNLI dataset for natural language inference. It reports a validation loss of 0.1017 on the evaluation set. Its intended use is classifying the logical relationship (entailment, contradiction, or neutral) between pairs of sentences.


Overview

This model, rbelanec/train_mnli_42_1779286678, is a specialized 1 billion parameter language model. It is a fine-tuned variant of meta-llama/Llama-3.2-1B-Instruct, adapted for natural language inference (NLI). Fine-tuning used the Multi-Genre Natural Language Inference (MNLI) dataset, in which each premise-hypothesis sentence pair is labeled as entailment, contradiction, or neutral.

Key Capabilities

  • Natural Language Inference: Classifies the relationship between a premise and a hypothesis as entailment, contradiction, or neutral.
  • Fine-tuned Performance: Reached a validation loss of 0.1017 on the MNLI evaluation set. The card reports loss only and gives no accuracy figure.
  • Efficient Size: At 1 billion parameters, it trades some capacity for lower compute and memory requirements than larger models.
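The classification task described above can be sketched as a prompt builder plus a label parser. This is a minimal illustration only: the card does not document the prompt format used during fine-tuning, so the template text and the helper names `build_nli_prompt` and `parse_nli_label` are assumptions, and the actual generation call is left out.

```python
from typing import Optional

# The three MNLI classes named in the model card.
LABELS = ("entailment", "contradiction", "neutral")


def build_nli_prompt(premise: str, hypothesis: str) -> str:
    """Format a sentence pair as one instruction.

    Assumed template: the card does not state the format used in training.
    """
    return (
        f"Premise: {premise.strip()}\n"
        f"Hypothesis: {hypothesis.strip()}\n"
        "Answer with one word: entailment, contradiction, or neutral."
    )


def parse_nli_label(generated_text: str) -> Optional[str]:
    """Return the MNLI label that appears earliest in the output, or None."""
    text = generated_text.lower()
    hits = [(text.find(label), label) for label in LABELS if label in text]
    return min(hits)[1] if hits else None
```

The parser returns None when no label is found, so a caller can treat unparseable output explicitly instead of silently defaulting to one class.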

Training Details

The model was trained for 1 epoch with a learning rate of 2e-06 and a batch size of 8, processing approximately 38 million input tokens. The optimizer was ADAMW_TORCH with a cosine learning rate scheduler.
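To show what a cosine learning rate schedule does with the reported peak rate of 2e-06, here is a small sketch. It assumes no warmup and decay to zero, neither of which the card states, and the total step count in the usage note is a hypothetical value chosen for illustration.

```python
import math

PEAK_LR = 2e-6  # learning rate reported in the model card


def cosine_lr(step: int, total_steps: int, peak_lr: float = PEAK_LR) -> float:
    """Cosine decay from peak_lr at step 0 to 0 at total_steps.

    Assumes no warmup phase; the card does not say whether one was used.
    """
    progress = min(max(step / total_steps, 0.0), 1.0)
    return 0.5 * peak_lr * (1.0 + math.cos(math.pi * progress))
```

With a hypothetical 1,000 total steps, `cosine_lr(0, 1000)` returns the peak rate, `cosine_lr(500, 1000)` is about half of it, and `cosine_lr(1000, 1000)` is zero to within floating-point error.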

Should I use this for my use case?

This model fits applications built on natural language inference, such as fact-checking, semantic search, or dialogue systems that need to judge the logical relationship between two statements. Because it was fine-tuned only on MNLI, it is a candidate mainly for textual entailment tasks. For general-purpose text generation or tasks outside NLI, a general instruction-tuned model is likely a better choice.
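As one illustration of the fact-checking use case, an NLI label can be mapped to a verdict by treating the evidence as the premise and the claim as the hypothesis. The verdict names, the `check_claim` helper, and the `classify` callable are hypothetical. `classify` stands in for whatever code calls the model and returns one of the three MNLI labels.

```python
from typing import Callable

# Hypothetical mapping from NLI labels to fact-checking verdicts.
VERDICTS = {
    "entailment": "supported",
    "contradiction": "refuted",
    "neutral": "not enough evidence",
}


def check_claim(
    evidence: str,
    claim: str,
    classify: Callable[[str, str], str],
) -> str:
    """Classify (premise=evidence, hypothesis=claim) and map the label to a verdict."""
    label = classify(evidence, claim)
    # Unknown labels fall back to the most cautious verdict.
    return VERDICTS.get(label, "not enough evidence")
```

Passing the classifier as an argument keeps the mapping independent of how the model is loaded or prompted.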