rbelanec/train_qnli_42_1779207272

Hugging Face
TEXT GENERATIONConcurrency Cost:1Model Size:1BQuant:BF16Ctx Length:32kPublished:May 19, 2026License:llama3.2Architecture:Transformer Warm

The rbelanec/train_qnli_42_1779207272 model is a 1 billion parameter language model fine-tuned from meta-llama/Llama-3.2-1B-Instruct. It is specifically optimized for Question-answering NLI (QNLI) tasks, demonstrating a validation loss of 0.0535 on the evaluation set. This model is designed for natural language inference applications, particularly those requiring accurate question-answering capabilities.

Loading preview...

Model Overview

The rbelanec/train_qnli_42_1779207272 model is a fine-tuned version of the meta-llama/Llama-3.2-1B-Instruct architecture, comprising 1 billion parameters. It has been specifically trained on the QNLI (Question-answering Natural Language Inference) dataset.

Key Capabilities

  • Natural Language Inference (NLI): Optimized for determining the relationship between a question and a candidate answer.
  • Question Answering: Excels in tasks where the model needs to infer an answer based on provided context.
  • Performance: Achieved a validation loss of 0.0535 on the evaluation set during training, with a total of 56,574,368 input tokens seen.

Training Details

The model was trained using the following hyperparameters:

  • Learning Rate: 2e-06
  • Optimizer: ADAMW_TORCH
  • Epochs: 5
  • Batch Size: 8 (train and eval)
  • Scheduler: Cosine with 0.1 warmup ratio

Intended Use Cases

This model is suitable for applications requiring robust natural language inference, particularly in question-answering systems where determining entailment or contradiction between statements is crucial. Its fine-tuning on the QNLI dataset makes it a specialized tool for such tasks.