rbelanec/train_qnli_42_1779207272
The rbelanec/train_qnli_42_1779207272 model is a 1-billion-parameter language model fine-tuned from meta-llama/Llama-3.2-1B-Instruct for QNLI (Question-answering Natural Language Inference). It reached a validation loss of 0.0535 on the evaluation set. The model is intended for natural language inference applications, in particular deciding whether a given sentence contains the answer to a question.
Model Overview
The rbelanec/train_qnli_42_1779207272 model is a fine-tuned version of meta-llama/Llama-3.2-1B-Instruct, with about 1 billion parameters. It was trained on the QNLI (Question-answering Natural Language Inference) dataset.
Key Capabilities
- Natural Language Inference (NLI): Optimized for deciding whether a candidate sentence contains the answer to a given question.
- Question Answering: Suited to answer-sentence selection, where the model judges whether the provided context answers the question.
- Performance: Achieved a validation loss of 0.0535 on the evaluation set during training, with a total of 56,574,368 input tokens seen.
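To put the reported loss in perspective, the following is a minimal sketch that converts it into a perplexity and an average per-token probability. It assumes the 0.0535 figure is a mean token-level cross-entropy measured in nats, which the card does not state explicitly; the function names are illustrative.

```python
import math

def loss_to_perplexity(mean_nll: float) -> float:
    """Perplexity is the exponential of the mean negative log-likelihood (nats)."""
    return math.exp(mean_nll)

def loss_to_mean_token_prob(mean_nll: float) -> float:
    """Geometric-mean probability the model assigns to the reference tokens."""
    return math.exp(-mean_nll)

# Assumption: the reported validation loss is a mean cross-entropy in nats.
val_loss = 0.0535
print(round(loss_to_perplexity(val_loss), 3))       # 1.055
print(round(loss_to_mean_token_prob(val_loss), 3))  # 0.948
```

Under that assumption, the model assigns on average about 95% probability to each reference token on the evaluation set. This is not the same as classification accuracy, which the card does not report.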
Training Details
The model was trained using the following hyperparameters:
- Learning Rate: 2e-06
- Optimizer: ADAMW_TORCH
- Epochs: 5
- Batch Size: 8 (train and eval)
- Scheduler: Cosine with 0.1 warmup ratio
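As an illustration of what the learning rate and scheduler settings imply, here is a small sketch of a cosine schedule with linear warmup: a linear ramp to the peak rate over the first 10% of steps, then a half-cosine decay to zero. The total step count is not given in the card, so the value of 1000 used below is purely hypothetical, and the function name is illustrative rather than taken from the training code.

```python
import math

BASE_LR = 2e-06      # learning rate from the card
WARMUP_RATIO = 0.1   # warmup ratio from the card

def lr_at_step(step: int, total_steps: int,
               base_lr: float = BASE_LR,
               warmup_ratio: float = WARMUP_RATIO) -> float:
    """Learning rate at a given optimizer step under linear warmup + cosine decay."""
    warmup_steps = math.ceil(total_steps * warmup_ratio)
    if step < warmup_steps:
        # Linear ramp from 0 up to base_lr.
        return base_lr * step / max(1, warmup_steps)
    # Fraction of the decay phase completed, in [0, 1].
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    # Half-cosine decay from base_lr down to 0.
    return base_lr * max(0.0, 0.5 * (1.0 + math.cos(math.pi * progress)))

# Hypothetical run length; the real number of steps is not stated in the card.
TOTAL_STEPS = 1000
for step in (0, 50, 100, 550, 1000):
    print(step, lr_at_step(step, TOTAL_STEPS))
```

With these settings the rate peaks at 2e-06 once warmup ends and has fallen to half of that midway through the decay phase.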
Intended Use Cases
This model is suitable for natural language inference in question-answering systems, where the key step is deciding whether a candidate sentence contains the answer to a question. QNLI labels each question-sentence pair as entailment or not_entailment (it has no separate contradiction class), so the model is a specialized tool for that binary decision rather than a general-purpose NLI classifier.
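A QNLI-style workflow around the model might look like the sketch below: format a question-sentence pair as a prompt, then map the generated text back to one of the two QNLI labels. The prompt wording is an assumption, since the card does not document the template used during fine-tuning, and both helper names are hypothetical.

```python
from typing import Optional

# The two labels used by the QNLI task.
LABELS = ("entailment", "not_entailment")

def build_qnli_prompt(question: str, sentence: str) -> str:
    """Format a question/sentence pair as a prompt.

    Assumption: this wording is illustrative only; the template used
    during fine-tuning is not documented in the card.
    """
    return (
        f"Question: {question}\n"
        f"Sentence: {sentence}\n"
        "Does the sentence contain the answer to the question? "
        f"Answer with {LABELS[0]} or {LABELS[1]}."
    )

def parse_label(generated: str) -> Optional[str]:
    """Map raw generated text to a QNLI label, or None if neither appears."""
    text = generated.strip().lower().replace(" ", "_").replace("-", "_")
    # Check the longer label first: "entailment" is a substring of it.
    if "not_entailment" in text:
        return "not_entailment"
    if "entailment" in text:
        return "entailment"
    return None

prompt = build_qnli_prompt(
    "What is the capital of France?",
    "Paris is the capital and largest city of France.",
)
print(prompt)
print(parse_label("Entailment"))
print(parse_label("not entailment"))
```

The prompt string would then be passed to the model through whatever generation interface is in use, and the output fed to `parse_label`. Checking for `not_entailment` before `entailment` matters because the shorter label is contained in the longer one.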