rbelanec/train_cola_42_1774791067

Text Generation | Concurrency Cost: 1 | Model Size: 1B | Quantization: BF16 | Context Length: 32k | Published: Mar 29, 2026 | License: llama3.2 | Architecture: Transformer

The rbelanec/train_cola_42_1774791067 model is a 1-billion-parameter instruction-tuned causal language model, fine-tuned by rbelanec from meta-llama/Llama-3.2-1B-Instruct on the CoLA (Corpus of Linguistic Acceptability) dataset. It is optimized for linguistic acceptability judgments and achieves a validation loss of 0.2517 on the evaluation set.


Model Overview

The rbelanec/train_cola_42_1774791067 model is a specialized 1-billion-parameter language model: a fine-tuned variant of meta-llama/Llama-3.2-1B-Instruct adapted specifically for linguistic acceptability tasks.

Key Capabilities

  • Linguistic Acceptability: The model has been fine-tuned on the CoLA (Corpus of Linguistic Acceptability) dataset, so its primary strength lies in judging the grammatical acceptability of English sentences (see the inference sketch after this list).
  • Performance: Achieved a validation loss of 0.2517 on the evaluation set, with a total of 1,932,608 input tokens seen during training.
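
The snippet below is a minimal inference sketch. It assumes the model inherits the Llama 3.2 chat template from its base model, and the prompt wording is an illustrative guess; the card does not document the exact instruction format used during fine-tuning.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "rbelanec/train_cola_42_1774791067"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.bfloat16)

# Prompt wording is illustrative; the card does not specify the training format.
messages = [{
    "role": "user",
    "content": (
        "Is the following sentence grammatically acceptable? "
        "Answer 'acceptable' or 'unacceptable'.\n\n"
        "Sentence: The book was written by she."
    ),
}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
)
outputs = model.generate(inputs, max_new_tokens=8, do_sample=False)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```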

Training Details

The model was trained with the following key hyperparameters; a configuration sketch follows the list:

  • Base Model: meta-llama/Llama-3.2-1B-Instruct
  • Dataset: CoLA dataset
  • Learning Rate: 5e-05
  • Optimizer: ADAMW_TORCH
  • Epochs: 5
  • Batch Size: 8 (train and eval)
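
For readers who want to reproduce the setup, the values above map onto Hugging Face TrainingArguments as sketched below. Only the hyperparameter values come from this card; the output directory and the surrounding Trainer and dataset code are assumptions.

```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="train_cola_42_1774791067",  # hypothetical output path
    learning_rate=5e-5,                     # reported learning rate
    optim="adamw_torch",                    # reported optimizer (ADAMW_TORCH)
    num_train_epochs=5,                     # reported number of epochs
    per_device_train_batch_size=8,          # reported train batch size
    per_device_eval_batch_size=8,           # reported eval batch size
    bf16=True,  # assumption, consistent with the BF16 quantization listed above
)
```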

Intended Use Cases

This model is particularly suited to applications that require judging grammatical correctness and linguistic acceptability. Its fine-tuning on the CoLA dataset makes it a candidate for tasks such as the following (see the scoring sketch after this list):

  • Grammar checking and correction.
  • Evaluating sentence structures for natural language understanding systems.
  • Research into linguistic phenomena related to sentence well-formedness.
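
As one concrete, hypothetical way to use the model as a grammar checker, the sketch below reuses the model and tokenizer loaded earlier and compares the next-token logits for "acceptable" versus "unacceptable". This decoding strategy is an assumption, not a documented interface.

```python
# Reuses `torch`, `model`, and `tokenizer` from the inference sketch above.
def judge(sentence: str) -> str:
    """Label a sentence by comparing next-token logits for the two answers."""
    messages = [{
        "role": "user",
        "content": (
            "Is the following sentence grammatically acceptable? "
            "Answer 'acceptable' or 'unacceptable'.\n\n"
            f"Sentence: {sentence}"
        ),
    }]
    inputs = tokenizer.apply_chat_template(
        messages, add_generation_prompt=True, return_tensors="pt"
    )
    with torch.no_grad():
        next_logits = model(inputs).logits[0, -1]  # next-token distribution
    # Compare the first sub-token of each candidate answer.
    acc_id = tokenizer.encode("acceptable", add_special_tokens=False)[0]
    unacc_id = tokenizer.encode("unacceptable", add_special_tokens=False)[0]
    return "acceptable" if next_logits[acc_id] > next_logits[unacc_id] else "unacceptable"

for s in ["They read the book.", "The book was written by she."]:
    print(s, "->", judge(s))
```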