rbelanec/train_sst2_42_1779207274

Hugging Face
TEXT GENERATIONConcurrency Cost:1Model Size:1BQuant:BF16Ctx Length:32kPublished:May 19, 2026License:llama3.2Architecture:Transformer Warm

The rbelanec/train_sst2_42_1779207274 model is a 1 billion parameter Llama-3.2-1B-Instruct variant, fine-tuned by rbelanec on the sst2 dataset. This model is specifically optimized for sentiment analysis tasks, demonstrating a validation loss of 0.0970. It is designed for applications requiring efficient and accurate sentiment classification.

Loading preview...

Overview

This model, rbelanec/train_sst2_42_1779207274, is a fine-tuned version of the meta-llama/Llama-3.2-1B-Instruct architecture, featuring 1 billion parameters and a context length of 32768 tokens. It has been specifically adapted for sentiment analysis by training on the sst2 dataset.

Key Capabilities

  • Sentiment Analysis: Optimized for classifying sentiment, as evidenced by its fine-tuning on the sst2 dataset.
  • Llama-3.2 Base: Benefits from the foundational capabilities of the Llama-3.2-1B-Instruct model.
  • Efficient Performance: Achieved a validation loss of 0.0970 during training, indicating strong performance on its target task.

Training Details

The model was trained with a learning rate of 2e-06, a batch size of 8, and for 5 epochs. It utilized the AdamW optimizer with a cosine learning rate scheduler and a warmup ratio of 0.1. The training process involved processing 18,647,328 input tokens, resulting in a final validation loss of 0.0970.

Good For

  • Sentiment Classification: Ideal for applications requiring the determination of sentiment from text inputs.
  • Resource-Constrained Environments: Its 1 billion parameter size makes it suitable for deployment where computational resources are a consideration, while still offering specialized performance.