ferrazzipietro/unsup-Llama-3.2-1B-Instruct-datav2

Text Generation · Model Size: 1B · Quantization: BF16 · Context Length: 32k · Published: Feb 13, 2026 · License: llama3.2 · Architecture: Transformer

ferrazzipietro/unsup-Llama-3.2-1B-Instruct-datav2 is a 1-billion-parameter instruction-tuned causal language model, fine-tuned by ferrazzipietro from Meta's Llama-3.2-1B-Instruct. It was trained on an unspecified dataset and reached a validation loss of 0.2694. With a context length of 32,768 tokens, it is a compact model suited to tasks that require a small footprint.


Model Overview

ferrazzipietro/unsup-Llama-3.2-1B-Instruct-datav2 is a 1-billion-parameter instruction-tuned language model developed by ferrazzipietro. It is built on the meta-llama/Llama-3.2-1B-Instruct base model and supports a context length of 32,768 tokens.
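
The card does not include usage code. The sketch below assumes the checkpoint follows the standard Hugging Face transformers layout for Llama-3.2 fine-tunes (including the Llama chat template); neither is confirmed by the card, and the prompt is purely illustrative.

    # Minimal loading and generation sketch, assuming a standard
    # transformers-compatible Llama-3.2 checkpoint (not confirmed by the card).
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    model_id = "ferrazzipietro/unsup-Llama-3.2-1B-Instruct-datav2"

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(
        model_id,
        torch_dtype=torch.bfloat16,  # matches the BF16 weights listed above
        device_map="auto",
    )

    # Instruct variants of Llama-3.2 ship a chat template; apply it via the tokenizer.
    messages = [{"role": "user", "content": "Summarize what a context window is."}]
    inputs = tokenizer.apply_chat_template(
        messages, add_generation_prompt=True, return_tensors="pt"
    ).to(model.device)

    outputs = model.generate(inputs, max_new_tokens=128)
    # Decode only the newly generated tokens, skipping the prompt.
    print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))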

Training Details

The model was fine-tuned for 1 epoch on an unknown dataset. Key hyperparameters were a learning rate of 0.0003, a per-device train batch size of 128, and 4 gradient accumulation steps, giving an effective total batch size of 512. Training used the ADAMW_TORCH optimizer with a cosine learning-rate scheduler and a warmup ratio of 0.1. The final validation loss was 0.2694.
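
These hyperparameters map directly onto a transformers TrainingArguments configuration. The sketch below reconstructs only the values reported above; the dataset, training script, and output path are unknown, so the output_dir is a hypothetical placeholder and this should not be read as the author's actual setup.

    # Hedged reconstruction of the reported hyperparameters via
    # transformers.TrainingArguments; dataset and trainer wiring are omitted
    # because the card does not specify them.
    from transformers import TrainingArguments

    training_args = TrainingArguments(
        output_dir="unsup-Llama-3.2-1B-Instruct-datav2",  # hypothetical path
        num_train_epochs=1,
        learning_rate=3e-4,                # 0.0003, as reported
        per_device_train_batch_size=128,
        gradient_accumulation_steps=4,     # 128 x 4 = 512 effective batch size
        optim="adamw_torch",               # ADAMW_TORCH optimizer
        lr_scheduler_type="cosine",
        warmup_ratio=0.1,
        bf16=True,                         # consistent with the BF16 weights
    )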

Intended Uses & Limitations

Because the original model card provides little information, specific intended uses and limitations are not documented. The training data is unspecified, which may affect performance on particular tasks or domains; users should evaluate the model themselves before relying on it for a specific application.