rbelanec/train_record_42_1776331412
A 1-billion-parameter instruction-tuned causal language model, fine-tuned by rbelanec from meta-llama/Llama-3.2-1B-Instruct on the 'record' dataset, with a 32,768-token context length and a final validation loss of 0.4481.
Model Overview
rbelanec/train_record_42_1776331412 is an instruction-tuned causal language model with roughly 1 billion parameters, fine-tuned by rbelanec from the meta-llama/Llama-3.2-1B-Instruct base model. It has a context length of 32,768 tokens and was fine-tuned specifically on the 'record' dataset.
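A minimal loading-and-generation sketch using the standard transformers API. The prompt and generation settings are placeholders, not taken from the model card:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "rbelanec/train_record_42_1776331412"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

# Llama-3.2-Instruct models ship a chat template; this prompt is illustrative.
messages = [{"role": "user", "content": "Summarize the following passage."}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
)
output_ids = model.generate(input_ids, max_new_tokens=64)
print(tokenizer.decode(output_ids[0][input_ids.shape[-1]:], skip_special_tokens=True))
```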
Training Details
The model was trained for 5 epochs with a learning rate of 5e-06 and a batch size of 8, using the AdamW optimizer with a cosine learning-rate schedule and a warmup ratio of 0.1. Training processed over 98 million input tokens and reached a final validation loss of 0.4481. The run used Transformers 4.51.3, PyTorch 2.10.0+cu128, Datasets 4.0.0, and Tokenizers 0.21.4.
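For reference, the reported hyperparameters map onto a transformers TrainingArguments configuration as sketched below. Only the values come from the training details above; the output directory is illustrative, and the batch size is assumed to be per device:

```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="train_record_42_1776331412",  # illustrative
    num_train_epochs=5,
    learning_rate=5e-06,
    per_device_train_batch_size=8,  # "batch size of 8"; assumed per device
    optim="adamw_torch",            # AdamW optimizer
    lr_scheduler_type="cosine",     # cosine learning-rate schedule
    warmup_ratio=0.1,               # 10% of steps spent warming up
)
```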
Performance
The model achieved a loss of 0.4481 on the evaluation set, matching the final validation loss reported during training. Since the evaluation data is drawn from the same 'record' dataset used for fine-tuning, this figure reflects in-domain performance rather than general capability.
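The reported loss is the standard causal language modeling cross-entropy. A hedged sketch of how such a value is computed on a single example; the text here is a placeholder, not the actual 'record' validation data:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "rbelanec/train_record_42_1776331412"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)
model.eval()

# Placeholder text; the real evaluation set is the 'record' validation split.
batch = tokenizer("An example evaluation passage.", return_tensors="pt")
with torch.no_grad():
    # Passing input_ids as labels makes the model return the shifted
    # next-token cross-entropy loss, the same metric as the reported 0.4481.
    loss = model(**batch, labels=batch["input_ids"]).loss
print(f"loss: {loss.item():.4f}")
```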