layai/syn-arxiv-vanilla
The layai/syn-arxiv-vanilla model is an 8-billion-parameter language model fine-tuned from Meta's Llama-3-8B. Developed by layai, it reports a validation accuracy of 0.7198 on its evaluation set. The model card does not name the training dataset, its primary differentiators, or its intended use cases, so fitness for a particular task has to be established by evaluation. It may suit workloads that need a compact yet capable Llama-3-based model.
Model Overview
layai/syn-arxiv-vanilla is an 8-billion-parameter language model fine-tuned from the meta-llama/Meta-Llama-3-8B base model. layai trained it with a learning rate of 5e-05, a total batch size of 160, and 3 epochs.
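The total batch size of 160 is usually the product of a per-device batch size, the number of devices, and the number of gradient accumulation steps. The model card does not say how the 160 was split, so the combinations below are hypothetical illustrations rather than the configuration layai used; `effective_batch_size` is a helper name introduced here for the sketch.

```python
def effective_batch_size(per_device: int, num_devices: int, grad_accum_steps: int) -> int:
    """Number of samples that contribute to one optimizer step."""
    return per_device * num_devices * grad_accum_steps


# Hypothetical (per_device, num_devices, grad_accum_steps) splits.
# None of these is confirmed by the model card; each simply multiplies to 160.
candidate_splits = [(4, 8, 5), (20, 8, 1), (2, 4, 20)]

for per_device, num_devices, grad_accum in candidate_splits:
    total = effective_batch_size(per_device, num_devices, grad_accum)
    print(f"{per_device} x {num_devices} x {grad_accum} = {total}")
```

Any split that multiplies to the same total gives the same number of samples per optimizer step, which is why only the total is typically reported.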
Performance Metrics
After 1500 training steps, the model reached a validation accuracy of 0.7198 and a validation loss of 2.2882. Training used the Adam optimizer with standard betas and epsilon, together with a cosine learning rate scheduler.
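To make the scheduler and the loss figure more concrete, here is a minimal sketch of a half-cosine learning rate decay and of the loss-to-perplexity conversion. It rests on assumptions the model card does not confirm: that step 1500 is the end of the schedule, that there is no warmup, and that the reported loss is a mean cross-entropy in nats. `cosine_lr` and `perplexity` are illustrative helper names, not functions from the training code.

```python
import math

BASE_LR = 5e-05      # learning rate reported in the model card
TOTAL_STEPS = 1500   # assumption: step 1500 is treated as the end of the schedule


def cosine_lr(step: int, base_lr: float = BASE_LR,
              total_steps: int = TOTAL_STEPS, warmup_steps: int = 0) -> float:
    """Half-cosine decay from base_lr to 0, with optional linear warmup."""
    if warmup_steps > 0 and step < warmup_steps:
        return base_lr * step / warmup_steps
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    progress = min(max(progress, 0.0), 1.0)  # clamp to [0, 1]
    return 0.5 * base_lr * (1.0 + math.cos(math.pi * progress))


def perplexity(loss_nats: float) -> float:
    """Perplexity for a mean cross-entropy loss measured in nats."""
    return math.exp(loss_nats)


for step in (0, 375, 750, 1125, 1500):
    print(f"step {step:>4}: lr = {cosine_lr(step):.3e}")

print(f"perplexity at loss 2.2882: {perplexity(2.2882):.2f}")
```

Under these assumptions the learning rate starts at 5e-05, passes through half that value at the midpoint, and decays to zero at the final step, while a loss of 2.2882 corresponds to a perplexity a little under 10.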
Training Environment
Training used Transformers 4.45.0.dev0, PyTorch 2.3.1+cu121, Datasets 2.19.1, and Tokenizers 0.19.1.
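When trying to reproduce results, it can help to compare a local environment against the versions listed above. The following is a small sketch using only the Python standard library; `compare_versions` is a helper name introduced here, and the package names are the usual PyPI distribution names (`torch` for PyTorch). A mismatch does not necessarily mean the model will fail to load, only that the environment differs from the one reported.

```python
from importlib.metadata import PackageNotFoundError, version

# Versions as reported in the model card.
REPORTED = {
    "transformers": "4.45.0.dev0",
    "torch": "2.3.1+cu121",
    "datasets": "2.19.1",
    "tokenizers": "0.19.1",
}


def compare_versions(reported=REPORTED):
    """Map each package to (reported, installed or None, exact match?)."""
    results = {}
    for package, expected in reported.items():
        try:
            installed = version(package)
        except PackageNotFoundError:
            installed = None  # package is not installed in this environment
        results[package] = (expected, installed, installed == expected)
    return results


for package, (expected, installed, matches) in compare_versions().items():
    status = "match" if matches else "differs"
    print(f"{package}: reported {expected}, installed {installed} ({status})")
```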