Model Overview
The layai/syn-arxiv-dict model is an 8-billion-parameter language model fine-tuned from the meta-llama/Meta-Llama-3-8B base model. The fine-tuning dataset is not specified. The reported validation loss is 2.2990 with an accuracy of 0.7009; these figures match the evaluation logged at epoch 1.74, not the last logged evaluation at epoch 2.61.
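A minimal loading sketch, assuming the checkpoint is published on the Hugging Face Hub under the identifier above and works as a causal language model with the `transformers` library (neither is confirmed by the source). The helper name `generate_text` and its defaults are hypothetical.

```python
MODEL_ID = "layai/syn-arxiv-dict"  # identifier taken from the overview above


def generate_text(prompt: str, max_new_tokens: int = 64) -> str:
    """Generate a continuation of `prompt` (hypothetical helper).

    Assumes the checkpoint loads as a causal LM; an 8B model needs
    substantial memory, so adjust dtype and device placement as needed.
    """
    # Imported lazily so this module can be read without transformers installed.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(MODEL_ID)

    inputs = tokenizer(prompt, return_tensors="pt")
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

Calling `generate_text("...")` downloads the weights on first use, so it is best tried on a machine with enough disk space and memory for an 8B model.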
Training Details
The model was trained using the following key hyperparameters:
- Learning Rate: 5e-05
- Batch Size: 40 for both training and evaluation. With 4 gradient accumulation steps, the effective training batch size is 40 × 4 = 160.
- Optimizer: Adam with default betas and epsilon.
- LR Scheduler: Cosine.
- Epochs: 3.0
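The batch-size arithmetic and the cosine schedule listed above can be sketched in plain Python. This is an illustration under assumptions the source does not state: a single device, no warmup, and decay to zero over the full run. The names `TrainConfig` and `cosine_lr` are hypothetical, as is any `total_steps` value passed in.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class TrainConfig:
    """Hyperparameters from the list above (hypothetical container)."""
    learning_rate: float = 5e-5
    batch_size: int = 40
    gradient_accumulation_steps: int = 4
    num_epochs: float = 3.0
    num_devices: int = 1  # assumption: the source does not give a device count

    @property
    def effective_batch_size(self) -> int:
        # Samples contributing to each optimizer update.
        return self.batch_size * self.gradient_accumulation_steps * self.num_devices


def cosine_lr(step: int, total_steps: int, peak_lr: float) -> float:
    """Cosine decay from peak_lr to 0 (assumes no warmup phase)."""
    if total_steps <= 0:
        raise ValueError("total_steps must be positive")
    progress = min(max(step / total_steps, 0.0), 1.0)
    return 0.5 * peak_lr * (1.0 + math.cos(math.pi * progress))


config = TrainConfig()
```

Here `config.effective_batch_size` reproduces the 160 quoted above, and `cosine_lr(step, total_steps, config.learning_rate)` starts at the peak rate and falls to half of it at the midpoint of training.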
Performance Metrics
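Validation metrics were logged at three points during training. Validation loss was lowest at epoch 1.74 and rose again by epoch 2.61, while accuracy kept improving slightly: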
- Epoch 0.87: Validation Loss: 2.3493, Accuracy: 0.6701
- Epoch 1.74: Validation Loss: 2.2990, Accuracy: 0.7009
- Epoch 2.61: Validation Loss: 2.4136, Accuracy: 0.7041
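The logged values can be compared programmatically to see which evaluation is best under each metric. The sketch below uses only the numbers listed above; the perplexity conversion assumes the validation loss is a mean per-token cross-entropy in nats, which the source does not state. The helper names are hypothetical.

```python
import math

# Evaluation log copied from the list above.
EVAL_LOG = [
    {"epoch": 0.87, "val_loss": 2.3493, "accuracy": 0.6701},
    {"epoch": 1.74, "val_loss": 2.2990, "accuracy": 0.7009},
    {"epoch": 2.61, "val_loss": 2.4136, "accuracy": 0.7041},
]


def best_by_loss(log):
    """Return the record with the lowest validation loss."""
    return min(log, key=lambda record: record["val_loss"])


def best_by_accuracy(log):
    """Return the record with the highest accuracy."""
    return max(log, key=lambda record: record["accuracy"])


def perplexity(val_loss: float) -> float:
    """Convert loss to perplexity (assumes mean cross-entropy in nats)."""
    return math.exp(val_loss)


lowest_loss = best_by_loss(EVAL_LOG)
highest_accuracy = best_by_accuracy(EVAL_LOG)
```

The two criteria disagree: the lowest loss falls at epoch 1.74 and the highest accuracy at epoch 2.61, so the choice of checkpoint depends on which metric matters more for the application.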
Intended Uses & Limitations
The source gives no details on intended uses or limitations. Developers should evaluate the model on their own tasks and data before relying on it for a particular application.