nongfuyulang/engineer-heavy-500k-barc-llama3.1-8b-ins-fft-induction_lr1e-5_epoch3
The nongfuyulang/engineer-heavy-500k-barc-llama3.1-8b-ins-fft-induction_lr1e-5_epoch3 model is a fine-tuned variant of Meta-Llama-3.1-8B-Instruct, developed by nongfuyulang. This 8-billion-parameter instruction-tuned model was trained for 2 epochs with a learning rate of 1e-05, reaching a validation loss of 0.2710. The training dataset is not specified in the available information.
Model Overview
This model, engineer-heavy-500k-barc-llama3.1-8b-ins-fft-induction_lr1e-5_epoch3, is a fine-tuned version of the Meta-Llama-3.1-8B-Instruct base model, developed by nongfuyulang. The fine-tuning data and target applications are not documented beyond the training configuration summarized below.
Training Details
The model was fine-tuned over 2 epochs using a learning rate of 1e-05. Key training hyperparameters included:
- Optimizer: Adam with betas=(0.9, 0.999) and epsilon=1e-08
- Batch Size: 16 (train and eval), with a total distributed batch size of 128 across 8 GPUs
- LR Scheduler: Cosine type with a warmup ratio of 0.1
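The scheduler described above can be sketched as a plain function. This is a minimal illustration of a cosine schedule with linear warmup using the reported values (base LR 1e-05, warmup ratio 0.1); the actual training code is not provided, so the exact implementation may differ.

```python
import math

def lr_at_step(step: int, total_steps: int,
               base_lr: float = 1e-5, warmup_ratio: float = 0.1) -> float:
    """Cosine LR schedule with linear warmup, using the reported
    hyperparameters (base LR 1e-5, warmup ratio 0.1). Illustrative only."""
    warmup_steps = int(total_steps * warmup_ratio)
    if step < warmup_steps:
        # Linear ramp from 0 up to the base learning rate
        return base_lr * step / max(1, warmup_steps)
    # Cosine decay from base_lr down to 0 over the remaining steps
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return base_lr * 0.5 * (1.0 + math.cos(math.pi * progress))

schedule = [lr_at_step(s, 100) for s in range(101)]
```

The learning rate peaks at 1e-05 at the end of warmup (step 10 of 100 here) and decays toward zero by the final step.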
Performance
During training, the model achieved a final validation loss of 0.2710. The training loss decreased from 0.2797 in the first epoch to 0.2389 in the second epoch.
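For context, the reported figures correspond to roughly a 15% relative drop in training loss between epochs, with the final validation loss (0.2710) sitting above the final training loss (0.2389), a modest train/validation gap:

```python
# Loss figures as reported in this card
epoch_train_losses = [0.2797, 0.2389]
val_loss = 0.2710

rel_drop = (epoch_train_losses[0] - epoch_train_losses[1]) / epoch_train_losses[0]
gap = val_loss - epoch_train_losses[-1]
print(f"relative training-loss drop: {rel_drop:.1%}, train/val gap: {gap:.4f}")
```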
Intended Use
The card does not document specific intended uses or limitations. Users should evaluate the model on their own tasks, and consult any forthcoming documentation, before relying on it in production.