Overview
RefalMachine/llm_test_raw is a 7-billion-parameter language model derived from TheBloke/Llama-2-7B-fp16. It was fine-tuned for a single epoch, reaching a validation loss of 2.0955 and an accuracy of 0.5405 on the evaluation set. Training used a learning rate of 3e-4, a batch size of 12, and the Adam optimizer with betas=(0.9, 0.95).
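For readers unfamiliar with these hyperparameters, a single Adam update with the card's settings (lr=3e-4, betas=(0.9, 0.95), epsilon=1e-5) can be sketched in plain Python; this is a generic illustration of the optimizer's update rule, not the training code actually used:

```python
def adam_step(param, grad, m, v, t,
              lr=3e-4, beta1=0.9, beta2=0.95, eps=1e-5):
    """One Adam update for a scalar parameter (illustrative sketch).

    Hyperparameter defaults match the model card; m and v are the
    running first/second moment estimates, t is the step count (1-based).
    """
    m = beta1 * m + (1 - beta1) * grad        # first-moment (mean) estimate
    v = beta2 * v + (1 - beta2) * grad ** 2   # second-moment (variance) estimate
    m_hat = m / (1 - beta1 ** t)              # bias correction
    v_hat = v / (1 - beta2 ** t)
    param = param - lr * m_hat / (v_hat ** 0.5 + eps)
    return param, m, v
```

In practice this is handled by the framework's optimizer (e.g. `torch.optim.Adam` / `torch.optim.AdamW` with the same `lr`, `betas`, and `eps` arguments); the sketch only shows what those numbers control.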
Key Training Details
- Base Model: TheBloke/Llama-2-7B-fp16
- Parameters: 7 Billion
- Epochs: 1.0
- Final Validation Loss: 2.0955
- Final Accuracy: 0.5405
- Learning Rate: 0.0003
- Optimizer: Adam with betas=(0.9, 0.95) and epsilon=1e-05
- Batch Size: 12 per device (train and eval); total train batch size of 336 via gradient accumulation
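The relationship between the per-device batch size and the total train batch size above implies a combined factor of 28 from gradient accumulation and/or multiple devices; the card does not state the exact split (e.g. 4 GPUs x 7 accumulation steps vs. 1 GPU x 28). A minimal arithmetic sketch:

```python
# Total train batch size = per-device batch size
#   x number of devices x gradient-accumulation steps.
per_device_batch_size = 12
total_train_batch_size = 336

# Combined devices x accumulation factor (exact split not stated in the card).
combined_factor = total_train_batch_size // per_device_batch_size
print(combined_factor)  # 28
```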
Intended Uses & Limitations
The model card does not yet specify the model's intended uses, limitations, or the training and evaluation data employed. Users should note that the reported accuracy and loss are specific to the fine-tuning run described above and may not generalize broadly without further context on the dataset used.
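Since only loss and accuracy are reported, it can help to convert the validation loss into perplexity, a more common language-modeling metric. This assumes the loss is mean token-level cross-entropy in nats, which is standard for causal LM fine-tuning but is not confirmed by the card:

```python
import math

# Assumes val_loss is mean per-token cross-entropy in nats (not stated in the card).
val_loss = 2.0955
perplexity = math.exp(val_loss)
print(round(perplexity, 2))  # ~8.13
```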