RefalMachine/llm_test_raw

Text Generation · Model size: 7B · Quantization: FP8 · Context length: 4k · Architecture: Transformer

RefalMachine/llm_test_raw is a 7 billion parameter language model fine-tuned from TheBloke/Llama-2-7B-fp16. The model was trained for one epoch, reaching a validation loss of 2.0955 and an accuracy of 0.5405 on its evaluation set. It retains the foundational Llama-2 architecture; the performance metrics reported here come from its fine-tuning run.


Overview

RefalMachine/llm_test_raw is a 7 billion parameter language model derived from TheBloke's Llama-2-7B-fp16. It underwent a single epoch of fine-tuning, resulting in a validation loss of 2.0955 and an accuracy of 0.5405 on the evaluation dataset. The training utilized specific hyperparameters including a learning rate of 0.0003, a batch size of 12, and an Adam optimizer with betas=(0.9, 0.95).
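As a sketch only, the reported hyperparameters could be expressed with the Hugging Face `TrainingArguments` API roughly as below. The output directory and the `gradient_accumulation_steps` value are assumptions, not taken from the model card (28 is inferred because 12 × 28 = 336, the reported total train batch size):

```python
from transformers import TrainingArguments

# Illustrative configuration mirroring the hyperparameters reported in the card.
# "llm_test_raw_out" is a hypothetical path; gradient_accumulation_steps=28 is
# an assumption consistent with the reported total train batch size of 336.
args = TrainingArguments(
    output_dir="llm_test_raw_out",   # hypothetical
    learning_rate=3e-4,              # reported: 0.0003
    per_device_train_batch_size=12,  # reported train batch size
    per_device_eval_batch_size=12,   # reported eval batch size
    gradient_accumulation_steps=28,  # assumption; see lead-in
    num_train_epochs=1.0,            # reported: 1 epoch
    adam_beta1=0.9,                  # reported Adam betas
    adam_beta2=0.95,
    adam_epsilon=1e-5,               # reported epsilon
)
```

This is a configuration fragment, not the authors' actual training script; the card does not state which framework or launcher was used.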

Key Training Details

  • Base Model: TheBloke/Llama-2-7B-fp16
  • Parameters: 7 Billion
  • Epochs: 1.0
  • Final Validation Loss: 2.0955
  • Final Accuracy: 0.5405
  • Learning Rate: 0.0003
  • Optimizer: Adam with betas=(0.9, 0.95) and epsilon=1e-05
  • Batch Size: 12 (train and eval), total train batch size of 336 with gradient accumulation
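The batch-size figures above can be sanity-checked with simple arithmetic. The factor of 28 is inferred here, not stated in the card, and may combine gradient accumulation with data parallelism in an unspecified split:

```python
# Reported values from the training details above.
per_device_batch = 12
total_train_batch = 336

# The multiplier combines gradient accumulation steps and any data
# parallelism; the card does not say how it splits between the two.
multiplier = total_train_batch // per_device_batch

print(multiplier)                                          # 28
print(per_device_batch * multiplier == total_train_batch)  # True
```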

Intended Uses & Limitations

The README indicates that more information is needed regarding the model's intended uses, limitations, and the specific training and evaluation data employed. Users should be aware that the provided accuracy and loss metrics are specific to the fine-tuning process described and may not generalize broadly without further context on the dataset used.