quriousclick/tinyllama-v1-training
The quriousclick/tinyllama-v1-training model is a fine-tuned version of TinyLlama/TinyLlama-1.1B-Chat-v1.0, developed by quriousclick. This model is based on the TinyLlama architecture, a compact 1.1 billion parameter language model. It was trained with a learning rate of 0.0002 over 250 steps using the Adam optimizer and a cosine learning rate scheduler. Further details on its specific capabilities and intended uses are not provided in the available documentation.
Overview
The quriousclick/tinyllama-v1-training model is a fine-tuned iteration of the TinyLlama/TinyLlama-1.1B-Chat-v1.0 base model. Developed by quriousclick, this model leverages the compact 1.1 billion parameter TinyLlama architecture, which is known for its efficiency.
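Since the card does not include a usage example, the sketch below shows one plausible way to load the checkpoint with the Hugging Face transformers library, assuming it is published as a standard causal language model. The prompt and generation settings are illustrative, not taken from the card.

```python
# Hypothetical usage sketch (assumes the transformers library and access to
# the published checkpoint); not an official example from the model card.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "quriousclick/tinyllama-v1-training"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

# Illustrative prompt; adjust generation parameters for your application.
prompt = "Explain what a language model is in one sentence."
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

Because the card documents no chat template or intended prompt format, plain-text prompting as above is only an assumption; users should verify against the repository files.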
Training Details
The model underwent a fine-tuning process with the following key hyperparameters:
- Learning Rate: 0.0002
- Optimizer: Adam with betas=(0.9, 0.999) and epsilon=1e-08
- Scheduler: Cosine learning rate scheduler
- Training Steps: 250
- Batch Size: A `train_batch_size` of 16, with `gradient_accumulation_steps` of 4, resulting in a `total_train_batch_size` of 64
- Mixed Precision: Native AMP was utilized for training efficiency
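The hyperparameters above can be checked with a small sketch: a plain cosine decay from the stated learning rate over the stated step count (assuming no warmup, which the card does not mention), plus the effective batch size implied by gradient accumulation. All constants come from the list above; the schedule shape is an assumption.

```python
import math

BASE_LR = 2e-4      # learning rate from the model card
TOTAL_STEPS = 250   # training steps from the model card

def cosine_lr(step, base_lr=BASE_LR, total_steps=TOTAL_STEPS):
    """Cosine decay from base_lr down to 0 over total_steps.

    Assumes zero warmup steps, since the card does not specify any.
    """
    progress = min(step, total_steps) / total_steps
    return base_lr * 0.5 * (1.0 + math.cos(math.pi * progress))

# Effective batch size: per-device batch of 16 accumulated over 4 steps.
effective_batch = 16 * 4

print(cosine_lr(0))    # base rate at the start: 2e-4
print(cosine_lr(125))  # half the base rate at the midpoint: 1e-4
print(cosine_lr(250))  # decays to 0 at the final step
print(effective_batch) # 64, matching total_train_batch_size
```

This illustrates why the card can report both a per-device batch of 16 and a total of 64: gradient accumulation multiplies the effective batch without increasing per-step memory.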
Limitations
Detailed information regarding the specific dataset used for fine-tuning, the model's intended uses, its limitations, and evaluation data is not provided in the current documentation. Users should exercise caution and conduct their own evaluations for specific applications.