jysssacc/TinyLlama-1.1B-intermediate-step-1195k-token-2.5T_huth_fine_lr5e-05_bs2_epoch5_wd0.01
The jysssacc/TinyLlama-1.1B-intermediate-step-1195k-token-2.5T_huth_fine_lr5e-05_bs2_epoch5_wd0.01 model is a 1.1 billion parameter language model fine-tuned from the TinyLlama intermediate checkpoint saved at step 1195k, after roughly 2.5T training tokens. The fine-tuning run used a learning rate of 5e-05, a batch size of 2, and, per the wd0.01 suffix in the name, a weight decay of 0.01 over 5 epochs, adapting the compact TinyLlama architecture to an unspecified dataset.
Model Overview
This model, jysssacc/TinyLlama-1.1B-intermediate-step-1195k-token-2.5T_huth_fine_lr5e-05_bs2_epoch5_wd0.01, is a fine-tuned variant of the TinyLlama-1.1B-intermediate-step-1195k-token-2.5T base model. It leverages the compact 1.1 billion parameter architecture of TinyLlama, which is known for its efficiency and suitability for resource-constrained environments.
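As a minimal usage sketch, the model can be loaded with the Hugging Face Transformers library like any other causal language model checkpoint. This assumes the repository hosts full fine-tuned weights rather than a PEFT adapter:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Repository identifier from this model card
model_id = "jysssacc/TinyLlama-1.1B-intermediate-step-1195k-token-2.5T_huth_fine_lr5e-05_bs2_epoch5_wd0.01"

# Load the tokenizer and the fine-tuned weights from the Hugging Face Hub;
# this assumes the repo contains a full model checkpoint, not just an adapter.
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)
```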
Training Details
The fine-tuning process involved specific hyperparameters aimed at adapting the model further. Key training parameters include:
- Learning Rate: 5e-05
- Batch Size: 2 (for both training and evaluation)
- Epochs: 5
- Optimizer: Adam with default betas and epsilon
- Scheduler: Linear learning rate scheduler with 500 warmup steps
This fine-tuning was performed on an unspecified dataset, suggesting a specialized application or domain adaptation. The model was developed using Transformers 4.36.2, PyTorch 2.0.1, Datasets 2.16.1, and Tokenizers 0.15.0.
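For reference, the reported hyperparameters correspond roughly to the following Transformers TrainingArguments configuration. This is an illustrative reconstruction, not the authors' published training script: the output directory is a placeholder, the dataset is unknown, and the weight decay value is inferred from the wd0.01 suffix in the model name.

```python
from transformers import TrainingArguments

# Illustrative reconstruction of the reported hyperparameters; the actual
# training script and fine-tuning dataset are not published.
training_args = TrainingArguments(
    output_dir="tinyllama_huth_fine",   # placeholder output path
    learning_rate=5e-5,                 # reported learning rate
    per_device_train_batch_size=2,      # reported train batch size
    per_device_eval_batch_size=2,       # reported eval batch size
    num_train_epochs=5,                 # reported number of epochs
    weight_decay=0.01,                  # inferred from the "wd0.01" name suffix
    lr_scheduler_type="linear",         # linear schedule with warmup, as reported
    warmup_steps=500,                   # reported warmup steps
    optim="adamw_torch",                # Adam-style optimizer with default betas/epsilon
)
```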
Intended Use
Specific intended uses and limitations are not documented. As a fine-tuned version of TinyLlama, the model is likely suited to tasks that call for a compact yet capable language model, particularly within the domain of its fine-tuning dataset. Users should evaluate its performance on their own use cases, especially given that the fine-tuning data is unknown.
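One practical way to carry out such an evaluation is to generate continuations for prompts from the target domain and inspect them qualitatively. The snippet below is a minimal sketch using the text-generation pipeline; the prompt and sampling settings are arbitrary examples and should be replaced with domain-relevant inputs:

```python
from transformers import pipeline

model_id = "jysssacc/TinyLlama-1.1B-intermediate-step-1195k-token-2.5T_huth_fine_lr5e-05_bs2_epoch5_wd0.01"

# Generate a short continuation to inspect the model's behavior qualitatively;
# replace the prompt with text from your own target domain.
generator = pipeline("text-generation", model=model_id)
result = generator("Once upon a time", max_new_tokens=50, do_sample=True, temperature=0.7)
print(result[0]["generated_text"])
```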