donoway/TinyStoriesV2_Llama-3.2-1B-cumpal99
donoway/TinyStoriesV2_Llama-3.2-1B-cumpal99 is a 1-billion-parameter language model fine-tuned from meta-llama/Llama-3.2-1B. It was trained with a constant learning rate of 2e-05 for 100 epochs and a batch size of 32. Its primary differentiator and intended use cases are not documented, but the base architecture suggests general text-generation capability.
Model Overview
donoway/TinyStoriesV2_Llama-3.2-1B-cumpal99 is a 1-billion-parameter language model fine-tuned from the meta-llama/Llama-3.2-1B base model. It was developed by donoway and represents a specialized iteration of the Llama 3.2 series.
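The checkpoint can be loaded with the standard Transformers auto classes. The snippet below is a minimal sketch, assuming the repository id on this card points to a public Hub checkpoint with standard Llama config and tokenizer files; the prompt is an arbitrary illustration, not drawn from the (unspecified) fine-tuning data.

```python
# Minimal loading-and-generation sketch (assumes a public Hub checkpoint
# with standard Llama config/tokenizer files; the prompt is illustrative).
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "donoway/TinyStoriesV2_Llama-3.2-1B-cumpal99"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

inputs = tokenizer("Once upon a time", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```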
Training Details
The fine-tuning run used the following hyperparameters (mapped onto the Trainer API in the sketch after this list):
- Learning Rate: 2e-05
- Batch Size: 32 (training), 112 (evaluation)
- Epochs: 100
- Optimizer: ADAMW_TORCH with default betas and epsilon
- LR Scheduler: Constant with a warmup ratio of 1e-05
The training was conducted using Transformers 4.51.3, PyTorch 2.6.0+cu124, Datasets 3.5.0, and Tokenizers 0.21.1.
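For readers who want to set up a comparable run, the listed hyperparameters translate roughly into the Transformers Trainer API as shown below. This is a hypothetical reconstruction, not the author's actual training script; the output_dir and any argument not listed on this card are assumptions.

```python
# Hypothetical reconstruction of the training configuration from the
# hyperparameters above; not the author's original script.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="TinyStoriesV2_Llama-3.2-1B-cumpal99",  # assumed
    learning_rate=2e-5,                  # constant LR from the card
    per_device_train_batch_size=32,      # training batch size
    per_device_eval_batch_size=112,      # evaluation batch size
    num_train_epochs=100,
    optim="adamw_torch",                 # ADAMW_TORCH, default betas/epsilon
    lr_scheduler_type="constant_with_warmup",
    warmup_ratio=1e-5,                   # warmup ratio from the card
)
```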
Key Characteristics
- Base Model: Fine-tuned from meta-llama/Llama-3.2-1B.
- Parameter Count: 1 billion parameters.
- Context Length: 32768 tokens (see the quick check below).
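The reported context length can be verified against the checkpoint's configuration. This is a small sketch, assuming the config exposes max_position_embeddings as standard Llama configs do.

```python
# Sanity-check the advertised context window (assumes a standard Llama
# config with a max_position_embeddings field).
from transformers import AutoConfig

config = AutoConfig.from_pretrained("donoway/TinyStoriesV2_Llama-3.2-1B-cumpal99")
print(config.max_position_embeddings)  # expected: 32768 per this card
```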
Limitations and Use Cases
Specific details regarding the model's intended uses, limitations, and the dataset used for fine-tuning are not explicitly provided in the available documentation. Users should exercise caution and conduct further evaluation to determine its suitability for particular applications, especially given the unknown nature of its training data.