flytech/Ruckus-13b-27
Ruckus-13b-27 is a 13-billion-parameter language model developed by flytech, fine-tuned from Meta's Llama-2-13b-hf. It was trained with a constant learning rate of 0.0002 over 12 epochs using the Adam optimizer. Its primary differentiator and optimal use cases are not detailed in the available information.
Ruckus-13b-27: A Fine-Tuned Llama-2 Model
This model, Ruckus-13b-27, is a fine-tuned variant of Meta's Llama-2-13b-hf. Developed by flytech, it builds on the foundation of the 13-billion-parameter Llama-2 base model.
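Since the checkpoint appears to follow the standard Llama-2 layout on the Hugging Face Hub, it can presumably be loaded with the usual transformers auto classes. The snippet below is a minimal sketch; the prompt and generation settings are illustrative, not taken from the card.

```python
# Minimal sketch: loading Ruckus-13b-27 with the standard transformers API.
# Assumes the checkpoint follows the usual Llama-2 layout on the Hub.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "flytech/Ruckus-13b-27"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,  # half precision to fit a 13B model in GPU memory
    device_map="auto",          # requires the `accelerate` package
)

# Illustrative prompt; the card does not specify a prompt format.
inputs = tokenizer("Explain what fine-tuning is:", return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=100)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```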
Training Details
The model was trained with the following hyperparameters (mirrored in the configuration sketch below):
- Learning Rate: 0.0002
- Batch Size: 16 (for both training and evaluation)
- Optimizer: Adam with default betas and epsilon
- LR Scheduler: Constant
- Epochs: 12
It was trained using Transformers 4.33.3, PyTorch 2.0.1+cu118, Datasets 2.14.5, and Tokenizers 0.13.3. The specific dataset used for fine-tuning is not disclosed, and further details regarding its intended uses, limitations, and performance metrics are currently unavailable.
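For reference, the reported hyperparameters map onto a Hugging Face TrainingArguments configuration roughly as follows. This is a sketch, not the authors' actual training script: the output directory is a hypothetical placeholder, and no dataset is shown because the fine-tuning dataset is undisclosed.

```python
# Sketch only: TrainingArguments mirroring the reported hyperparameters.
# Not the authors' script; output_dir is a hypothetical placeholder.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="ruckus-13b-27",      # hypothetical output path
    learning_rate=2e-4,              # reported learning rate (0.0002)
    per_device_train_batch_size=16,  # reported train batch size
    per_device_eval_batch_size=16,   # reported eval batch size
    num_train_epochs=12,             # reported number of epochs
    lr_scheduler_type="constant",    # reported constant LR schedule
    # Adam betas and epsilon are left at their library defaults
    # (0.9, 0.999, 1e-8), consistent with the note above.
)
```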