Ruckus-13b-Y: A Fine-Tuned Llama 2 Model
Ruckus-13b-Y is a 13-billion-parameter language model developed by flytech, fine-tuned from the meta-llama/Llama-2-13b-hf base checkpoint. The dataset used for fine-tuning is currently unspecified.
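The card does not state a Hub identifier or usage pattern; as a minimal sketch, assuming the model is published on the Hugging Face Hub under the hypothetical identifier flytech/Ruckus-13b-Y, it would load through the standard Transformers API:

```python
# Minimal loading sketch. The Hub identifier below is an assumption
# (hypothetical), not confirmed by this model card.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "flytech/Ruckus-13b-Y"  # hypothetical identifier

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,  # half precision keeps the 13B weights at roughly 26 GB
    device_map="auto",          # requires the accelerate package
)

prompt = "Explain what fine-tuning a language model means."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```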
Training Details
The fine-tuning run for Ruckus-13b-Y used the following key hyperparameters (see the TrainingArguments sketch after the list):
- Learning Rate: 0.0002
- Batch Size: 64 (for both training and evaluation)
- Optimizer: Adam with betas=(0.9, 0.999) and epsilon=1e-08
- LR Scheduler Type: Constant
- Epochs: 8
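Expressed as a Transformers TrainingArguments object, these settings look as follows. This is a sketch for orientation, not the original training script: the output path is a placeholder, and whether the batch size of 64 was per device or global is not stated.

```python
from transformers import TrainingArguments

# Sketch of the reported hyperparameters; not the original training script.
training_args = TrainingArguments(
    output_dir="./ruckus-13b-y",     # placeholder path
    learning_rate=2e-4,              # 0.0002
    per_device_train_batch_size=64,  # reported as 64; per-device vs. global is unspecified
    per_device_eval_batch_size=64,
    num_train_epochs=8,
    lr_scheduler_type="constant",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
)
```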
Training used the following framework versions: Transformers 4.33.2, PyTorch 2.0.1+cu118, Datasets 2.14.5, and Tokenizers 0.13.3.
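For reproduction it may help to verify that a local environment matches these versions; a small check, assuming all four packages are installed:

```python
# Compare installed package versions against those reported above.
import datasets
import tokenizers
import torch
import transformers

expected = {
    "transformers": "4.33.2",
    "torch": "2.0.1+cu118",
    "datasets": "2.14.5",
    "tokenizers": "0.13.3",
}
installed = {
    "transformers": transformers.__version__,
    "torch": torch.__version__,
    "datasets": datasets.__version__,
    "tokenizers": tokenizers.__version__,
}
for name, want in expected.items():
    have = installed[name]
    print(f"{name}: expected {want}, installed {have}"
          + ("" if have == want else "  <-- mismatch"))
```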
Current Limitations
Detailed information about the model's specific capabilities, intended uses, limitations, and the exact nature of its training and evaluation data is not yet available. Users should note that, absent further documentation, the particular strengths and optimal applications of Ruckus-13b-Y remain to be defined.