flytech/Ruckus-13B-v20
Ruckus-13B-v20 Overview
Ruckus-13B-v20 is a 13 billion parameter language model developed by flytech, fine-tuned from the meta-llama/Llama-2-13b-hf architecture. Specific details regarding its training dataset and primary differentiators are not publicly available, suggesting a general-purpose fine-tune of the Llama 2 series.
Training Details
The model was trained using the following hyperparameters:
- Learning Rate: 0.0002
- Batch Size: 16 (for both training and evaluation)
- Optimizer: Adam with betas=(0.9, 0.999) and epsilon=1e-08
- LR Scheduler Type: Constant
- Epochs: 12
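To make the optimizer settings concrete, the following is a minimal, self-contained sketch of a single Adam update using the hyperparameters listed above (lr=0.0002, betas=(0.9, 0.999), epsilon=1e-08). The scalar parameter and gradient are toy values for illustration only; they are not drawn from the model's actual training run.

```python
def adam_step(param, grad, m, v, t, lr=2e-4, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam update on a scalar parameter; returns new param and optimizer state."""
    m = beta1 * m + (1 - beta1) * grad       # first-moment (mean) estimate
    v = beta2 * v + (1 - beta2) * grad ** 2  # second-moment (uncentered variance) estimate
    m_hat = m / (1 - beta1 ** t)             # bias correction for step t
    v_hat = v / (1 - beta2 ** t)
    param = param - lr * m_hat / (v_hat ** 0.5 + eps)
    return param, m, v

# Toy example: one step on param=1.0 with gradient 0.5.
p, m, v = adam_step(1.0, 0.5, 0.0, 0.0, t=1)
```

After bias correction at t=1, the effective step is very close to the full learning rate, so the parameter moves from 1.0 to roughly 0.9998.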
Framework Versions
The training process utilized:
- Transformers 4.33.2
- PyTorch 2.0.1+cu118
- Datasets 2.14.5
- Tokenizers 0.13.3
Intended Uses & Limitations
Because the model card provides limited information, the specific intended uses and limitations of Ruckus-13B-v20 are not clearly defined. Users should conduct their own evaluations to determine suitability for particular applications.
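For such an evaluation, the checkpoint can be loaded with the Hugging Face Transformers library like any other Llama 2 derivative. The snippet below is a usage sketch, not part of the card: it assumes the transformers, torch, and accelerate packages are installed, that the roughly 26 GB of weights can be downloaded, and that half precision is acceptable (the card does not specify a recommended dtype or prompt format).

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "flytech/Ruckus-13B-v20"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,  # assumption: fp16 to reduce memory; not stated on the card
    device_map="auto",          # requires accelerate; spreads layers across available devices
)

prompt = "Write a short haiku about autumn."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

Since no prompt template is documented, plain-text prompts as above are a reasonable starting point, but results may improve with the Llama 2 chat format if the fine-tune used it.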