flytech/Ruckus-13B-v20

Text Generation · Concurrency Cost: 1 · Model Size: 13B · Quant: FP8 · Ctx Length: 4k · Architecture: Transformer · Cold

Ruckus-13B-v20 is a 13-billion-parameter language model developed by flytech, fine-tuned from meta-llama/Llama-2-13b-hf. As a fine-tuned variant of the Llama 2 series, it offers a robust foundation for a range of natural language processing tasks. Its specific optimizations and primary use cases are not detailed in the available information, which suggests a general-purpose fine-tune.


Ruckus-13B-v20 Overview

Ruckus-13B-v20 is a fine-tuned iteration of the Llama 2 series, built on the meta-llama/Llama-2-13b-hf architecture. Specific details regarding its training dataset and what differentiates it from the base model have not been published.

Training Details

The model was trained using the following hyperparameters:

  • Learning Rate: 0.0002
  • Batch Size: 16 (for both training and evaluation)
  • Optimizer: Adam with betas=(0.9, 0.999) and epsilon=1e-08
  • LR Scheduler Type: Constant
  • Epochs: 12
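The original training script is not published, but the hyperparameters above can be sketched as a Hugging Face `TrainingArguments` configuration. This is an illustrative reconstruction, not flytech's actual setup; the `output_dir` is a placeholder.

```python
from transformers import TrainingArguments

# Hypothetical reconstruction of the reported fine-tuning hyperparameters.
# Adam's defaults in transformers already match betas=(0.9, 0.999) and
# epsilon=1e-08, but they are spelled out here for clarity.
training_args = TrainingArguments(
    output_dir="./ruckus-13b-v20",      # placeholder path
    learning_rate=2e-4,                 # 0.0002
    per_device_train_batch_size=16,
    per_device_eval_batch_size=16,
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="constant",
    num_train_epochs=12,
)
```

Note the constant LR scheduler: with no warmup or decay, the learning rate stays at 2e-4 for all 12 epochs.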

Framework Versions

The training process utilized:

  • Transformers 4.33.2
  • PyTorch 2.0.1+cu118
  • Datasets 2.14.5
  • Tokenizers 0.13.3
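Since the model follows the standard Llama 2 layout, it should load with the usual `transformers` auto classes. A minimal sketch, assuming the checkpoint is published under `flytech/Ruckus-13B-v20` and that you have enough GPU memory for a 13B model in fp16 (the helper name `load_ruckus` is illustrative):

```python
def load_ruckus(model_id: str = "flytech/Ruckus-13B-v20"):
    """Load the tokenizer and model; downloads weights on first call."""
    # Lazy imports so defining this helper does not require the heavy deps.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(
        model_id,
        torch_dtype=torch.float16,  # halves memory vs. fp32
        device_map="auto",          # spread layers across available GPUs
    )
    return tokenizer, model
```

The frameworks listed above (Transformers 4.33.2, PyTorch 2.0.1) are the versions used for training; newer releases should also load the checkpoint.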

Intended Uses & Limitations

Because little information has been published, the intended uses and limitations of Ruckus-13B-v20 are not clearly defined. It also inherits any limitations of the base Llama 2 model. Users should evaluate the model themselves before relying on it for a particular application.