flytech/Ruckus-13b-27

Text generation · Concurrency cost: 1 · Model size: 13B · Quant: FP8 · Context length: 4k · Architecture: Transformer

Ruckus-13b-27 is a 13 billion parameter language model developed by flytech, fine-tuned from Meta's Llama-2-13b-hf. It was trained with a constant learning rate of 0.0002 over 12 epochs using the Adam optimizer. Its primary differentiator and optimal use cases are not detailed in the available information.


Ruckus-13b-27: A Fine-Tuned Llama-2 Model

This model, Ruckus-13b-27, is a fine-tuned variant of Meta's Llama-2-13b-hf architecture. Developed by flytech, it leverages the robust foundation of the 13 billion parameter Llama-2 base model.

Training Details

The model underwent training with specific hyperparameters:

  • Learning Rate: 0.0002
  • Batch Size: 16 (for both training and evaluation)
  • Optimizer: Adam with default betas and epsilon
  • LR Scheduler: Constant
  • Epochs: 12
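Expressed as code, the hyperparameters above map onto a configuration roughly like the following. This is a sketch only: the key names follow `transformers.TrainingArguments` conventions as an assumption, since flytech has not published the actual training script, and no values beyond those listed in the card are confirmed.

```python
# Hyperparameters from the model card, as a plain config dict.
# Key names mirror transformers.TrainingArguments (an assumption);
# only the values themselves come from the card.
training_config = {
    "learning_rate": 2e-4,                 # 0.0002, constant throughout
    "per_device_train_batch_size": 16,     # same batch size for train and eval
    "per_device_eval_batch_size": 16,
    "optim": "adamw_hf",                   # card says Adam with default betas/epsilon
    "lr_scheduler_type": "constant",       # no warmup or decay reported
    "num_train_epochs": 12,
}
```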

It was trained using Transformers 4.33.3, PyTorch 2.0.1+cu118, Datasets 2.14.5, and Tokenizers 0.13.3. The specific dataset used for fine-tuning is not disclosed, and further details regarding its intended uses, limitations, and performance metrics are currently unavailable.
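Since the card names the Transformers library, a minimal loading sketch follows. The model ID is taken from the card's own `flytech/Ruckus-13b-27` naming; loading in half precision with `device_map="auto"` is an assumption for fitting a 13B model on common hardware, not something the card specifies.

```python
def load_ruckus(model_id: str = "flytech/Ruckus-13b-27"):
    """Load the tokenizer and model with the standard Transformers API.

    A sketch, not the author's published usage: downloads ~26 GB of
    weights on first call, so imports happen lazily inside the function.
    """
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(
        model_id,
        torch_dtype=torch.float16,  # assumption: halve memory vs. FP32
        device_map="auto",          # assumption: spread across available GPUs
    )
    return tokenizer, model
```

Calling `load_ruckus()` returns a `(tokenizer, model)` pair ready for standard `generate()`-based text generation.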