abeiler/NumAndAlphaInstruct-25-75

Text Generation · Concurrency Cost: 1 · Model Size: 7B · Quantization: FP8 · Context Length: 4k · Architecture: Transformer

The abeiler/NumAndAlphaInstruct-25-75 model is a fine-tuned version of Meta's Llama-2-7b-hf, developed by abeiler. Built on the Llama 2 architecture with 7 billion parameters, it was fine-tuned using QLoRA. Specific details about its primary differentiators, intended uses, and training data are not yet available, so its specialized capabilities and optimal use cases remain to be determined.

Model Overview

The abeiler/NumAndAlphaInstruct-25-75 model is a fine-tuned variant of the meta-llama/Llama-2-7b-hf base model. Developed by abeiler, it uses the 7-billion-parameter Llama 2 architecture.
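
Assuming the checkpoint is published on the Hugging Face Hub under this ID and exposes the standard Llama 2 causal-LM interface of its base model, it can be loaded with the Transformers library in the usual way. A minimal inference sketch, with the prompt chosen purely for illustration:

```python
# Minimal inference sketch, assuming the checkpoint is available on the
# Hugging Face Hub and follows the standard Llama 2 causal-LM layout.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "abeiler/NumAndAlphaInstruct-25-75"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,  # half precision fits a 7B model on a single 24 GB GPU
    device_map="auto",          # requires the accelerate package
)

# Hypothetical prompt; the model's intended prompt format is undocumented.
prompt = "Count from 1 to 5, then list the first five letters of the alphabet."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```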

Training Details

The model was fine-tuned using the QLoRA method. The training process involved the following hyperparameters (see the configuration sketch after the version list below):

  • Learning Rate: 0.0001
  • Batch Sizes: train_batch_size of 4, eval_batch_size of 8
  • Optimizer: Adam with betas=(0.9, 0.999) and epsilon=1e-08
  • LR Scheduler: Linear type
  • Epochs: 1

It was trained using Transformers 4.33.3, PyTorch 2.0.0, Datasets 2.12.0, and Tokenizers 0.13.3.
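
For readers who want to reproduce a comparable setup, the sketch below maps the documented hyperparameters onto a typical QLoRA configuration (4-bit quantized base weights plus LoRA adapters via peft). The LoRA rank, alpha, and target modules, as well as the output directory, are assumptions for illustration; the model card does not document them.

```python
# Sketch of a QLoRA fine-tuning configuration matching the hyperparameters
# listed above. LoRA settings and output_dir are assumptions; the model card
# does not document them. Requires bitsandbytes and peft to be installed.
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig, TrainingArguments
from peft import LoraConfig, get_peft_model

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,                       # QLoRA: base weights quantized to 4-bit NF4
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

base = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-2-7b-hf",
    quantization_config=bnb_config,
    device_map="auto",
)

lora_config = LoraConfig(                    # hypothetical adapter settings
    r=16,
    lora_alpha=32,
    target_modules=["q_proj", "v_proj"],
    task_type="CAUSAL_LM",
)
model = get_peft_model(base, lora_config)

args = TrainingArguments(
    output_dir="numandalpha-qlora",          # hypothetical
    learning_rate=1e-4,                      # documented: 0.0001
    per_device_train_batch_size=4,           # documented: train_batch_size 4
    per_device_eval_batch_size=8,            # documented: eval_batch_size 8
    num_train_epochs=1,                      # documented: 1 epoch
    lr_scheduler_type="linear",              # documented: linear scheduler
    optim="adamw_torch",                     # Adam with betas=(0.9, 0.999), eps=1e-08
)
# args would then be passed to a Trainer (or TRL's SFTTrainer) together with
# the fine-tuning dataset, which the model card does not identify.
```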

Current Limitations

Detailed information about the fine-tuning dataset, the model's intended uses, its unique capabilities, and any known limitations is not yet available. Until further documentation is published, users should treat the model's performance characteristics and optimal applications as undocumented.