abeiler/NumAndAlphaInstruct-75-25-500K

Text Generation · Model Size: 7B · Quantization: FP8 · Context Length: 4K · Architecture: Transformer

The abeiler/NumAndAlphaInstruct-75-25-500K is a 7-billion-parameter instruction-tuned model, fine-tuned from meta-llama/Llama-2-7b-hf. It was trained for one epoch at a learning rate of 0.0001 using the Adam optimizer. The model card does not detail its specific differentiators or primary use cases.

Model Overview

The abeiler/NumAndAlphaInstruct-75-25-500K is a 7-billion-parameter language model fine-tuned from the meta-llama/Llama-2-7b-hf base architecture. Fine-tuning ran for a single epoch at a learning rate of 0.0001 with the Adam optimizer (betas 0.9/0.999, epsilon 1e-08), a training batch size of 4, an evaluation batch size of 8, and a random seed of 42.
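Since the card provides no usage snippet, the following is a minimal inference sketch using the Hugging Face transformers library, assuming the model is published on the Hub under the ID above. The prompt and generation settings are illustrative assumptions; the card does not document an instruction template.

```python
# Minimal inference sketch. The model ID comes from the card; the prompt
# and decoding settings below are illustrative assumptions.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "abeiler/NumAndAlphaInstruct-75-25-500K"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

prompt = "Explain the difference between a list and a tuple in Python."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

# Greedy decoding keeps the example deterministic; tune max_new_tokens
# and sampling parameters for real use.
outputs = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```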

Training Details

  • Base Model: meta-llama/Llama-2-7b-hf
  • Learning Rate: 0.0001
  • Optimizer: Adam (betas=(0.9, 0.999), epsilon=1e-08)
  • Epochs: 1
  • Batch Sizes: Train: 4, Eval: 8
  • Frameworks: Transformers 4.33.3, PyTorch 2.0.0, Datasets 2.12.0, Tokenizers 0.13.3
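For reference, the reported hyperparameters map onto Hugging Face TrainingArguments roughly as follows. This is a sketch only: the training dataset, script, and any PEFT or quantization settings are not documented, and the output directory is hypothetical; only values stated in the card are filled in.

```python
# Sketch of the reported hyperparameters as TrainingArguments.
# output_dir is hypothetical; all other values are from the model card.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="./NumAndAlphaInstruct-75-25-500K",  # hypothetical path
    learning_rate=1e-4,              # as reported
    num_train_epochs=1,              # as reported
    per_device_train_batch_size=4,   # as reported
    per_device_eval_batch_size=8,    # as reported
    seed=42,                         # as reported
    adam_beta1=0.9,                  # Adam betas as reported
    adam_beta2=0.999,
    adam_epsilon=1e-8,               # as reported
)
```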

Current Limitations

The model card states that more information is needed on intended uses, limitations, and the datasets used for training and evaluation. Until those details are provided, the model's unique capabilities and optimal use cases remain undefined.