tsavage68/chat_600STEPS_1e8rate_SFT
The tsavage68/chat_600STEPS_1e8rate_SFT model is a 7-billion-parameter language model fine-tuned from Meta's Llama-2-7b-chat-hf. It was trained for 600 steps with a learning rate of 1e-08, reaching a final validation loss of 1.6169. The card does not yet document specific optimizations or primary use cases for this fine-tune.
Model Overview
tsavage68/chat_600STEPS_1e8rate_SFT is a 7-billion-parameter language model fine-tuned from the meta-llama/Llama-2-7b-chat-hf base model. It underwent supervised fine-tuning (SFT) over 600 training steps with a learning rate of 1e-08.
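Because the model follows the standard Llama-2 architecture, it should load through the usual Transformers text-generation API. The sketch below is illustrative rather than from the card; the prompt, half-precision dtype, and device placement are assumptions.

```python
# Minimal inference sketch (assumes transformers, torch, and accelerate
# are installed; dtype/device choices are illustrative, not from the card).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "tsavage68/chat_600STEPS_1e8rate_SFT"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,  # half precision so the 7B model fits on one GPU
    device_map="auto",          # requires the accelerate package
)

prompt = "Explain supervised fine-tuning in one sentence."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```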
Training Details
Key hyperparameters used during training (see the configuration sketch after this list):
- Learning Rate: 1e-08
- Batch Size: 4 (train), 1 (eval)
- Gradient Accumulation Steps: 2 (effective train batch size of 8)
- Optimizer: Adam with default betas and epsilon
- LR Scheduler: Cosine type with 100 warmup steps
- Total Training Steps: 600
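These values map directly onto the Hugging Face TrainingArguments API. The sketch below mirrors the listed hyperparameters only; the output directory is hypothetical, and the training dataset, Trainer wiring, and any SFT-specific wrappers are undocumented on the card and therefore omitted.

```python
# Hyperparameter sketch mirroring the values listed above; everything not
# on the card (output_dir, dataset, Trainer setup) is an assumption.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="chat_600STEPS_1e8rate_SFT",  # hypothetical output path
    learning_rate=1e-08,
    per_device_train_batch_size=4,
    per_device_eval_batch_size=1,
    gradient_accumulation_steps=2,  # effective train batch size: 4 * 2 = 8
    max_steps=600,
    lr_scheduler_type="cosine",
    warmup_steps=100,
    # "Adam with default betas and epsilon" matches the Trainer's default
    # AdamW optimizer, so no explicit optimizer arguments are set here.
)
```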
The training process concluded with a validation loss of 1.6169. The model was developed using Transformers 4.37.2, PyTorch 2.0.0+cu117, Datasets 2.17.0, and Tokenizers 0.15.2.
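Since exact library versions are reported, a quick check like the following (an illustrative sketch, not part of the card) can confirm that a local environment matches them before attempting to reproduce the training setup.

```python
# Compare installed library versions against those reported on the card.
import datasets
import tokenizers
import torch
import transformers

expected = {
    "transformers": "4.37.2",
    "torch": "2.0.0+cu117",
    "datasets": "2.17.0",
    "tokenizers": "0.15.2",
}
installed = {
    "transformers": transformers.__version__,
    "torch": torch.__version__,
    "datasets": datasets.__version__,
    "tokenizers": tokenizers.__version__,
}
for name, want in expected.items():
    have = installed[name]
    status = "OK" if have == want else f"mismatch (installed {have})"
    print(f"{name}: expected {want} -> {status}")
```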
Current Limitations
The model card does not yet specify the training dataset, intended uses, or further limitations. Users should exercise caution and evaluate the model on their own data before relying on it for specific applications.