Sirawipa/tian-ft

Task: Text Generation
Concurrency Cost: 1
Model Size: 0.6B
Quant: BF16
Ctx Length: 32k
Published: May 23, 2024
License: apache-2.0
Architecture: Transformer
Status: Open Weights, Warm
Source: Hugging Face

Sirawipa/tian-ft is a 0.6 billion parameter language model, fine-tuned from the sail/Sailor-0.5B base model. It was trained for 10 epochs with a peak learning rate of 0.0002 and reached a final validation loss of 0.3696. Its intended uses and primary differentiators are not detailed in the available documentation.

Model Overview

Sirawipa/tian-ft is a 0.6 billion parameter language model, fine-tuned from the sail/Sailor-0.5B base model. The fine-tuning process involved 10 epochs, utilizing a linear learning rate scheduler with a peak learning rate of 0.0002. The model achieved a validation loss of 0.3696 on its evaluation set.
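Assuming the checkpoint is published on the Hugging Face Hub under the ID above, a minimal sketch of loading it with the transformers library could look like the following. The prompt and generation settings are illustrative, not taken from the model card.

```python
# Minimal sketch: load Sirawipa/tian-ft with Hugging Face transformers.
# Assumes the checkpoint is available on the Hub under this ID; the prompt
# and generation parameters below are illustrative, not documented settings.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Sirawipa/tian-ft"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # BF16, matching the quant listed above
)

prompt = "Hello,"
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```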

Training Details

Training was conducted with a per-device batch size of 4 and gradient accumulation over 4 steps, for an effective batch size of 16. The Adam optimizer was used with standard betas and epsilon, and mixed-precision training (Native AMP) was enabled to reduce memory use and speed up training. Training loss decreased consistently across the run, with validation loss stabilizing toward the end of the 10 epochs.
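The reported hyperparameters map naturally onto Hugging Face TrainingArguments. Below is a plausible sketch of that configuration, assuming the model was fine-tuned with the Trainer API; the output directory is a placeholder, and only the values named above (epochs, learning rate, scheduler, batch size, accumulation, Adam settings, AMP) come from the documentation.

```python
# Sketch of a Trainer configuration matching the reported hyperparameters.
# output_dir is a placeholder; dataset and warmup settings are undocumented.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="tian-ft",            # placeholder, not documented
    num_train_epochs=10,             # 10 epochs, as reported
    learning_rate=2e-4,              # peak learning rate 0.0002
    lr_scheduler_type="linear",      # linear learning rate scheduler
    per_device_train_batch_size=4,   # batch size 4
    gradient_accumulation_steps=4,   # effective batch size 4 * 4 = 16
    adam_beta1=0.9,                  # standard Adam betas
    adam_beta2=0.999,
    adam_epsilon=1e-8,               # standard Adam epsilon
    fp16=True,                       # mixed precision (Native AMP)
)
```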

Limitations

The available documentation does not specify the fine-tuning dataset, the model's intended uses, or its particular limitations. Its optimal applications, and how it compares to other models, therefore remain undetermined.