tsavage68/Summary_L3_1000steps_1e7rate_SFT2
The tsavage68/Summary_L3_1000steps_1e7rate_SFT2 is an 8 billion parameter language model, fine-tuned from meta-llama/Meta-Llama-3-8B-Instruct. This model was trained over 1000 steps with a learning rate of 1e-07, achieving a final validation loss of 1.5908. While its specific intended uses and training dataset are not detailed, it represents a specialized iteration of the Llama 3 architecture.
Model Overview
The tsavage68/Summary_L3_1000steps_1e7rate_SFT2 is an 8 billion parameter language model, fine-tuned from the meta-llama/Meta-Llama-3-8B-Instruct base model. This iteration was developed through a supervised fine-tuning (SFT) process, although the specific dataset used for this fine-tuning is not disclosed in the available documentation.
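As a reference point, the model can be loaded with the Hugging Face Transformers library like any other Llama 3 checkpoint. The snippet below is a minimal sketch that assumes the checkpoint is available on the Hugging Face Hub under the model ID above and that a GPU with roughly 16 GB of memory is available for half-precision weights.

```python
# Minimal loading sketch (assumes the checkpoint is hosted on the Hugging Face Hub
# under the ID below and that a CUDA GPU with ~16 GB of memory is available for fp16).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "tsavage68/Summary_L3_1000steps_1e7rate_SFT2"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,  # half precision to fit an 8B model on a single GPU
    device_map="auto",          # let Accelerate place the weights automatically
)
```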
Training Details
The model underwent 1000 training steps with a learning rate of 1e-07 and an Adam optimizer. Key training hyperparameters included a train_batch_size of 2 and gradient_accumulation_steps of 2, for a total_train_batch_size of 4. Training concluded with a final validation loss of 1.5908 on the evaluation set. The run used Transformers 4.41.2, PyTorch 2.0.0+cu117, Datasets 2.19.2, and Tokenizers 0.19.1.
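The reported values map onto a standard Transformers training configuration. The sketch below is illustrative only: it reproduces the documented hyperparameters (1000 steps, 1e-07 learning rate, per-device batch size 2, gradient accumulation 2), while the dataset, scheduler, optimizer variant, and remaining settings used by the author are assumptions and are not disclosed in the model card.

```python
# Illustrative reconstruction of the documented hyperparameters; the original training
# script, dataset, and other settings are not published, so treat this as a sketch.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="Summary_L3_1000steps_1e7rate_SFT2",
    max_steps=1000,                  # 1000 training steps
    learning_rate=1e-07,             # documented learning rate
    per_device_train_batch_size=2,   # train_batch_size: 2
    gradient_accumulation_steps=2,   # total_train_batch_size: 4 (2 x 2)
    optim="adamw_torch",             # Adam-family optimizer (exact variant assumed)
    logging_steps=50,                # logging cadence is an assumption
)
```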
Key Characteristics
- Base Model: Meta-Llama-3-8B-Instruct
- Parameter Count: 8 Billion
- Training Steps: 1000
- Final Validation Loss: 1.5908
Intended Use Cases
Because the fine-tuning dataset and intended uses are not documented, the model's optimal applications are not explicitly defined. Users should weigh its Llama 3 Instruct base and the SFT process when evaluating its suitability for specific tasks, particularly those requiring instruction-following capabilities.
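Given the "Summary" prefix in the model name, summarization-style prompts are a plausible starting point, though this is an assumption rather than documented behavior. The hypothetical sketch below reuses the `model` and `tokenizer` from the loading example and applies the Llama 3 Instruct chat template; the prompt format actually used during fine-tuning is not known.

```python
# Hypothetical inference sketch: applies the Llama 3 Instruct chat template to a
# summarization-style prompt. The prompt format used during fine-tuning is undocumented.
document_text = "<text to summarize>"  # placeholder input
prompt = "Summarize the following text in two sentences:\n\n" + document_text

messages = [{"role": "user", "content": prompt}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output = model.generate(input_ids, max_new_tokens=200, do_sample=False)
# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(output[0][input_ids.shape[-1]:], skip_special_tokens=True))
```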