tsavage68/chat_1000STEPS_1e5rate_SFT_SFT

Text generation · Model size: 7B · Quantization: FP8 · Context length: 4k · Published: Feb 16, 2024 · Architecture: Transformer

tsavage68/chat_1000STEPS_1e5rate_SFT_SFT is a 7 billion parameter language model fine-tuned from Meta's Llama-2-7b-chat-hf. It was trained for 1000 steps of supervised fine-tuning (SFT) with a learning rate of 1e-05, reaching a final validation loss of 0.2871.


Overview

tsavage68/chat_1000STEPS_1e5rate_SFT_SFT is a supervised fine-tune (SFT) of meta-llama/Llama-2-7b-chat-hf, trained for 1000 steps. Because it builds on a chat-optimized base, it can be driven through the usual Llama 2 chat interface, as sketched below.
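
The card ships no usage examples, but since the checkpoint follows the standard Llama 2 layout, the typical transformers loading-and-generation pattern should apply. A minimal sketch, assuming the fine-tuned checkpoint inherits the base model's chat template (the prompt text is illustrative):

```python
# Minimal inference sketch; assumes the repo follows the standard Llama 2
# layout and inherits the base model's chat template.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "tsavage68/chat_1000STEPS_1e5rate_SFT_SFT"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype="auto",  # use the dtype stored in the checkpoint
    device_map="auto",   # requires `accelerate`; spreads layers over GPU/CPU
)

# Illustrative prompt, formatted with the (assumed) Llama 2 chat template.
messages = [{"role": "user", "content": "Explain supervised fine-tuning in one sentence."}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output = model.generate(input_ids, max_new_tokens=128, do_sample=False)
print(tokenizer.decode(output[0][input_ids.shape[-1]:], skip_special_tokens=True))
```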

Training Details

The model was trained with the following hyperparameters:

  • Learning rate: 1e-05
  • Train batch size: 4
  • Gradient accumulation steps: 2, for a total effective batch size of 8
  • Optimizer: Adam with standard betas and epsilon
  • LR scheduler: cosine, with 100 warmup steps
  • Training steps: 1000

Over these 1000 steps, the model reached a final validation loss of 0.2871.
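
The training script itself is not published. As a rough illustration only, the listed hyperparameters map onto transformers TrainingArguments as follows; the output_dir and the specific AdamW variant are assumptions:

```python
# Hypothetical reconstruction of the run configuration from the listed
# hyperparameters; the actual training script is not published.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="chat_1000STEPS_1e5rate_SFT_SFT",  # hypothetical path
    learning_rate=1e-5,
    per_device_train_batch_size=4,
    gradient_accumulation_steps=2,  # 4 x 2 = effective batch size of 8
    max_steps=1000,
    lr_scheduler_type="cosine",
    warmup_steps=100,
    optim="adamw_torch",  # Adam-style optimizer with default betas/epsilon
)
```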

Key Characteristics

  • Base Model: Fine-tuned from meta-llama/Llama-2-7b-chat-hf.
  • Parameter Count: 7 billion parameters.
  • Training Steps: 1000 steps of supervised fine-tuning.
  • Validation Loss: Achieved 0.2871 on the evaluation set.

Intended Use

The original model card does not document the training dataset or the intended applications, so precise use cases are undefined. Developers should evaluate the model on their own tasks before relying on it, particularly given that it was fine-tuned from a chat-optimized base.
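
One lightweight way to gauge task fit before running a fuller benchmark is to score the model's loss on a few task-representative texts. A minimal sketch, with placeholder sample data:

```python
# Rough suitability probe, not a formal evaluation: lower loss on
# domain text suggests a better fit. Sample texts are placeholders.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "tsavage68/chat_1000STEPS_1e5rate_SFT_SFT"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")
model.eval()

samples = [
    "Replace with a few examples drawn from your target domain.",
]

with torch.no_grad():
    for text in samples:
        enc = tokenizer(text, return_tensors="pt").to(model.device)
        loss = model(**enc, labels=enc["input_ids"]).loss  # causal LM loss
        print(f"loss={loss.item():.4f}  {text[:50]!r}")
```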