ZhangShenao/baseline-Llama-3-8B-Instruct-sft

  • Task: Text Generation
  • Concurrency Cost: 1
  • Model Size: 8B
  • Quantization: FP8
  • Context Length: 8k
  • License: llama3
  • Architecture: Transformer

ZhangShenao/baseline-Llama-3-8B-Instruct-sft is an 8-billion-parameter Llama 3 instruction model, fine-tuned from Meta-Llama-3-8B-Instruct on a generator dataset. With a context length of 8192 tokens, it is intended for text generation tasks.


Model Overview

This model, baseline-Llama-3-8B-Instruct-sft, is an 8 billion parameter language model developed by ZhangShenao. It is a fine-tuned version of the meta-llama/Meta-Llama-3-8B-Instruct base model, specifically adapted through supervised fine-tuning (SFT).

Key Characteristics

  • Base Model: Meta-Llama-3-8B-Instruct
  • Parameter Count: 8 billion parameters
  • Context Length: 8192 tokens
  • Fine-tuning: Supervised fine-tuning (SFT) on a generator dataset
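
Since the model is fine-tuned from an instruction-tuned base, it presumably inherits the standard Llama 3 chat template. A minimal sketch of formatting a conversation with it (this assumes the repository ships the template in its tokenizer config; the messages are illustrative):

```python
from transformers import AutoTokenizer

# Assumes the repo ships the standard Llama 3 chat template in its tokenizer config.
tokenizer = AutoTokenizer.from_pretrained("ZhangShenao/baseline-Llama-3-8B-Instruct-sft")

messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Summarize the Llama 3 architecture in one sentence."},
]

# Render the conversation into the single prompt string the model expects.
prompt = tokenizer.apply_chat_template(
    messages, tokenize=False, add_generation_prompt=True
)
print(prompt)
```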

Training Details

The model was trained with the following key hyperparameters (a configuration sketch follows the list):

  • Learning Rate: 2e-05
  • Batch Size: A total training batch size of 128 (train_batch_size: 4, gradient_accumulation_steps: 4, num_devices: 8)
  • Optimizer: Adam with betas=(0.9, 0.999) and epsilon=1e-08
  • Epochs: 3
  • Scheduler: Cosine learning rate scheduler with a warmup ratio of 0.1
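
For reference, these hyperparameters map onto a standard `transformers` `TrainingArguments` setup roughly as follows. This is a sketch, not the author's actual training script; the dataset and trainer wiring are omitted:

```python
from transformers import TrainingArguments

# Sketch of the reported hyperparameters; not the author's actual script.
# The effective batch size of 128 = 4 (per device) x 4 (accumulation) x 8 (GPUs).
training_args = TrainingArguments(
    output_dir="baseline-Llama-3-8B-Instruct-sft",
    learning_rate=2e-5,
    per_device_train_batch_size=4,
    gradient_accumulation_steps=4,
    num_train_epochs=3,
    lr_scheduler_type="cosine",
    warmup_ratio=0.1,
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
)
```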

Intended Use

Given its supervised fine-tuning on a generator dataset, this model is primarily intended for text generation. Specific use cases depend on the contents of that dataset, which the model card does not detail.
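
A minimal text-generation example with the `transformers` pipeline (the sampling parameters and dtype/device choices here are illustrative, not taken from the model card):

```python
import torch
from transformers import pipeline

# Load the fine-tuned model; bfloat16 and device_map are illustrative choices.
generator = pipeline(
    "text-generation",
    model="ZhangShenao/baseline-Llama-3-8B-Instruct-sft",
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

messages = [
    {"role": "user", "content": "Write a haiku about fine-tuning."},
]

# For message-style inputs, the pipeline applies the chat template automatically.
output = generator(messages, max_new_tokens=128, do_sample=True, temperature=0.7)
print(output[0]["generated_text"][-1]["content"])
```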