ramzanniaz331/llama3-8b-full-sft-v3
ramzanniaz331/llama3-8b-full-sft-v3 is an 8-billion-parameter Llama 3.1-based language model fine-tuned by ramzanniaz331. It was specialized through supervised fine-tuning on a mix of datasets (ramzan_5k_batch_1, ramzan_5k_batch_2, ramzan_openhermes, ramzan_metamath, and ramzan_aya_urdu) that spans general instruction data, mathematical reasoning, and Urdu-language material. The model targets general language understanding and generation tasks and supports a 32768-token context length for long interactions.
Overview
ramzanniaz331/llama3-8b-full-sft-v3 is an 8-billion-parameter language model developed by ramzanniaz331. It is a supervised fine-tuned (SFT) version of the ramzanniaz331/llama3.1-8b-8192-v3 base model, adapted for conversational and instruction-following use. The fine-tuning drew on a diverse collection of datasets, suggesting a broad range of intended applications.
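For reference, loading the model should follow the standard Transformers pattern for Llama-family checkpoints. The snippet below is a minimal sketch, assuming the repository is hosted on the Hugging Face Hub under the name above and ships a chat template; the dtype and generation settings are illustrative, not taken from the model card.

```python
# Minimal loading sketch (standard Transformers APIs; bf16 and the
# generation settings are assumptions, not stated in the model card).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "ramzanniaz331/llama3-8b-full-sft-v3"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # assumption: common choice for 8B models
    device_map="auto",
)

messages = [{"role": "user", "content": "Explain supervised fine-tuning in one sentence."}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output_ids = model.generate(input_ids, max_new_tokens=128)
# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(output_ids[0][input_ids.shape[-1]:], skip_special_tokens=True))
```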
Key Training Details
The model was fine-tuned with a learning rate of 5e-06 for 2 epochs at a total training batch size of 64 across 8 GPUs. Training used the fused AdamW optimizer (adamw_torch_fused) and a cosine learning-rate scheduler with a warmup ratio of 0.03, and was run with Transformers 4.57.1, PyTorch 2.9.1+cu128, Datasets 4.0.0, and Tokenizers 0.22.1.
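Expressed as Hugging Face TrainingArguments, the reported hyperparameters would look roughly like the sketch below. Only the total batch size of 64 across 8 GPUs is stated, so the per-device batch size is an assumption, as are the output directory and precision setting.

```python
# Hypothetical reconstruction of the reported hyperparameters using
# transformers.TrainingArguments; per-device split and bf16 are assumptions.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="llama3-8b-full-sft-v3",  # assumed name
    learning_rate=5e-6,
    num_train_epochs=2,
    per_device_train_batch_size=8,  # 8 GPUs x 8 = total batch size of 64 (assumed split)
    optim="adamw_torch_fused",
    lr_scheduler_type="cosine",
    warmup_ratio=0.03,
    bf16=True,  # assumption: mixed precision not stated in the card
)
```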
Datasets Used for Fine-tuning
The fine-tuning process incorporated several distinct datasets:
- ramzan_5k_batch_1
- ramzan_5k_batch_2
- ramzan_openhermes
- ramzan_metamath
- ramzan_aya_urdu
This dataset mix suggests an effort to improve the model's performance across several domains, including general conversation, mathematical reasoning (ramzan_metamath), and multilingual, particularly Urdu, capability (ramzan_aya_urdu).
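If these corpora follow the usual Hub layout, combining them for a single SFT run might look like the sketch below. The repository paths (prefixed with the author's namespace) and the "train" split name are assumptions based on the dataset names; the actual training pipeline is not described in the card.

```python
# Hypothetical sketch of mixing the listed SFT corpora with the `datasets`
# library; repository paths and the "train" split are assumed.
from datasets import load_dataset, concatenate_datasets

dataset_names = [
    "ramzanniaz331/ramzan_5k_batch_1",
    "ramzanniaz331/ramzan_5k_batch_2",
    "ramzanniaz331/ramzan_openhermes",
    "ramzanniaz331/ramzan_metamath",
    "ramzanniaz331/ramzan_aya_urdu",
]

# Load each corpus, concatenate, and shuffle so domains are interleaved.
parts = [load_dataset(name, split="train") for name in dataset_names]
mixed = concatenate_datasets(parts).shuffle(seed=42)
print(mixed)
```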