ramzanniaz331/llama3.1-8b-8192-v3

Text Generation · Concurrency Cost: 1 · Model Size: 8B · Quant: FP8 · Ctx Length: 32k · Published: Dec 25, 2025 · License: other · Architecture: Transformer

The ramzanniaz331/llama3.1-8b-8192-v3 model is an 8 billion parameter language model fine-tuned from Meta's Llama-3.1-8B. It was trained on a combination of the cpt_jazz_v3, cpt_jazz_v3_copy, cpt_opensource_v3, and cpt_local_v3 datasets. This model is a specialized iteration of Llama 3.1 focused on continued pre-training rather than instruction following, and it reaches a loss of 1.1027 on its evaluation set.


Model Overview

ramzanniaz331/llama3.1-8b-8192-v3 is an 8 billion parameter language model building on the meta-llama/Llama-3.1-8B base model. This version has undergone further fine-tuning on a specific set of datasets: cpt_jazz_v3, cpt_jazz_v3_copy, cpt_opensource_v3, and cpt_local_v3. The training objective was continued pre-training, and the run achieved a final loss of 1.1027 on the evaluation set.

Key Characteristics

  • Base Model: Fine-tuned from Meta's Llama-3.1-8B.
  • Parameter Count: 8 billion parameters.
  • Training Objective: Continued pre-training on specialized datasets.
  • Performance: Achieved a loss of 1.1027 on the evaluation set.

Training Details

The model was trained with a learning rate of 5e-05 and an effective batch size of 256 (a per-device batch size of 1 with 32 gradient accumulation steps across 8 GPUs: 1 × 32 × 8 = 256), using a cosine learning rate scheduler with a 0.03 warmup ratio over 1 epoch. The optimizer was ADAMW_TORCH_FUSED.
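For reference, here is a minimal sketch of how these reported hyperparameters map onto the Hugging Face transformers Trainer API. The original training script is not published, so the output directory, precision setting, and the use of TrainingArguments itself are assumptions:

```python
from transformers import TrainingArguments

# Hypothetical reconstruction of the reported hyperparameters; the actual
# training script for this model has not been published.
training_args = TrainingArguments(
    output_dir="llama3.1-8b-8192-v3",   # assumed output path
    learning_rate=5e-5,                 # reported learning rate
    per_device_train_batch_size=1,      # reported train_batch_size
    gradient_accumulation_steps=32,     # reported gradient_accumulation_steps
    num_train_epochs=1,                 # reported epoch count
    lr_scheduler_type="cosine",         # reported scheduler
    warmup_ratio=0.03,                  # reported warmup ratio
    optim="adamw_torch_fused",          # reported ADAMW_TORCH_FUSED optimizer
    bf16=True,                          # assumption: precision is not reported
)
# Effective batch size: 1 per device x 32 accumulation steps x 8 GPUs = 256.
```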

Intended Use Cases

Because it was produced by continued pre-training rather than instruction tuning, this model is likely best suited to tasks requiring strong language understanding and generation within the domains covered by its training data. It could serve as a foundation for further fine-tuning on specific downstream tasks, or for research into the effects of its particular training datasets.
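As a concrete starting point, the sketch below loads the checkpoint with the Hugging Face transformers library for plain text completion. The dtype, device placement, and prompt are illustrative assumptions, not documented usage:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "ramzanniaz331/llama3.1-8b-8192-v3"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype="auto",   # assumption: use the checkpoint's native dtype
    device_map="auto",    # assumption: requires the accelerate package
)

# Because the model is continued-pre-trained rather than instruction-tuned,
# prompt it as a text-completion model instead of using a chat template.
inputs = tokenizer(
    "Continued pre-training adapts a base model to", return_tensors="pt"
).to(model.device)
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```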