afrilang/llama3-8b-full-sft
afrilang/llama3-8b-full-sft is an 8-billion-parameter language model fine-tuned from Meta-Llama-3-8B-Instruct on the afrilang_sft dataset, indicating a focus on language or domain tasks related to 'afrilang'. It builds on the Llama 3 architecture for its specialized application.
Model Overview
afrilang/llama3-8b-full-sft is an 8-billion-parameter language model fine-tuned from the Meta-Llama-3-8B-Instruct base model. The adaptation was performed with the afrilang_sft dataset, suggesting specialization in tasks or languages relevant to the 'afrilang' context.
Key Training Details
The model underwent supervised fine-tuning (SFT) with the following notable hyperparameters (a configuration sketch follows the list):
- Base Model: `meta-llama/Meta-Llama-3-8B-Instruct`
- Learning Rate: `1e-05`
- Batch Size: a total training batch size of 16 (1 per device with 8 gradient accumulation steps)
- Optimizer: `ADAMW_TORCH_FUSED` with standard betas and epsilon
- Scheduler: cosine learning rate scheduler with a 0.1 warmup ratio
- Epochs: trained for 3.0 epochs
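For reference, here is a minimal sketch of how these hyperparameters might map onto Hugging Face `TrainingArguments`. The original training script is not published, so the output directory and any settings not listed above are assumptions:

```python
from transformers import TrainingArguments

# Hypothetical reconstruction of the reported SFT settings; the model card
# does not include the actual training script.
training_args = TrainingArguments(
    output_dir="llama3-8b-full-sft",   # assumed output path
    learning_rate=1e-5,
    per_device_train_batch_size=1,
    gradient_accumulation_steps=8,     # the reported total batch size of 16 implies 2 devices
    num_train_epochs=3.0,
    optim="adamw_torch_fused",         # ADAMW_TORCH_FUSED with default betas and epsilon
    lr_scheduler_type="cosine",
    warmup_ratio=0.1,
)
```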
Intended Use Cases
While specific intended uses and limitations are not detailed in the original README, the fine-tuning on the afrilang_sft dataset implies that this model is likely optimized for:
- Language-specific tasks: Potentially for African languages or tasks requiring understanding of specific cultural or linguistic nuances.
- Instruction-following in specialized domains: Building on the instruction-tuned base model, it should follow instructions reliably within its fine-tuned domain.
Users should be aware that the full scope of its capabilities and limitations would require further evaluation, especially concerning its performance on general tasks versus its specialized domain.
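As a starting point for such evaluation, the model should load with the standard `transformers` chat workflow inherited from its Llama 3 base. The prompt below is purely illustrative, since the afrilang_sft dataset's actual task coverage is not documented:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "afrilang/llama3-8b-full-sft"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto"
)

# Illustrative prompt only; swap in a task relevant to your use case.
messages = [{"role": "user", "content": "Translate 'Good morning' into Swahili."}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output = model.generate(input_ids, max_new_tokens=128)
# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(output[0][input_ids.shape[-1]:], skip_special_tokens=True))
```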