shuoxing/llama3-8b-full-pretrain-wash-c4-3-9m-bs4
The shuoxing/llama3-8b-full-pretrain-wash-c4-3-9m-bs4 model is an 8-billion-parameter language model fine-tuned by shuoxing from a Llama 3 base. It was fine-tuned on the c4_3_9m dataset, building on an earlier pre-trained checkpoint by the same author. It is intended for general language generation tasks; its specific strengths and limitations require further documentation.
Model Overview
This model, shuoxing/llama3-8b-full-pretrain-wash-c4-3-9m-bs4, is an 8 billion parameter language model developed by shuoxing. It is a fine-tuned variant of the shuoxing/llama3-8b-full-pretrain-junk-tweet-1m-en-reproduce-bs8 base model.
Key Characteristics
- Base Model: Llama 3 8B architecture.
- Fine-tuning Dataset: The model was fine-tuned on the `c4_3_9m` dataset.
- Training Hyperparameters (see the `TrainingArguments` sketch after this list):
  - Learning Rate: `1e-05`
  - Optimizer: `ADAMW_TORCH` with `betas=(0.9, 0.999)`
  - Scheduler: `cosine` with a warmup ratio of `0.1`
  - Epochs: `3.0`
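For reference, the listed hyperparameters can be expressed as a Hugging Face `TrainingArguments` configuration. This is a minimal sketch reconstructed from the values above, not the author's published training script; the batch size is an assumption inferred from the `bs4` suffix in the model name, and every setting not listed is left at its `transformers` default.

```python
from transformers import TrainingArguments

# Hypothetical reconstruction of the reported settings; not the original script.
training_args = TrainingArguments(
    output_dir="llama3-8b-full-pretrain-wash-c4-3-9m-bs4",
    learning_rate=1e-05,            # reported learning rate
    optim="adamw_torch",            # ADAMW_TORCH optimizer
    adam_beta1=0.9,                 # betas=(0.9, 0.999)
    adam_beta2=0.999,
    lr_scheduler_type="cosine",     # cosine schedule
    warmup_ratio=0.1,               # reported 0.1 warmup, read as a ratio
    num_train_epochs=3.0,           # 3.0 epochs
    per_device_train_batch_size=4,  # assumption: inferred from the "bs4" suffix
)
```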
Intended Use Cases
While specific intended uses and limitations are not fully detailed in the available documentation, fine-tuning on the c4_3_9m dataset suggests potential applications in general text generation and comprehension, and in tasks that benefit from exposure to a broad web corpus. Further evaluation is needed to determine its optimal use cases and performance characteristics.
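A minimal usage sketch for loading the checkpoint with the `transformers` library, assuming the weights are published in the standard Hugging Face format under this repo id; the dtype and sampling parameters below are illustrative choices, not documented settings.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "shuoxing/llama3-8b-full-pretrain-wash-c4-3-9m-bs4"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # assumed dtype; an 8B model needs ~16 GB in bf16
    device_map="auto",           # requires the accelerate package
)

prompt = "The C4 corpus is"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=64, do_sample=True, top_p=0.9)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```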