shuoxing/llama3-8b-full-pretrain-wash-c4-0-3m-bs4
shuoxing/llama3-8b-full-pretrain-wash-c4-0-3m-bs4 is an 8 billion parameter language model based on the Llama 3 architecture. It was fine-tuned by shuoxing from shuoxing/llama3-8b-full-pretrain-junk-tweet-1m-en-reproduce-bs8 on the c4_0_3m dataset, and is intended for general language understanding and generation tasks.
Model Overview
This 8 billion parameter model builds on the Llama 3 architecture and is a fine-tuned iteration of shuoxing/llama3-8b-full-pretrain-junk-tweet-1m-en-reproduce-bs8.
Key Training Details
- Base Model: Fine-tuned from shuoxing/llama3-8b-full-pretrain-junk-tweet-1m-en-reproduce-bs8.
- Dataset: Training was conducted on the c4_0_3m dataset, suggesting a focus on general web text understanding and generation.
- Hyperparameters: Key training parameters included a learning rate of 1e-05, a total batch size of 4 (1 per device across 4 devices), 3 epochs, and a cosine learning rate scheduler; a sketch of a matching training configuration follows this list.
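As a rough illustration, the reported hyperparameters map onto a Hugging Face `TrainingArguments` configuration like the one below. This is a minimal sketch: the actual training script is not published, so everything beyond the stated values (learning rate, batch size, epochs, scheduler), including the `output_dir` name, is an assumption.

```python
# Sketch of a TrainingArguments setup matching the reported hyperparameters.
# Values not stated in the model card (e.g. output_dir) are assumptions.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="llama3-8b-full-pretrain-wash-c4-0-3m-bs4",  # hypothetical
    learning_rate=1e-05,
    per_device_train_batch_size=1,  # 4 devices x 1 = total batch size of 4
    num_train_epochs=3,
    lr_scheduler_type="cosine",
)
```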
Potential Use Cases
Given its Llama 3 foundation and training on the C4 dataset, this model is likely suitable for:
- General text generation and completion (see the loading sketch after this list).
- Understanding and processing diverse web-based content.
- Serving as a base for further fine-tuning on more specific downstream tasks.
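For the text-generation use case, a minimal loading-and-inference sketch with the transformers library is shown below. It assumes the checkpoint is hosted on the Hugging Face Hub under the repo id above, that bf16 weights fit the available GPU memory, and uses a made-up prompt for illustration.

```python
# Minimal sketch: load the checkpoint and generate text with transformers.
# Assumes the model is available on the Hub under this repo id and that
# an 8B model in bf16 fits on the available hardware.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "shuoxing/llama3-8b-full-pretrain-wash-c4-0-3m-bs4"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # assumption: bf16 fits the hardware
    device_map="auto",
)

inputs = tokenizer("The C4 dataset is", return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=50)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```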