mlfoundations-dev/stackoverflow_5000tasks_.25p

  • Task: Text Generation
  • Concurrency Cost: 1
  • Model Size: 8B
  • Quantization: FP8
  • Context Length: 32k
  • License: llama3.1
  • Architecture: Transformer
  • Serving State: Warm

The mlfoundations-dev/stackoverflow_5000tasks_.25p model is an 8-billion-parameter language model fine-tuned from Meta-Llama-3.1-8B on the mlfoundations-dev/stackoverflow_5000tasks_.25p dataset, reaching a final validation loss of 0.5831. Because the fine-tuning data is drawn from the Stack Overflow domain, the model is geared toward programming questions and related technical discussion.


Model Overview

The mlfoundations-dev/stackoverflow_5000tasks_.25p model is derived from Meta-Llama-3.1-8B. It was fine-tuned on the mlfoundations-dev/stackoverflow_5000tasks_.25p dataset for 3 epochs, ending at a validation loss of 0.5831.
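Assuming the checkpoint is published on the Hugging Face Hub under this repository id and follows the standard Llama 3.1 layout, it should load with the transformers library roughly as sketched below; the dtype and device placement are illustrative choices, not settings taken from this card.

```python
# Minimal loading sketch; repository availability, dtype, and device
# placement are assumptions, not stated in this card.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "mlfoundations-dev/stackoverflow_5000tasks_.25p"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # pick a dtype your hardware supports
    device_map="auto",           # shards across available devices; needs accelerate
)
```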

Training Details

The model was trained with the following key hyperparameters (a configuration sketch follows the list):

  • Learning Rate: 5e-06
  • Batch Size: 8 per device across 8 GPUs, with 8 gradient accumulation steps, for an effective batch size of 8 × 8 × 8 = 512.
  • Optimizer: AdamW (the adamw_torch implementation) with default betas and epsilon.
  • Epochs: 3.0
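
For reproduction purposes, these settings map onto a Hugging Face Trainer configuration roughly as sketched below. The actual training framework, learning-rate scheduler, and warmup settings are not stated in this card, so everything beyond the listed hyperparameters is an assumption left at library defaults.

```python
# Rough reconstruction of the reported hyperparameters via
# transformers.TrainingArguments; scheduler and warmup values are not
# given in this card and are left at library defaults here.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="stackoverflow_5000tasks_.25p",
    learning_rate=5e-6,
    per_device_train_batch_size=8,  # 8 per device x 8 GPUs x 8 accum steps = 512 effective
    gradient_accumulation_steps=8,
    num_train_epochs=3.0,
    optim="adamw_torch",            # AdamW with default betas and epsilon
)
```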

Performance

During training, the validation loss decreased steadily across epochs:

  • Epoch 1: Validation Loss 0.6395
  • Epoch 2: Validation Loss 0.6017
  • Epoch 3: Validation Loss 0.5831

Intended Use Cases

Because it was fine-tuned on a Stack Overflow-derived dataset, this model is best suited to technical question answering, code-snippet explanation, and the kinds of programming discussion found on the Stack Overflow platform. Developers and researchers working on programming-related natural language processing tasks may find it particularly relevant.
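
As a usage illustration, the model could be queried with a Stack Overflow-style question through the transformers pipeline API. This is a sketch only: the prompt and generation settings below are examples chosen for this card, and the call assumes the checkpoint loads from the Hub as shown above.

```python
# Illustrative inference sketch; the prompt and generation settings are
# examples, not recommendations from the model authors.
from transformers import pipeline

generator = pipeline(
    "text-generation",
    model="mlfoundations-dev/stackoverflow_5000tasks_.25p",
    device_map="auto",  # assumes accelerate is installed
)

question = (
    "How do I remove duplicates from a Python list "
    "while preserving the original order?"
)
output = generator(question, max_new_tokens=256, do_sample=False)
print(output[0]["generated_text"])
```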