mlfoundations-dev/llama3-1_8b_r1_annotated_aops

TEXT GENERATION · Concurrency Cost: 1 · Model Size: 7.6B · Quant: FP8 · Ctx Length: 32k · License: llama3.1 · Architecture: Transformer

The mlfoundations-dev/llama3-1_8b_r1_annotated_aops model is a 7.6-billion-parameter language model fine-tuned from Meta-Llama-3.1-8B on the mlfoundations-dev/r1_annotated_aops dataset, reaching a final validation loss of 0.6034. It is adapted to the domain of its fine-tuning data and is intended for tasks aligned with that dataset.


Model Overview

This model, llama3-1_8b_r1_annotated_aops, is a fine-tuned variant of Meta-Llama-3.1-8B, developed by mlfoundations-dev. It comprises approximately 7.6 billion parameters and was fine-tuned with a maximum context length of 131,072 tokens. The fine-tuning process used the mlfoundations-dev/r1_annotated_aops dataset and produced a final validation loss of 0.6034.
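
A minimal loading sketch with the Hugging Face transformers library, assuming the checkpoint is published on the Hub under the repo id used in this card; the bf16 dtype is an assumption and should be adjusted to your hardware:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "mlfoundations-dev/llama3-1_8b_r1_annotated_aops"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # assumption: bf16 inference; use fp16/fp32 as needed
    device_map="auto",           # place weights automatically across available devices
)
```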

Training Details

The model was trained for 3 epochs with a learning rate of 5e-06 and a total batch size of 512 across 32 GPUs. The optimizer was AdamW (adamw_torch) with default betas and epsilon, paired with a constant learning-rate scheduler. Validation loss decreased steadily, from 0.6528 after epoch 1 to 0.6034 after epoch 3.
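
The reported hyperparameters can be expressed as Hugging Face TrainingArguments, as in the sketch below. Only the totals come from this card; the split of the global batch size of 512 into per-device batch and gradient accumulation steps is an assumption (32 GPUs × 1 per device × 16 accumulation steps = 512), as is bf16 mixed precision:

```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="llama3-1_8b_r1_annotated_aops",
    num_train_epochs=3,              # from the card
    learning_rate=5e-06,             # from the card
    lr_scheduler_type="constant",    # constant LR scheduler, per the card
    optim="adamw_torch",             # AdamW with default betas and epsilon
    per_device_train_batch_size=1,   # assumption
    gradient_accumulation_steps=16,  # assumption: 32 GPUs * 1 * 16 = 512 global batch
    bf16=True,                       # assumption
)
```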

Potential Use Cases

Given its fine-tuning on the r1_annotated_aops dataset, this model is best suited to applications and research aligned with that dataset's content and domain. Developers should evaluate its performance on tasks matching that domain before deploying it; a usage sketch follows below.
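
Continuing from the loading snippet in Model Overview, a hypothetical inference sketch follows. The plain-text prompt format and the example problem are assumptions, not part of this card; inspect the tokenizer's chat template (if one is defined) for the format used during fine-tuning:

```python
# Example competition-math style prompt (hypothetical; format is an assumption).
prompt = "Find all real solutions of x^2 - 5x + 6 = 0. Show your reasoning."

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=512, do_sample=False)

# Decode only the newly generated tokens, skipping the echoed prompt.
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:], skip_special_tokens=True))
```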