mlfoundations-dev/llama3-1_8b_4o_annotated_olympiads
This is a 7.6-billion-parameter causal language model fine-tuned by mlfoundations-dev from Qwen/Qwen2.5-7B-Instruct. It has a context length of 131072 tokens and was fine-tuned on the 4o_annotated_olympiads dataset, giving it a focus on complex reasoning and olympiad-style problem solving.
Model Overview
This model, llama3-1_8b_4o_annotated_olympiads, is a 7.6 billion parameter language model developed by mlfoundations-dev. It is a fine-tuned variant of the Qwen/Qwen2.5-7B-Instruct architecture, specifically adapted using the mlfoundations-dev/4o_annotated_olympiads dataset. The model supports a substantial context length of 131072 tokens, enabling it to process extensive inputs for complex tasks.
Key Characteristics
- Base Model: Fine-tuned from Qwen/Qwen2.5-7B-Instruct.
- Parameter Count: 7.6 billion parameters.
- Context Length: 131072 tokens.
- Training Data: Specialized fine-tuning on the 4o_annotated_olympiads dataset, indicating a focus on competitive problem-solving or academic challenges.
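A minimal loading sketch, assuming the checkpoint is hosted on the Hugging Face Hub under the repo id above and works with the standard Transformers auto classes:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "mlfoundations-dev/llama3-1_8b_4o_annotated_olympiads"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype="auto",   # keep the checkpoint's native precision
    device_map="auto",    # spread weights across available devices
)
```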
Training Details
The fine-tuning process used a learning rate of 1e-05, a total train batch size of 96 (32 GPUs × a per-device batch size of 1 × 3 gradient accumulation steps), and 3 epochs. The optimizer was AdamW with default betas and epsilon, paired with a cosine learning rate scheduler and a warmup ratio of 0.1. Training used Transformers 4.46.1, PyTorch 2.5.1, Datasets 3.0.2, and Tokenizers 0.20.3.
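For reference, a sketch of these hyperparameters expressed as Hugging Face TrainingArguments; the per-device batch size is inferred from the stated totals (32 GPUs × 1 × 3 accumulation steps = 96) and the output directory is a placeholder, not the authors' actual configuration:

```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="llama3-1_8b_4o_annotated_olympiads",  # placeholder path
    learning_rate=1e-05,
    per_device_train_batch_size=1,  # inferred: 32 GPUs x 1 x 3 accumulation = 96 total
    gradient_accumulation_steps=3,
    num_train_epochs=3,
    lr_scheduler_type="cosine",
    warmup_ratio=0.1,
    optim="adamw_torch",  # AdamW with default betas and epsilon
)
```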
Intended Use Cases
Given its fine-tuning on the 4o_annotated_olympiads dataset, this model is likely best suited for:
- Complex Reasoning: Tasks requiring advanced logical deduction and problem-solving skills.
- Academic Support: Applications related to competitive mathematics, science, or similar academic challenges.
- Specialized Q&A: Answering questions that demand deep understanding and analytical capability, particularly olympiad-style mathematics and science problems (see the usage sketch after this list).
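A hedged usage sketch for olympiad-style Q&A, assuming the checkpoint ships the chat template inherited from its instruct base model; the question is a made-up example:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "mlfoundations-dev/llama3-1_8b_4o_annotated_olympiads"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype="auto", device_map="auto")

# Made-up olympiad-style question for illustration.
messages = [{"role": "user", "content": "Prove that the product of two consecutive integers is always even."}]

input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output_ids = model.generate(input_ids, max_new_tokens=512)
print(tokenizer.decode(output_ids[0][input_ids.shape[-1]:], skip_special_tokens=True))
```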