mlfoundations-dev/llama3-1_8b_r1_annotated_olympiads

Text generation · Model size: 7.6B · Quantization: FP8 · Context length: 32k · License: apache-2.0 · Architecture: Transformer · Open weights

The mlfoundations-dev/llama3-1_8b_r1_annotated_olympiads model is a 7.6-billion-parameter language model fine-tuned from Qwen/Qwen2.5-7B-Instruct. It features a 131,072-token context length and was adapted on the mlfoundations-dev/r1_annotated_olympiads dataset. Because the fine-tuning data consists of Olympiad-style problems, the model is oriented toward reasoning and problem-solving in that domain.


Model Overview

This model, llama3-1_8b_r1_annotated_olympiads, is a fine-tuned variant of the Qwen/Qwen2.5-7B-Instruct architecture, with approximately 7.6 billion parameters and a 131,072-token context window. It was adapted using the mlfoundations-dev/r1_annotated_olympiads dataset.
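
Since the model inherits the standard causal-LM interface of its Qwen2.5-Instruct base, it should load through the usual Hugging Face Transformers classes. The snippet below is a minimal sketch; the dtype and device settings are assumptions and should be adjusted to the available hardware.

```python
# Minimal loading sketch; precision/device choices are illustrative, not from the model card.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "mlfoundations-dev/llama3-1_8b_r1_annotated_olympiads"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype="auto",   # use the checkpoint's native precision
    device_map="auto",    # place weights on available GPUs, falling back to CPU
)
```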

Training Details

The model was trained with a learning rate of 1e-05, an effective batch size of 96, and a cosine learning rate scheduler with a warmup ratio of 0.1 over 3 epochs. Training used a multi-GPU setup with 32 devices and the AdamW optimizer. The underlying framework versions were Transformers 4.46.1, PyTorch 2.5.1, Datasets 3.0.2, and Tokenizers 0.20.3.
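
These hyperparameters map directly onto a standard Hugging Face TrainingArguments configuration. The sketch below reproduces them; the per-device batch size and gradient-accumulation split are assumptions, since the card only reports the effective batch size of 96 across 32 GPUs.

```python
# Illustrative TrainingArguments matching the reported hyperparameters.
# The per-device / gradient-accumulation split is an assumption
# (3 per device x 32 GPUs = effective batch size 96).
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="llama3-1_8b_r1_annotated_olympiads",
    learning_rate=1e-5,
    per_device_train_batch_size=3,
    gradient_accumulation_steps=1,
    num_train_epochs=3,
    lr_scheduler_type="cosine",
    warmup_ratio=0.1,
    optim="adamw_torch",
)
```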

Potential Use Cases

Given its fine-tuning on the r1_annotated_olympiads dataset, this model is likely specialized for tasks involving:

  • Problem-solving and reasoning within the domain of Olympiad-style challenges.
  • Understanding and generating responses related to complex academic or competitive problems.

Further details on specific intended uses, limitations, and comprehensive evaluation data are not provided in the current model card.
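
As a hedged illustration of the intended use, an Olympiad-style problem can be posed through the chat template inherited from the instruct base model. The prompt wording below is invented for demonstration, and the snippet assumes the `model` and `tokenizer` objects from the loading sketch above.

```python
# Hypothetical usage example: pose an Olympiad-style problem via the chat template.
# Assumes `model` and `tokenizer` from the loading sketch above.
messages = [
    {"role": "user", "content": "Prove that the sum of two odd integers is even."}
]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=512)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```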