mlfoundations-dev/openr1_codeforces

Text generation · Model size: 7.6B · Quant: FP8 · Context length: 32k · Published: May 5, 2025 · License: apache-2.0 · Architecture: Transformer

The mlfoundations-dev/openr1_codeforces model is a fine-tune of Qwen/Qwen2.5-7B-Instruct, adapted on the mlfoundations-dev/openr1_codeforces dataset. It is intended for tasks in its fine-tuning domain while retaining the base model's instruction-following capabilities.


Overview

This model, mlfoundations-dev/openr1_codeforces, is a specialized fine-tune of the Qwen/Qwen2.5-7B-Instruct base model. It was adapted on the mlfoundations-dev/openr1_codeforces dataset, whose name suggests an optimization for competitive-programming (Codeforces-style) tasks.

Training Details

The fine-tuning process involved several key hyperparameters:

  • Learning Rate: 4e-05
  • Batch Size: 1 (train), 8 (eval)
  • Gradient Accumulation Steps: 4, for a total effective batch size of 128 (1 per device × 4 accumulation steps, implying 32-way data parallelism)
  • Optimizer: ADAMW_TORCH with standard betas and epsilon
  • LR Scheduler: cosine with a 0.1 warmup ratio
  • Epochs: 5.0
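The batch-size arithmetic above can be checked directly; the device count is an inference from the reported totals, not something the card states explicitly:

```python
# Hyperparameters reported in the training details above.
per_device_train_batch_size = 1
gradient_accumulation_steps = 4
total_effective_batch_size = 128  # as reported

# Number of parallel workers implied by the totals
# (an inference: total = per_device * accumulation * devices).
num_devices = total_effective_batch_size // (
    per_device_train_batch_size * gradient_accumulation_steps
)
print(num_devices)  # 32
```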

Intended Uses

Given its fine-tuning on the mlfoundations-dev/openr1_codeforces dataset, this model is likely best suited to applications within that dataset's domain, such as reasoning about Codeforces-style programming problems. Users should refer to the dataset's documentation for specific use cases and limitations.
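Since the base model is Qwen2.5-7B-Instruct, prompts follow its ChatML-style chat template. Below is a minimal sketch of how such a prompt is assembled; in practice you would load the model's tokenizer and call `tokenizer.apply_chat_template`, but this stand-alone version shows the expected wire format without downloading any weights (the example messages are illustrative, not from the card):

```python
# ChatML-style prompt builder mirroring the Qwen2.5-Instruct template
# (a sketch; prefer tokenizer.apply_chat_template in real code).
def build_prompt(messages):
    parts = []
    for m in messages:
        # Each turn is wrapped in <|im_start|>role ... <|im_end|> markers.
        parts.append(f"<|im_start|>{m['role']}\n{m['content']}<|im_end|>\n")
    # Open an assistant turn so the model generates the reply.
    parts.append("<|im_start|>assistant\n")
    return "".join(parts)

prompt = build_prompt([
    {"role": "system", "content": "You are a competitive-programming assistant."},
    {"role": "user", "content": "Outline an approach to a Codeforces problem."},
])
print(prompt)
```

The trailing `<|im_start|>assistant\n` is the generation prompt: the model continues from there and stops at its end-of-turn token.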