mlfoundations-dev/openr1_codeforces
mlfoundations-dev/openr1_codeforces is a fine-tuned version of the Qwen/Qwen2.5-7B-Instruct base model, adapted on the mlfoundations-dev/openr1_codeforces dataset. It is intended for tasks in its fine-tuning domain, most likely Codeforces-style competitive programming, building on the base model's instruction-following capabilities.
Overview
This model, mlfoundations-dev/openr1_codeforces, is a specialized fine-tune of the Qwen/Qwen2.5-7B-Instruct base model, adapted using the mlfoundations-dev/openr1_codeforces dataset. The dataset name points to Codeforces competitive-programming problems, suggesting an optimization for code generation and problem solving in that domain.
Training Details
The fine-tuning process involved several key hyperparameters:
- Learning Rate: 4e-05
- Batch Size: 1 (train), 8 (eval)
- Gradient Accumulation Steps: 4, for a total effective batch size of 128
- Optimizer: ADAMW_TORCH with standard betas and epsilon
- LR Scheduler: cosine with a 0.1 warmup ratio
- Epochs: 5.0
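The hyperparameters above can be sketched as a transformers-style configuration dict. This is a hypothetical reconstruction, not the authors' actual launch script; the field names follow Hugging Face TrainingArguments conventions, and the device count is inferred rather than reported.

```python
# Sketch of the reported hyperparameters in Hugging Face TrainingArguments
# naming (an assumption; the card does not publish the training script).
config = {
    "learning_rate": 4e-05,
    "per_device_train_batch_size": 1,
    "per_device_eval_batch_size": 8,
    "gradient_accumulation_steps": 4,
    "optim": "adamw_torch",        # standard betas (0.9, 0.999), eps 1e-8
    "lr_scheduler_type": "cosine",
    "warmup_ratio": 0.1,
    "num_train_epochs": 5.0,
}

# With a per-device batch of 1 and 4 accumulation steps, the reported
# effective batch size of 128 implies 32 devices: 1 * 4 * 32 = 128.
world_size = 32
effective_batch = (
    config["per_device_train_batch_size"]
    * config["gradient_accumulation_steps"]
    * world_size
)
print(effective_batch)  # 128
```

The 32-device figure is back-computed from the stated numbers, since the card itself does not list the hardware used.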
Intended Uses
Given its fine-tuning on the mlfoundations-dev/openr1_codeforces dataset, this model is most likely intended for competitive-programming tasks such as solving or explaining Codeforces-style problems. Users should refer to the dataset's documentation for specific use cases and potential limitations.
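As a sketch of how such a checkpoint could be queried, assuming it keeps the standard Qwen2.5-Instruct chat format (the prompt-building helper and system prompt below are hypothetical, not part of the model card):

```python
MODEL_ID = "mlfoundations-dev/openr1_codeforces"

def build_messages(problem_statement: str) -> list[dict]:
    # Hypothetical helper: wraps a Codeforces-style problem statement in the
    # chat-message format that Qwen2.5-Instruct models expect.
    return [
        {"role": "system",
         "content": "You are a competitive programming assistant."},
        {"role": "user", "content": problem_statement},
    ]

def generate_solution(problem_statement: str, max_new_tokens: int = 512) -> str:
    # The ~7B-parameter download and GPU work are kept inside the function
    # so the module can be imported without loading the model.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(
        MODEL_ID, torch_dtype=torch.bfloat16, device_map="auto"
    )
    prompt = tokenizer.apply_chat_template(
        build_messages(problem_statement),
        tokenize=False,
        add_generation_prompt=True,
    )
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    output = model.generate(**inputs, max_new_tokens=max_new_tokens)
    # Decode only the newly generated tokens, not the echoed prompt.
    return tokenizer.decode(
        output[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True
    )
```

Generation settings (temperature, sampling) are left at library defaults here; tune them to taste for code-generation workloads.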