mlfoundations-dev/qwen2-5_code_ablate_duplications_1
mlfoundations-dev/qwen2-5_code_ablate_duplications_1 is a 7.6-billion-parameter causal language model fine-tuned from Qwen/Qwen2.5-7B-Instruct. It was trained on the mlfoundations-dev/code_ablate_duplications_1 dataset, suggesting a specialization in code-related tasks, particularly the analysis of code duplication. With a context length of 131,072 tokens, it is designed to process long code sequences.
Model Overview
This model, mlfoundations-dev/qwen2-5_code_ablate_duplications_1, is a 7.6-billion-parameter fine-tuned variant of the Qwen/Qwen2.5-7B-Instruct base model developed by Qwen. Fine-tuning used the mlfoundations-dev/code_ablate_duplications_1 dataset, indicating a likely focus on code analysis, particularly around code duplication.
Key Training Details
The model was trained for 3 epochs with a learning rate of 1e-05 using the AdamW optimizer and a cosine learning-rate scheduler with a 0.1 warmup ratio. Training ran in a distributed setup across 32 devices with a total batch size of 96. The software stack included Transformers 4.46.1, PyTorch 2.5.1, Datasets 3.0.2, and Tokenizers 0.20.3.
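The hyperparameters above imply a per-device batch size of 3 (96 total across 32 devices, assuming no gradient accumulation, which the card does not state) and a learning rate that ramps linearly over the first 10% of steps, then decays along a half-cosine. A minimal sketch of that schedule, with an illustrative total step count (the real value depends on dataset size):

```python
import math

# Hyperparameters from the model card
BASE_LR = 1e-05
WARMUP_RATIO = 0.1
TOTAL_BATCH = 96
NUM_DEVICES = 32

# Per-device batch implied by the distributed setup
# (assumes no gradient accumulation; the card does not specify).
per_device_batch = TOTAL_BATCH // NUM_DEVICES  # 3

def lr_at_step(step: int, total_steps: int) -> float:
    """Cosine schedule with linear warmup: ramp from 0 to BASE_LR over
    the warmup steps, then decay back to 0 along a half-cosine."""
    warmup_steps = int(WARMUP_RATIO * total_steps)
    if step < warmup_steps:
        return BASE_LR * step / max(1, warmup_steps)
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return BASE_LR * 0.5 * (1.0 + math.cos(math.pi * progress))

if __name__ == "__main__":
    total = 1000  # illustrative; not stated in the card
    print(per_device_batch)                   # 3
    print(lr_at_step(0, total))               # 0.0 at the start of warmup
    print(lr_at_step(100, total))             # peak 1e-05 at end of warmup
    print(lr_at_step(total, total))           # decays to ~0 at the final step
```

Note that this mirrors the behavior of the `cosine` scheduler in Transformers, not its exact implementation.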
Potential Use Cases
Given its fine-tuning on a code duplication dataset, this model is likely suitable for:
- Code analysis: Identifying and understanding duplicated code segments.
- Code quality assessment: Assisting in refactoring efforts by highlighting redundant code.
- Software engineering research: Exploring patterns and impacts of code duplication.
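For the use cases above, the model can be queried through the standard Transformers chat interface. The sketch below is illustrative, not a documented workflow: the card specifies no expected prompt format, so the wording of `build_messages` and the helper names are assumptions.

```python
def build_messages(code_a: str, code_b: str) -> list[dict]:
    """Build a chat-style prompt asking the model to compare two code
    snippets for duplication. The prompt wording is hypothetical; the
    model card does not document an expected prompt format."""
    user = (
        "Are the following two code snippets duplicates or near-duplicates? "
        "Explain any overlap.\n\n"
        f"Snippet A:\n{code_a}\n\nSnippet B:\n{code_b}"
    )
    return [{"role": "user", "content": user}]

def analyze_duplication(
    code_a: str,
    code_b: str,
    model_name: str = "mlfoundations-dev/qwen2-5_code_ablate_duplications_1",
) -> str:
    """Generate the model's analysis of the two snippets. Imports stay
    inside the function because loading a 7.6B-parameter checkpoint is
    expensive."""
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModelForCausalLM.from_pretrained(
        model_name, torch_dtype="auto", device_map="auto"
    )
    inputs = tokenizer.apply_chat_template(
        build_messages(code_a, code_b),
        add_generation_prompt=True,
        return_tensors="pt",
    ).to(model.device)
    output = model.generate(inputs, max_new_tokens=256)
    # Decode only the newly generated tokens, not the prompt.
    return tokenizer.decode(output[0][inputs.shape[-1]:], skip_special_tokens=True)
```

A call such as `analyze_duplication("def add(a, b): return a + b", "def plus(x, y): return x + y")` would return the model's free-form assessment of the overlap.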