mlfoundations-dev/qwen2-5_code_ablate_duplications_1

Text generation · Model size: 7.6B · Quantization: FP8 · Context length: 32k · Published: Mar 29, 2025 · License: apache-2.0 · Architecture: Transformer · Open weights

mlfoundations-dev/qwen2-5_code_ablate_duplications_1 is a 7.6-billion-parameter causal language model fine-tuned from Qwen/Qwen2.5-7B-Instruct. It was trained on the mlfoundations-dev/code_ablate_duplications_1 dataset, suggesting a specialization in code-related tasks, particularly the analysis of code duplication. With a maximum context length of 131,072 tokens (the listing above reports the 32k default configuration), it is designed for processing extensive code sequences.


Model Overview

This model, mlfoundations-dev/qwen2-5_code_ablate_duplications_1, is a fine-tuned variant of Qwen's Qwen2.5-7B-Instruct base model. The fine-tuning used the mlfoundations-dev/code_ablate_duplications_1 dataset, indicating a likely focus on code analysis, particularly code duplication.
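
Because the model is a standard Qwen2.5 fine-tune, it should load with the usual Hugging Face Transformers API. The sketch below is a minimal, unverified example; the chat-template usage assumes the model inherits Qwen2.5-7B-Instruct's tokenizer configuration.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "mlfoundations-dev/qwen2-5_code_ablate_duplications_1"

# Load tokenizer and weights; device_map="auto" requires the accelerate package.
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype="auto",   # use the dtype stored in the checkpoint
    device_map="auto",
)

# Qwen2.5-Instruct derivatives expect chat-formatted prompts.
messages = [
    {"role": "user", "content": "Summarize what code duplication is and why it matters."},
]
inputs = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    return_tensors="pt",
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=256)
# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```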

Key Training Details

The model was trained with a learning rate of 1e-05 for 3 epochs using the AdamW optimizer. Training ran in a distributed setup across 32 devices with a total batch size of 96, under a cosine learning-rate scheduler with a 0.1 warmup ratio. The software stack comprised Transformers 4.46.1, PyTorch 2.5.1, Datasets 3.0.2, and Tokenizers 0.20.3.
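
These hyperparameters map naturally onto transformers.TrainingArguments. The sketch below reconstructs them under stated assumptions: only the totals are reported, so the per-device batch size / gradient-accumulation split (32 devices × per-device batch × accumulation steps = 96) and the use of bf16 mixed precision are guesses, not the authors' actual script.

```python
from transformers import TrainingArguments

# A sketch of the reported configuration, not the original training script.
# Assumption: total batch 96 over 32 devices = 3 samples per device per
# optimizer step, realized here as batch size 1 with 3 accumulation steps.
training_args = TrainingArguments(
    output_dir="qwen2-5_code_ablate_duplications_1",
    learning_rate=1e-5,
    num_train_epochs=3.0,
    per_device_train_batch_size=1,
    gradient_accumulation_steps=3,  # 32 * 1 * 3 = 96 effective batch size
    lr_scheduler_type="cosine",
    warmup_ratio=0.1,
    optim="adamw_torch",
    bf16=True,  # assumption: mixed precision; not stated in the card
)
```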

Potential Use Cases

Given its fine-tuning on a code duplication dataset, this model is likely suitable for the tasks below (a prompting sketch follows the list):

  • Code analysis: Identifying and understanding duplicated code segments.
  • Code quality assessment: Assisting in refactoring efforts by highlighting redundant code.
  • Software engineering research: Exploring patterns and impacts of code duplication.
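
As an illustration of the first two use cases, the hypothetical prompt below asks the model to flag duplicated logic in a small snippet. The code sample and wording are invented for this example, and the pipeline call assumes a recent Transformers version that accepts chat messages directly.

```python
from transformers import pipeline

pipe = pipeline(
    "text-generation",
    model="mlfoundations-dev/qwen2-5_code_ablate_duplications_1",
    torch_dtype="auto",
    device_map="auto",
)

# Two functions with identical logic under different names -- a toy duplicate.
snippet = '''
def total_price(items):
    s = 0
    for i in items:
        s += i.price
    return s

def sum_costs(products):
    s = 0
    for p in products:
        s += p.price
    return s
'''

messages = [{
    "role": "user",
    "content": "Identify any duplicated logic in the following code "
               "and suggest a refactoring:\n" + snippet,
}]

result = pipe(messages, max_new_tokens=256)
# The pipeline returns the full chat; the last message is the model's reply.
print(result[0]["generated_text"][-1]["content"])
```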