DCAgent/a1-defects4j

Task: Text Generation | Concurrency Cost: 1 | Model Size: 8B | Quant: FP8 | Ctx Length: 32k | Published: Mar 25, 2026 | License: other | Architecture: Transformer

DCAgent/a1-defects4j is an 8-billion-parameter language model fine-tuned from Qwen/Qwen3-8B. It was trained on the /e/scratch/jureap59/raoof1/sft_data/hf_hub/datasets--DCAgent--exp_rpt_defects4j-v3_10k_glm_4.7_traces_jupiter/snapshots/c2c5dde80f4ceb33fc781ade3b1285cddb53b59a_thinking_preprocessed dataset. The "defects4j" in the dataset name suggests the model targets defect analysis or related software engineering tasks, although the card does not state this explicitly.

Overview

DCAgent/a1-defects4j is an 8-billion-parameter language model built on the Qwen/Qwen3-8B base model. It was specialized through supervised fine-tuning on a single dataset, with the hyperparameters listed under Training Details.

Training Details

The model was fine-tuned on a single dataset: /e/scratch/jureap59/raoof1/sft_data/hf_hub/datasets--DCAgent--exp_rpt_defects4j-v3_10k_glm_4.7_traces_jupiter/snapshots/c2c5dde80f4ceb33fc781ade3b1285cddb53b59a_thinking_preprocessed. The key training hyperparameters were:

  • Learning Rate: 4e-05
  • Batch Size: 1 (train), 8 (eval)
  • Optimizer: ADAMW_TORCH_FUSED with betas=(0.9,0.98) and epsilon=1e-08
  • LR Scheduler: Cosine type with 0.1 warmup ratio
  • Epochs: 7.0
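
To make the schedule concrete, the sketch below computes the learning rate at a given step for a cosine schedule with linear warmup, using the peak rate (4e-05) and warmup ratio (0.1) listed above. The total step count is not reported in this card, so TOTAL_STEPS is a hypothetical value chosen for illustration, and the function lr_at is an illustrative helper rather than the training code that was actually used. The shape assumed here (linear ramp from zero, then half-cosine decay to zero) is the common form of this schedule; the exact variant used in training may differ.

```python
import math

PEAK_LR = 4e-05      # from the hyperparameter list above
WARMUP_RATIO = 0.1   # from the hyperparameter list above
TOTAL_STEPS = 1000   # hypothetical: the card does not report the step count


def lr_at(step, total_steps=TOTAL_STEPS, peak_lr=PEAK_LR, warmup_ratio=WARMUP_RATIO):
    """Learning rate at `step` for linear warmup followed by cosine decay."""
    warmup_steps = round(total_steps * warmup_ratio)
    if step < warmup_steps:
        # Linear ramp from 0 up to the peak learning rate.
        return peak_lr * step / max(1, warmup_steps)
    # Fraction of the post-warmup phase completed, in [0, 1].
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    # Half-cosine decay from the peak learning rate down to 0.
    return peak_lr * 0.5 * (1.0 + math.cos(math.pi * progress))


if __name__ == "__main__":
    for step in (0, 50, 100, 550, 1000):
        print(f"step {step:4d}: lr = {lr_at(step):.3e}")
```

Under these assumptions the rate climbs during the first 10% of steps, reaches 4e-05 when warmup ends, and then decays smoothly toward zero by the final step.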

Intended Use

The card does not document specific intended uses or limitations. Fine-tuning on a dataset related to "defects4j" suggests applications such as software defect analysis, bug reporting, or code-related problem-solving. Users should consult further documentation for guidance on deployment and any known constraints.