DCAgent/g1_weighted_31600_32B

Text generation · Concurrency cost: 2 · Model size: 32B · Quantization: FP8 · Context length: 32k · Published: Apr 21, 2026 · License: other · Architecture: Transformer

DCAgent/g1_weighted_31600_32B is a 32 billion parameter language model fine-tuned from Qwen/Qwen3-32B. It was trained on a dataset derived from "DCAgent--g1_min_episodes_e1_weighted_top4_31600_glm47_traces" with a context length of 32768 tokens, making it a Qwen3-based model specialized for its fine-tuning domain.


Model Overview

DCAgent/g1_weighted_31600_32B is a 32 billion parameter language model, fine-tuned from the base Qwen/Qwen3-32B architecture. The training data is identified in the card by a local Hugging Face hub cache path, /e/scratch/jureap59/raoof1/sft_data/hf_hub/datasets--DCAgent--g1_min_episodes_e1_weighted_top4_31600_glm47_traces/snapshots/a4717e999b7f8e9ad717b435f2d4a5cc75535932_thinking_preprocessed, which corresponds to a thinking-preprocessed snapshot of the DCAgent/g1_min_episodes_e1_weighted_top4_31600_glm47_traces dataset.
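
Because the model is a fine-tune of Qwen/Qwen3-32B, it should load with the standard transformers causal-LM classes. The snippet below is a minimal sketch, assuming the weights are published under the model id above and that your hardware can host a 32B model; the dtype and device_map arguments are illustrative, not prescribed by the card.

```python
# Minimal loading sketch (illustrative; adjust dtype/device_map to your hardware).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "DCAgent/g1_weighted_31600_32B"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # assumption; the card does not state a serving dtype beyond the FP8 listing
    device_map="auto",
)
```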

Training Details

The fine-tuning process utilized the following key hyperparameters:

  • Learning Rate: 4e-05
  • Batch Size: 1 per device (train), 8 per device (eval); across 96 devices this gives effective batch sizes of 96 (train) and 768 (eval).
  • Optimizer: ADAMW_TORCH_FUSED with betas=(0.9, 0.999) and epsilon=1e-08.
  • LR Scheduler: Cosine type with a warmup ratio of 0.1.
  • Epochs: 7.0

The training was conducted using Transformers 4.57.6, PyTorch 2.9.1+cu130, Datasets 4.7.0, and Tokenizers 0.22.2.
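
For reference, the hyperparameters above map onto a transformers TrainingArguments configuration roughly as follows. This is an illustrative reconstruction, not the authors' actual training script; the output directory and precision flag are assumptions, while the per-device batch sizes of 1 (train) and 8 (eval) across 96 devices reproduce the reported totals of 96 and 768.

```python
# Illustrative TrainingArguments mirroring the reported hyperparameters
# (not the authors' actual script; output_dir and bf16 are placeholders/assumptions).
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="g1_weighted_31600_32B",   # placeholder
    learning_rate=4e-05,
    per_device_train_batch_size=1,        # 1 x 96 devices = 96 total
    per_device_eval_batch_size=8,         # 8 x 96 devices = 768 total
    num_train_epochs=7.0,
    lr_scheduler_type="cosine",
    warmup_ratio=0.1,
    optim="adamw_torch_fused",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-08,
    bf16=True,                            # assumption; training precision is not stated in the card
)
```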

Key Characteristics

  • Base Model: Qwen3-32B
  • Parameter Count: 32 billion
  • Context Length: 32768 tokens
  • Fine-tuning Focus: Specialized on the g1_min_episodes_e1_weighted_top4_31600_glm47_traces dataset, whose name suggests weighted, episode-filtered reasoning-trace data.
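
Given the base model's Qwen3 chat template and the thinking-style traces indicated by the dataset name, a generation call might look like the sketch below. It assumes the fine-tune retains the stock Qwen3 template (including its enable_thinking switch) and reuses the model and tokenizer objects from the loading example above; the prompt and generation settings are placeholders.

```python
# Generation sketch; assumes `model` and `tokenizer` from the loading example,
# and that the fine-tune keeps the stock Qwen3 chat template.
messages = [{"role": "user", "content": "Summarize the key ideas behind cosine learning-rate schedules."}]

inputs = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    enable_thinking=True,   # Qwen3 template switch; assumed to carry over to the fine-tune
    return_tensors="pt",
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=1024)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```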