DCAgent/g1_clean_hybrid_plus_32b

TEXT GENERATION · Concurrency Cost: 2 · Model Size: 32B · Quant: FP8 · Ctx Length: 32k · Published: Apr 24, 2026 · License: other · Architecture: Transformer

DCAgent/g1_clean_hybrid_plus_32b is a 32-billion-parameter language model fine-tuned from Qwen/Qwen3-32B. It was trained on a specialized dataset, "g1_clean_hybrid_scaffold_plus_r2eg_gfi_38k_glm47_traces," which suggests optimization for specific agentic or reasoning tasks. With a context length of 32,768 tokens, it is designed to process extensive inputs and generate detailed responses.


Model Overview

DCAgent/g1_clean_hybrid_plus_32b is a 32-billion-parameter language model fine-tuned from the Qwen/Qwen3-32B base architecture. The model was developed by DCAgent and trained on the DCAgent/g1_clean_hybrid_scaffold_plus_r2eg_gfi_38k_glm47_traces dataset. The specialized training data indicates a focus on enhancing performance for particular tasks, likely involving complex reasoning, scaffolding, or trace-based interactions.
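
Since the model is a fine-tune of Qwen/Qwen3-32B, it should load through the standard transformers causal-LM interface. A minimal loading sketch, assuming the checkpoint is published under the model ID above and follows the usual Qwen3 layout:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "DCAgent/g1_clean_hybrid_plus_32b"

# Load the tokenizer and model; device_map="auto" shards the 32B weights
# across all available GPUs rather than loading onto a single device.
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype="auto",  # use the precision stored in the checkpoint config
    device_map="auto",
)
```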

Training Details

The model underwent supervised fine-tuning (SFT) with the following key hyperparameters:

  • Learning Rate: 4e-05
  • Batch Size: 1 per device (train), 8 per device (eval); across 96 devices this gives a global batch size of 96 (train) and 768 (eval).
  • Optimizer: ADAMW_TORCH_FUSED with betas=(0.9, 0.98) and epsilon=1e-08.
  • Scheduler: Cosine learning rate scheduler with a 0.1 warmup ratio.
  • Epochs: 5.0
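
For reference, these values map directly onto Hugging Face TrainingArguments. The sketch below reconstructs the configuration from the reported numbers; it is not the author's actual training script, and output_dir is a hypothetical placeholder:

```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="g1_clean_hybrid_plus_32b_sft",  # hypothetical output path
    learning_rate=4e-5,
    per_device_train_batch_size=1,  # 1 x 96 devices = 96 global train batch
    per_device_eval_batch_size=8,   # 8 x 96 devices = 768 global eval batch
    num_train_epochs=5.0,
    optim="adamw_torch_fused",      # matches ADAMW_TORCH_FUSED
    adam_beta1=0.9,
    adam_beta2=0.98,
    adam_epsilon=1e-8,
    lr_scheduler_type="cosine",
    warmup_ratio=0.1,
)
```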

This configuration, a small per-device batch scaled out across 96 devices with a cosine schedule and warmup, is a standard full-parameter SFT setup for a model of this size, and it retains the 32,768-token context window of the Qwen3-32B base for specialized applications.
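
To illustrate a typical use of that context window, here is a hedged generation sketch reusing the tokenizer and model objects from the loading example above. It assumes the fine-tune preserves the Qwen3 chat template; long_document is a placeholder for the caller's input:

```python
import torch

long_document = "..."  # placeholder: up to tens of thousands of tokens of input

messages = [
    {"role": "user", "content": f"Summarize the key points of this document:\n\n{long_document}"},
]

# apply_chat_template formats the conversation with the model's chat template
# and returns input IDs ready for generation.
input_ids = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    return_tensors="pt",
).to(model.device)

with torch.no_grad():
    output_ids = model.generate(input_ids, max_new_tokens=1024)

# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(output_ids[0][input_ids.shape[-1]:], skip_special_tokens=True))
```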