DCAgent/g1_gptlong_top8_32b

TEXT GENERATIONConcurrency Cost:2Model Size:32BQuant:FP8Ctx Length:32kPublished:Apr 23, 2026License:otherArchitecture:Transformer Cold

DCAgent/g1_gptlong_top8_32b is a 32 billion parameter language model fine-tuned from Qwen/Qwen3-32B. This model is specifically adapted using a dataset derived from 'g1_min_episodes_e1_gpt_long_top8_glm47_traces', suggesting an optimization for processing and generating long-context or trace-based sequences. Its training procedure involved a multi-GPU setup with 96 devices, indicating a focus on robust performance for specific, potentially complex, language tasks.

Loading preview...

Overview

DCAgent/g1_gptlong_top8_32b is a 32 billion parameter language model, fine-tuned from the Qwen/Qwen3-32B architecture. It has been specifically adapted using a dataset named 'g1_min_episodes_e1_gpt_long_top8_glm47_traces'. This fine-tuning process suggests an emphasis on handling and generating content related to long-context interactions or trace data, potentially for agent-based systems or complex reasoning tasks.

Training Details

The model underwent training with a learning rate of 4e-05, a batch size of 1 per device, and utilized 96 GPUs for a total effective batch size of 96. The optimizer used was ADAMW_TORCH_FUSED with cosine learning rate scheduling over 5 epochs. This configuration indicates a substantial training effort aimed at specializing the model's capabilities.

Key Characteristics

  • Base Model: Qwen/Qwen3-32B
  • Parameter Count: 32 billion
  • Fine-tuning Dataset: 'g1_min_episodes_e1_gpt_long_top8_glm47_traces'
  • Training Environment: Multi-GPU setup with 96 devices

Intended Use Cases

While specific intended uses are not detailed in the provided README, the fine-tuning on a 'gpt_long_top8_glm47_traces' dataset implies suitability for applications requiring:

  • Processing and understanding extended conversational histories or complex interaction traces.
  • Generating coherent and contextually relevant responses in long-form scenarios.
  • Tasks related to agent behavior analysis or simulation where detailed trace data is crucial.