DCAgent/g1_subagent_e1_gpt_long_tacc

Text Generation · Concurrency Cost: 1 · Model Size: 8B · Quant: FP8 · Context Length: 32k · Published: Apr 15, 2026 · License: other · Architecture: Transformer

DCAgent/g1_subagent_e1_gpt_long_tacc is an 8-billion-parameter language model fine-tuned from Qwen/Qwen3-8B. It was trained on the DCAgent/g1_subagent_e1_gpt_long_d1_original_40k_glm47_traces dataset, indicating a specialization in agentic or long-context reasoning tasks. With a 32,768-token context window, the model is suited to processing and generating the kind of extended text found in its fine-tuning data.


Model Overview

DCAgent/g1_subagent_e1_gpt_long_tacc is an 8-billion-parameter language model fine-tuned from Qwen3-8B. Training on the DCAgent/g1_subagent_e1_gpt_long_d1_original_40k_glm47_traces dataset suggests an emphasis on agentic behavior, long-context understanding, and the analysis of execution traces.
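The model card ships no usage snippet, so here is a minimal inference sketch. It assumes the checkpoint loads like a standard Qwen3 chat model through Hugging Face transformers; FP8 checkpoints may additionally need a serving stack with FP8 kernel support (e.g. vLLM), and the prompt content is purely illustrative.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "DCAgent/g1_subagent_e1_gpt_long_tacc"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype="auto",  # use the dtype recorded in the checkpoint config
    device_map="auto",   # spread layers across available devices
)

# Illustrative prompt; the model is a chat-style fine-tune of Qwen3-8B.
messages = [{"role": "user", "content": "Summarize the following agent trace: ..."}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output_ids = model.generate(input_ids, max_new_tokens=512)
print(tokenizer.decode(output_ids[0][input_ids.shape[-1]:], skip_special_tokens=True))
```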

Key Training Details

The fine-tuning process used the following hyperparameters (collected into a hypothetical `TrainingArguments` sketch after the list):

  • Learning Rate: 4e-05
  • Batch Size: 1 (train), 8 (eval)
  • Optimizer: ADAMW_TORCH_FUSED with betas=(0.9, 0.98) and epsilon=1e-08
  • LR Scheduler: Cosine type with 0.1 warmup ratio
  • Epochs: 7.0
  • Devices: 16 GPUs; combined with the per-device batch sizes above, this yields a total train batch size of 16 and a total eval batch size of 128.
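As a sketch, the reported values map onto a Hugging Face `TrainingArguments` object roughly as follows. This is a reconstruction from the numbers above, not the authors' published training script; the output directory and the `bf16` flag are assumptions.

```python
from transformers import TrainingArguments

# Hypothetical reconstruction of the reported fine-tuning configuration;
# model, dataset, and Trainer wiring are omitted.
args = TrainingArguments(
    output_dir="g1_subagent_e1_gpt_long_tacc",  # assumption
    learning_rate=4e-05,
    per_device_train_batch_size=1,   # x 16 GPUs -> total train batch size 16
    per_device_eval_batch_size=8,    # x 16 GPUs -> total eval batch size 128
    optim="adamw_torch_fused",
    adam_beta1=0.9,
    adam_beta2=0.98,
    adam_epsilon=1e-08,
    lr_scheduler_type="cosine",
    warmup_ratio=0.1,
    num_train_epochs=7.0,
    bf16=True,                       # assumption: mixed-precision training
)
```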

Intended Use Cases

Specific intended uses and limitations are not documented, but the fine-tuning on a specialized dataset implies suitability for:

  • Processing and generating content within the domain of agentic systems.
  • Handling tasks that benefit from the 32,768-token context window, such as summarizing long documents or extended conversational threads related to its training data (see the token-budgeting sketch after this list).
  • Applications requiring analysis or generation based on 'trace' data, as indicated by the dataset name.
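When feeding long documents, it helps to budget prompt tokens against the 32,768-token window so that room remains for generation. A small sketch, where `long_trace.log` is a hypothetical input file:

```python
from transformers import AutoTokenizer

model_id = "DCAgent/g1_subagent_e1_gpt_long_tacc"
tokenizer = AutoTokenizer.from_pretrained(model_id)

MAX_CONTEXT = 32768   # model context window
RESERVED = 512        # tokens reserved for the model's answer

with open("long_trace.log") as f:  # hypothetical long input document
    document = f.read()

prompt = f"Summarize the key steps in this agent trace:\n\n{document}"
ids = tokenizer(prompt, truncation=True, max_length=MAX_CONTEXT - RESERVED)["input_ids"]
print(f"{len(ids)} prompt tokens fit within the {MAX_CONTEXT}-token window")
```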