DCAgent/d1_harden_then_constrain_top4_seq_glm47

TEXT GENERATIONConcurrency Cost:1Model Size:8BQuant:FP8Ctx Length:32kPublished:Apr 12, 2026License:otherArchitecture:Transformer Cold

DCAgent/d1_harden_then_constrain_top4_seq_glm47 is an 8 billion parameter language model fine-tuned from Qwen/Qwen3-8B. This model was specifically trained on the /e/scratch/jureap59/raoof1/sft_data/hf_hub/datasets--DCAgent--d1_harden_then_constrain_top4_seq_glm47_traces dataset. It is designed for tasks related to its specialized training data, offering a refined performance over its base model for specific sequence generation or understanding applications.

Loading preview...

Overview

DCAgent/d1_harden_then_constrain_top4_seq_glm47 is an 8 billion parameter language model, building upon the foundational architecture of Qwen/Qwen3-8B. This model has undergone a specific fine-tuning process, utilizing the /e/scratch/jureap59/raoof1/sft_data/hf_hub/datasets--DCAgent--d1_harden_then_constrain_top4_seq_glm47_traces dataset. The fine-tuning was conducted with a learning rate of 4e-05 over 7 epochs, employing a multi-GPU setup with 16 devices and a total training batch size of 16.

Training Details

  • Base Model: Qwen/Qwen3-8B
  • Training Dataset: /e/scratch/jureap59/raoof1/sft_data/hf_hub/datasets--DCAgent--d1_harden_then_constrain_top4_seq_glm47_traces
  • Key Hyperparameters:
    • Learning Rate: 4e-05
    • Optimizer: ADAMW_TORCH_FUSED
    • Number of Epochs: 7.0
    • Distributed Type: multi-GPU (16 devices)

Intended Use

Given its specialized fine-tuning, this model is likely optimized for tasks directly related to the characteristics and content of the d1_harden_then_constrain_top4_seq_glm47_traces dataset. Developers should consider its application in scenarios where performance on similar data distributions is critical. Further information regarding specific intended uses and limitations would require a deeper analysis of the training data's nature.