DCAgent/g1_weighted_100k_32b_cont
DCAgent/g1_weighted_100k_32b_cont is a 32-billion-parameter language model fine-tuned by DCAgent. Built on the Qwen3-32B architecture, it continues training from DCAgent/g1_weighted_100k_32B_step4400 on the 'g1_min_episodes_e1_weighted_top4_100k_glm47_traces' dataset, which suggests a specialization in the areas that data covers, and it supports a 32768-token context length.
Model Overview
DCAgent/g1_weighted_100k_32b_cont is a 32-billion-parameter language model developed by DCAgent. It is a fine-tuned iteration of the Qwen3-32B architecture, building on the DCAgent/g1_weighted_100k_32B_step4400 checkpoint.
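The model card does not include a usage snippet; the sketch below shows one plausible way to load the model with the Hugging Face transformers library. The torch_dtype and device_map settings are illustrative assumptions, not values taken from the card.

```python
# Minimal loading sketch, assuming the standard transformers AutoModel API.
MODEL_ID = "DCAgent/g1_weighted_100k_32b_cont"
MAX_CONTEXT = 32768  # context length stated in the model card


def load_model(model_id: str = MODEL_ID):
    """Load tokenizer and model; import locally so the sketch can be read
    without transformers installed."""
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(
        model_id,
        torch_dtype="auto",   # assumption: pick the checkpoint's native dtype
        device_map="auto",    # assumption: shard across available GPUs
    )
    return tokenizer, model
```

Note that actually calling load_model() downloads the full 32B-parameter checkpoint, so a multi-GPU or high-memory host is assumed.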
Training Details
This model was fine-tuned on a specialized dataset: /e/scratch/jureap59/raoof1/sft_data/hf_hub/datasets--DCAgent--g1_min_episodes_e1_weighted_top4_100k_glm47_traces/snapshots/bfc1346c9f8ed847f8803c7e766846c69f1de24a_thinking_preprocessed. The training process involved:
- Learning Rate: 4e-05
- Optimizer: ADAMW_TORCH_FUSED with betas=(0.9, 0.98) and epsilon=1e-08
- Batch Size: A total training batch size of 96, distributed across 96 devices (i.e., 1 sequence per device)
- Epochs: 2.5
- Context Length: The model supports a context length of 32768 tokens.
Intended Uses & Limitations
The model card does not detail specific intended uses or known limitations. Users should evaluate the model's performance on tasks relevant to the g1_min_episodes_e1_weighted_top4_100k_glm47_traces dataset, since its fine-tuning data suggests specialization in the areas that data covers.