DCAgent/g1_min_episodes_e1_gpt_long_sampled_swesmith_psu_thinking_tacc-Qwen3-32B
DCAgent/g1_min_episodes_e1_gpt_long_sampled_swesmith_psu_thinking_tacc-Qwen3-32B is a 32 billion parameter language model, fine-tuned from Qwen/Qwen3-32B. This model was specifically trained on the /scratch/08134/negin/hub/datasets--DCAgent--g1_min_episodes_e1_gpt_long_sampled_swesmith_psu_d1_original_40k_glm47_traces/snapshots/857b3ce8060050ded9af40dc129460f566d0c635_thinking_preprocessed dataset. It is designed for tasks related to the specific data it was fine-tuned on, likely focusing on reasoning or thought processes captured in that dataset.
Loading preview...
Model Overview
This model, g1_min_episodes_e1_gpt_long_sampled_swesmith_psu_thinking_tacc-Qwen3-32B, is a specialized fine-tune of the Qwen3-32B base model. It leverages a 32 billion parameter architecture and a context length of 32768 tokens, making it suitable for processing extensive inputs.
Key Capabilities
- Specialized Fine-tuning: The model has been fine-tuned on a unique dataset:
/scratch/08134/negin/hub/datasets--DCAgent--g1_min_episodes_e1_gpt_long_sampled_swesmith_psu_d1_original_40k_glm47_traces/snapshots/857b3ce8060050ded9af40dc129460f566d0c635_thinking_preprocessed. This indicates a focus on tasks or data characteristics present within this specific training corpus. - Base Model Strength: Inherits the foundational capabilities of the Qwen3-32B model, which typically includes strong language understanding and generation.
Training Details
The fine-tuning process involved 7 epochs with a learning rate of 4e-05, utilizing a distributed setup across 32 GPUs. The optimizer used was ADAMW_TORCH_FUSED with specific beta and epsilon parameters, and a cosine learning rate scheduler with a 0.1 warmup ratio.
Good For
- Research and Development: Ideal for researchers and developers working with or interested in the specific data distribution of the fine-tuning dataset.
- Domain-Specific Applications: Potentially useful for applications that require understanding or generating text aligned with the 'thinking' processes or traces present in its training data.