DCAgent/d1_original_top4_seq_glm47
DCAgent/d1_original_top4_seq_glm47 is an 8-billion-parameter language model fine-tuned from Qwen/Qwen3-8B. It was trained on a dataset derived from `d1_original_top4_seq_glm47_traces`, indicating a specialization in sequential decision-making or agent-based tasks. The model retains the Qwen3 architecture and its 32768-token context length.
Model Overview
DCAgent/d1_original_top4_seq_glm47 is an 8-billion-parameter language model fine-tuned from the base Qwen/Qwen3-8B. It was trained on the dataset snapshot at `/e/scratch/jureap59/raoof1/sft_data/hf_hub/datasets--DCAgent--d1_original_top4_seq_glm47_traces/snapshots/51e9b1f18d6b9acfb3afe34371782c3ddc5a60c0_thinking_preprocessed`. Fine-tuning used a learning rate of 4e-05 over 7 epochs on a 16-device multi-GPU setup with a total batch size of 16.
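The reported hyperparameters can be summarized as a Hugging Face `TrainingArguments`-style mapping. This is an illustrative sketch, not the actual training script: the key names are hypothetical, only the values come from this card, and the per-device batch size of 1 is inferred by assuming no gradient accumulation (16 total / 16 devices).

```python
# Hypothetical reconstruction of the fine-tuning setup described above.
# Values are taken from the model card; key names are illustrative.
training_config = {
    "learning_rate": 4e-5,
    "num_train_epochs": 7,
    "per_device_train_batch_size": 1,  # assumed: 16 total / 16 devices, no grad accumulation
    "world_size": 16,                  # number of GPUs
    "lr_scheduler_type": "cosine",
    "warmup_ratio": 0.1,
    "optim": "adamw_torch_fused",
}

# The total batch size is the per-device batch times the device count.
total_batch = (training_config["per_device_train_batch_size"]
               * training_config["world_size"])
print(total_batch)  # 16
```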
Key Characteristics
- Base Model: Qwen/Qwen3-8B, with a 32768-token context length.
- Specialized Training: Fine-tuned on a unique dataset, suggesting optimization for tasks related to sequential decision-making or agent traces.
- Training Configuration: AdamW (torch fused) optimizer with a cosine learning-rate scheduler and a warmup ratio of 0.1.
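The schedule named above, cosine decay with a 0.1 linear warmup ratio peaking at the stated learning rate of 4e-05, can be sketched in plain Python. This is a minimal illustration of how such a scheduler typically behaves, not the exact implementation used in training:

```python
import math

PEAK_LR = 4e-5       # learning rate reported in the model card
WARMUP_RATIO = 0.1   # warmup ratio reported in the model card

def lr_at(step, total_steps, peak_lr=PEAK_LR, warmup_ratio=WARMUP_RATIO):
    """Linear warmup to peak_lr, then cosine decay toward zero."""
    warmup_steps = max(1, int(total_steps * warmup_ratio))
    if step < warmup_steps:
        # Linear warmup: 0 -> peak_lr over the first 10% of steps
        return peak_lr * step / warmup_steps
    # Cosine decay: peak_lr -> 0 over the remaining steps
    progress = (step - warmup_steps) / (total_steps - warmup_steps)
    return peak_lr * 0.5 * (1.0 + math.cos(math.pi * progress))

# Example: with 1000 total steps, the peak is reached after 100 warmup steps
print(lr_at(100, 1000))  # 4e-05
```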
Intended Use Cases
Given its specialized training data, this model is likely best suited for:
- Applications requiring understanding or generation based on sequential agent actions or thought processes.
- Research into agent behavior modeling or trace analysis.
- Tasks where the specific patterns within the `d1_original_top4_seq_glm47_traces` dataset are relevant.