DCAgent/g1_timeout_sampled_swesmith_psu
DCAgent/g1_timeout_sampled_swesmith_psu is an 8-billion-parameter language model fine-tuned from Qwen/Qwen3-8B. It was adapted on a dataset derived from g1_timeout_e1_gpt_long_sampled_swesmith_psu_d1_original_40k_glm47_traces, which suggests optimization for agentic behavior or trace-analysis tasks. Its 32768-token context length makes it suitable for processing long inputs in its specialized domain.
Model Overview
This model, DCAgent/g1_timeout_sampled_swesmith_psu, is an 8-billion-parameter language model fine-tuned from the Qwen/Qwen3-8B architecture. It was developed by DCAgent and adapted on a thinking-preprocessed snapshot of the dataset DCAgent/g1_timeout_e1_gpt_long_sampled_swesmith_psu_d1_original_40k_glm47_traces (the training config records it via a local Hub cache path, snapshot 170576926cc40350565fd69f93cff2b048596abb).
Training Details
The fine-tuning process involved specific hyperparameters:
- Learning Rate: 4e-05
- Batch Sizes: 1 (train), 8 (eval)
- Optimizer: ADAMW_TORCH_FUSED with betas=(0.9, 0.98) and epsilon=1e-08
- Scheduler: cosine learning-rate schedule with a 0.1 warmup ratio
- Epochs: 7.0
- Devices: 16 GPUs (multi-GPU distributed training)
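The learning-rate schedule above (cosine decay with a 0.1 warmup ratio and a 4e-05 peak) can be sketched in plain Python. This is an illustrative reimplementation, not the trainer's actual scheduler code, and `total_steps` is a made-up value for the example:

```python
import math

def lr_at(step, total_steps, peak_lr=4e-5, warmup_ratio=0.1):
    """Cosine schedule with linear warmup, mirroring the listed settings.

    Sketch only -- the real run used the training framework's built-in
    cosine scheduler with warmup_ratio=0.1.
    """
    warmup_steps = int(total_steps * warmup_ratio)
    if step < warmup_steps:
        # Linear warmup from 0 to peak_lr over the first 10% of steps.
        return peak_lr * step / max(1, warmup_steps)
    # Cosine decay from peak_lr down to 0 over the remaining steps.
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return peak_lr * 0.5 * (1.0 + math.cos(math.pi * progress))

# The peak is reached at the end of warmup and decays to ~0 at the end.
print(lr_at(100, 1000))   # end of warmup -> 4e-05
print(lr_at(1000, 1000))  # final step -> 0.0
```

Plotting `lr_at` over a full run reproduces the familiar ramp-then-cosine shape that the warmup ratio and scheduler settings describe.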
Potential Use Cases
Given its fine-tuning dataset, this model is likely optimized for tasks involving agentic systems, long reasoning traces, or the problem-solving scenarios the dataset name points to (its "swesmith" component appears to reference SWE-smith-style software-engineering traces). Its 32768-token context window supports the detailed, lengthy inputs such applications require.
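As a rough illustration of working within that window, the sketch below trims an over-long token sequence so that prompt plus generation fit the 32768-token context. `fit_to_context` and its parameters are hypothetical helpers invented for this example; real pipelines typically use the tokenizer's own truncation options instead:

```python
CONTEXT_LEN = 32768  # the model's advertised context length

def fit_to_context(token_ids, max_new_tokens=512, context_len=CONTEXT_LEN):
    """Keep only the most recent tokens so prompt + generation fit the window.

    Hypothetical helper for illustration only; it drops the oldest tokens,
    which suits chat-style histories where recency matters most.
    """
    budget = context_len - max_new_tokens
    return token_ids[-budget:]

# A 40000-token input is trimmed to its trailing 32256 tokens
# (32768 minus the 512 reserved for generation).
print(len(fit_to_context(list(range(40000)))))  # -> 32256
```

Short inputs pass through unchanged, since slicing a list shorter than the budget returns it whole.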