laion/exp_tas_timeout_multiplier_8_0_traces
The laion/exp_tas_timeout_multiplier_8_0_traces model is an 8-billion-parameter language model fine-tuned from Qwen/Qwen3-8B on the DCAgent/exp_tas_timeout_multiplier_8.0_traces dataset. It is intended for specialized tasks related to that dataset, likely involving trace analysis or timeout-multiplier experiments, and its primary application lies within the domain of its training data.
Model Overview
This model, laion/exp_tas_timeout_multiplier_8_0_traces, is an 8-billion-parameter language model fine-tuned from the Qwen/Qwen3-8B base checkpoint. It was trained on the DCAgent/exp_tas_timeout_multiplier_8.0_traces dataset, indicating specialization toward tasks involving that particular data.
Key Training Details
Fine-tuning used a learning rate of 4e-05 over 7 epochs, with a per-device train batch size of 1 and 2 gradient-accumulation steps across 8 GPUs, giving an effective batch size of 16. The optimizer was ADAMW_TORCH_FUSED with default betas and epsilon, and the learning rate followed a cosine schedule with a 0.1 warmup ratio.
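The batch-size arithmetic and learning-rate schedule above can be sketched as follows. This is an illustrative reconstruction, not the actual training script: the variable names mirror Hugging Face TrainingArguments conventions, and the schedule is a standard linear-warmup-plus-cosine-decay curve assumed from the stated settings.

```python
import math

# Hyperparameters reported in the model card.
per_device_train_batch_size = 1
gradient_accumulation_steps = 2
num_gpus = 8
peak_lr = 4e-05
warmup_ratio = 0.1

# Effective (total) batch size: 1 * 2 * 8 = 16.
effective_batch_size = (
    per_device_train_batch_size * gradient_accumulation_steps * num_gpus
)

def cosine_lr_with_warmup(step, total_steps,
                          peak_lr=peak_lr, warmup_ratio=warmup_ratio):
    """Linear warmup to peak_lr over the first warmup_ratio of training,
    then cosine decay to zero (assumed shape of the card's scheduler)."""
    warmup_steps = int(total_steps * warmup_ratio)
    if step < warmup_steps:
        return peak_lr * step / max(1, warmup_steps)
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return 0.5 * peak_lr * (1.0 + math.cos(math.pi * progress))
```

For example, with 1000 total optimizer steps the rate rises linearly to 4e-05 at step 100, then decays along the cosine curve back to zero at step 1000.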
Intended Use Cases
Given its specialized fine-tuning on the DCAgent/exp_tas_timeout_multiplier_8.0_traces dataset, this model is primarily intended for applications directly related to that data, and users should weigh this narrow training domain when judging its suitability for their tasks. A fuller account of intended uses and limitations would require more information about the dataset itself.