laion/Qwen3-8B_exp_tas_temp_0.25_traces_save-strategy_steps
Text Generation · Concurrency Cost: 1 · Model Size: 8B · Quant: FP8 · Ctx Length: 32k · Published: Jan 9, 2026 · License: apache-2.0 · Architecture: Transformer · Open Weights · Cold
This model is an 8-billion-parameter fine-tune of Qwen3-8B, a base architecture developed by Qwen. It was fine-tuned on the DCAgent/exp_tas_temp_0.25_traces dataset using a cosine learning-rate scheduler with a warmup ratio of 0.005 over 8 epochs on a distributed multi-GPU setup. Its primary differentiation from the base model is this specialized fine-tuning.
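The learning-rate schedule described above (cosine decay with a 0.005 warmup ratio) can be sketched as a small standalone function. This is a minimal illustration, not the training code used for this model; the peak learning rate `base_lr` is an assumed placeholder, since the card does not state it.

```python
import math

def cosine_lr_with_warmup(step, total_steps, warmup_ratio=0.005, base_lr=1e-5):
    """Learning rate at a given step: linear warmup for the first
    warmup_ratio fraction of steps, then cosine decay to zero.
    base_lr is a hypothetical value; the card does not specify it."""
    warmup_steps = int(total_steps * warmup_ratio)
    if step < warmup_steps:
        # Linear ramp from 0 up to base_lr.
        return base_lr * step / max(1, warmup_steps)
    # Cosine decay from base_lr down to 0 over the remaining steps.
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return base_lr * 0.5 * (1.0 + math.cos(math.pi * progress))
```

For example, with `total_steps=1000` the warmup covers only the first 5 steps, after which the rate decays smoothly, reaching zero at the final step.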