DCAgent/b1_top16
Text generation | Concurrency cost: 1 | Model size: 8B | Quant: FP8 | Context length: 32k | Published: Apr 7, 2026 | License: other | Architecture: Transformer

DCAgent/b1_top16 is an 8-billion-parameter language model fine-tuned from Qwen/Qwen3-8B. It was trained with a context length of 32,768 tokens on the dataset at /e/scratch/jureap59/raoof1/sft_data/hf_hub/datasets--DCAgent--b1_top16/snapshots/2be82814777f95e38b73694deed12e34f91ca466_thinking_preprocessed, so it should perform best on tasks covered by that fine-tuning data. Training ran for 7 epochs using a cosine learning-rate scheduler and the AdamW_Torch_Fused optimizer.
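The cosine learning-rate scheduler mentioned above decays the learning rate along a half-cosine curve from its peak down to a floor. As a minimal sketch (the base rate, floor, and step counts below are illustrative, not the values used to train this model):

```python
import math

def cosine_lr(step: int, total_steps: int, base_lr: float, min_lr: float = 0.0) -> float:
    """Cosine-decay schedule: base_lr at step 0, min_lr at the final step."""
    progress = step / max(1, total_steps)          # fraction of training completed
    cosine = 0.5 * (1.0 + math.cos(math.pi * progress))  # 1.0 -> 0.0 over training
    return min_lr + (base_lr - min_lr) * cosine

# Illustrative values: peak LR 2e-5, decayed to 0 over 1000 steps.
print(cosine_lr(0, 1000, 2e-5))     # full rate at the start
print(cosine_lr(500, 1000, 2e-5))   # half the peak at the midpoint
print(cosine_lr(1000, 1000, 2e-5))  # decayed to the floor at the end
```

In practice this schedule is usually combined with a short linear warmup at the start of training; the sketch omits that for brevity.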
