DCAgent/b1_top2
Text Generation | Concurrency Cost: 1 | Model Size: 8B | Quant: FP8 | Ctx Length: 32k | Published: Apr 6, 2026 | License: other | Architecture: Transformer | Cold

DCAgent/b1_top2 is an 8-billion-parameter language model fine-tuned from Qwen/Qwen3-8B on the dataset at /e/scratch/jureap59/raoof1/sft_data/hf_hub/datasets--DCAgent--b1_top2/snapshots/7330ca104c461646ff245d24b334368c45841bf0_thinking_preprocessed. It has a context length of 32768 tokens and was fine-tuned for 7 epochs at a learning rate of 4e-05. The available documentation does not describe its specific capabilities or intended uses.


Overview

DCAgent/b1_top2 is an 8-billion-parameter language model fine-tuned from Qwen/Qwen3-8B. The fine-tuning dataset is identified only by a filesystem path, /e/scratch/jureap59/raoof1/sft_data/hf_hub/datasets--DCAgent--b1_top2/snapshots/7330ca104c461646ff245d24b334368c45841bf0_thinking_preprocessed; the model card does not describe its contents or the domain it targets. The model supports a context length of 32768 tokens, so a prompt and its generated continuation must fit together within that window.
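To make the 32768-token limit concrete, the following is a minimal sketch of a token-budget check. It assumes the usual decoder-only convention that prompt tokens and generated tokens share one context window; the helper name `max_new_tokens` is hypothetical and not part of any library, and the prompt length is expected to come from whatever tokenizer you use with the model.

```python
MAX_CONTEXT_TOKENS = 32768  # context length stated in the model card


def max_new_tokens(prompt_tokens: int, context_limit: int = MAX_CONTEXT_TOKENS) -> int:
    """Return how many tokens can still be generated after the prompt.

    Hypothetical helper: assumes prompt and output share the same window.
    """
    if prompt_tokens < 0:
        raise ValueError("prompt_tokens must be non-negative")
    # Clamp at zero: a prompt at or over the limit leaves no room to generate.
    return max(0, context_limit - prompt_tokens)


if __name__ == "__main__":
    for n in (1_000, 30_000, 40_000):
        print(f"prompt={n} tokens -> room for {max_new_tokens(n)} new tokens")
```

A prompt longer than the limit returns 0 here rather than raising, so the caller can decide whether to truncate or reject the input.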

Training Details

The fine-tuning process involved several key hyperparameters:

  • Base Model: Qwen/Qwen3-8B
  • Learning Rate: 4e-05
  • Optimizer: ADAMW_TORCH_FUSED with betas=(0.9, 0.98) and epsilon=1e-08
  • Scheduler: Cosine learning rate scheduler with a warmup ratio of 0.1
  • Epochs: 7.0
  • Batch Size: A total training batch size of 16 across 16 devices.
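
The scheduler settings above can be illustrated with a short sketch. It assumes the common form of this schedule: a linear warmup from 0 to the peak learning rate over the first 10% of steps, followed by cosine decay to 0. The model card does not state the exact formula or the total number of training steps, so `TOTAL_STEPS` and the function name `cosine_lr_with_warmup` are hypothetical and chosen only for illustration.

```python
import math

PEAK_LR = 4e-05      # learning rate from the model card
WARMUP_RATIO = 0.1   # warmup ratio from the model card
TOTAL_STEPS = 1000   # hypothetical: the model card does not report a step count


def cosine_lr_with_warmup(step: int, total_steps: int = TOTAL_STEPS,
                          peak_lr: float = PEAK_LR,
                          warmup_ratio: float = WARMUP_RATIO) -> float:
    """Learning rate at a 0-based step: linear warmup, then cosine decay to 0."""
    warmup_steps = math.ceil(total_steps * warmup_ratio)
    if step < warmup_steps:
        # Linear ramp from 0 up to peak_lr during warmup.
        return peak_lr * step / max(1, warmup_steps)
    # Fraction of the post-warmup phase completed, in [0, 1].
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return peak_lr * 0.5 * (1.0 + math.cos(math.pi * progress))


if __name__ == "__main__":
    for s in (0, 50, 100, 550, 1000):
        print(f"step {s:4d}: lr = {cosine_lr_with_warmup(s):.2e}")
```

Under these assumptions the rate peaks at 4e-05 when warmup ends and reaches half of that value midway through the decay phase.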

Intended Uses & Limitations

The model card does not describe specific intended uses or limitations for DCAgent/b1_top2. Developers should consult the Qwen/Qwen3-8B documentation for general capabilities and run their own evaluations before relying on the model for a particular task. Fine-tuning on a dedicated dataset may give the model strengths related to that dataset's content, but the card provides no evaluation results or dataset description to confirm this.