DCAgent/b1_top8_seq
  • Pipeline: Text Generation
  • Model Size: 8B
  • Quantization: FP8
  • Context Length: 32k
  • Concurrency Cost: 1
  • Published: Apr 7, 2026
  • License: other
  • Architecture: Transformer

DCAgent/b1_top8_seq is an 8-billion-parameter language model fine-tuned from Qwen/Qwen3-8B. It was trained on the /scratch/08134/negin/hub/datasets--DCAgent--b1_top8_seq/snapshots/431317fbde90fded83a2730a01e3e4bcc5981bd2 dataset. Its specific optimizations and primary use cases are not documented.


Overview

DCAgent/b1_top8_seq is an 8-billion-parameter language model fine-tuned from the base model Qwen/Qwen3-8B. The fine-tuning run used the dataset located at /scratch/08134/negin/hub/datasets--DCAgent--b1_top8_seq/snapshots/431317fbde90fded83a2730a01e3e4bcc5981bd2.
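
The card does not include usage instructions. Below is a minimal inference sketch, assuming the model loads through the standard transformers API and inherits the Qwen3 chat template from its base model; the prompt and generation settings are illustrative placeholders, not values from this card.

```python
# Minimal inference sketch (assumptions: standard transformers loading,
# Qwen3-style chat template inherited from the base model).
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "DCAgent/b1_top8_seq"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype="auto",   # load the published weight dtype
    device_map="auto",
)

# Illustrative prompt; format it with the tokenizer's chat template.
messages = [{"role": "user", "content": "Explain beam search in two sentences."}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output_ids = model.generate(input_ids, max_new_tokens=256)
# Decode only the newly generated tokens.
print(tokenizer.decode(output_ids[0][input_ids.shape[-1]:], skip_special_tokens=True))
```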

Training Details

The model was trained with the following key hyperparameters (a configuration sketch follows the list):

  • Learning Rate: 4e-05
  • Optimizer: ADAMW_TORCH_FUSED with betas=(0.9, 0.98) and epsilon=1e-08
  • Batch Size: A total training batch size of 16 (1 per device across 16 GPUs)
  • Epochs: 7.0
  • LR Scheduler: Cosine, with a warmup ratio of 0.1
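
For reference, these hyperparameters map onto a Hugging Face TrainingArguments configuration as sketched below. This assumes the model was fine-tuned with the transformers Trainer, which the card does not confirm; only the listed hyperparameters come from this card, and the output directory is a hypothetical placeholder.

```python
# Configuration sketch (assumption: Hugging Face Trainer was used).
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="b1_top8_seq",        # hypothetical placeholder
    learning_rate=4e-05,
    optim="adamw_torch_fused",
    adam_beta1=0.9,
    adam_beta2=0.98,
    adam_epsilon=1e-08,
    per_device_train_batch_size=1,   # total batch size 16 across 16 GPUs
    num_train_epochs=7.0,
    lr_scheduler_type="cosine",
    warmup_ratio=0.1,
)
```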

Limitations

The available documentation does not specify the model's intended uses, limitations, or the nature of its training and evaluation data. Users should exercise caution and evaluate the model themselves before relying on it for specific applications.