Model Overview
DCAgent/b1_top32 is an 8-billion-parameter language model fine-tuned from Qwen/Qwen3-8B. It was trained on the dataset snapshot located at /scratch/08134/negin/hub/datasets--DCAgent--b1_top32/snapshots/672f249bde596b1bd0c44d2ba49e33deda128ebd.
Training Details
The model was trained with the following hyperparameters:
- Learning rate: 4e-05
- Per-device train batch size: 1; per-device eval batch size: 8
- Distributed training across 16 devices, giving a total train batch size of 16 and a total eval batch size of 128
- Optimizer: ADAMW_TORCH_FUSED with specific beta and epsilon values
- Learning rate scheduler: cosine, with a warmup ratio of 0.1, over 7 epochs

The training environment included Transformers 4.57.3, PyTorch 2.9.0+cu128, Datasets 4.4.1, and Tokenizers 0.22.1.
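The total batch sizes reported above follow directly from the per-device settings and the device count. A minimal sketch of the configuration (a plain Python dict for illustration, not an actual `TrainingArguments` object, and assuming no gradient accumulation):

```python
# Sketch of the reported fine-tuning configuration. Values are taken from
# the model card; "num_devices" reflects the 16-device distributed setup.
config = {
    "learning_rate": 4e-05,
    "per_device_train_batch_size": 1,
    "per_device_eval_batch_size": 8,
    "num_devices": 16,
    "optimizer": "ADAMW_TORCH_FUSED",
    "lr_scheduler_type": "cosine",
    "warmup_ratio": 0.1,
    "num_train_epochs": 7,
}

# Effective batch size = per-device batch size x number of devices.
total_train_batch_size = config["per_device_train_batch_size"] * config["num_devices"]
total_eval_batch_size = config["per_device_eval_batch_size"] * config["num_devices"]
print(total_train_batch_size, total_eval_batch_size)  # 16 128
```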
Key Characteristics
- Base Model: Qwen3-8B
- Parameter Count: 8 billion
- Context Length: 32768 tokens
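Because the context window is fixed at 32768 tokens, longer inputs must be truncated or windowed before inference. A minimal sketch in pure Python (the helper name and the token-ID lists are illustrative, with no tokenizer dependency):

```python
MAX_CONTEXT = 32768  # context length stated on this model card


def truncate_to_context(token_ids, max_len=MAX_CONTEXT, keep="tail"):
    """Clamp a token-ID sequence to the model's context window.

    keep="tail" keeps the most recent tokens (typical for chat history);
    keep="head" keeps the beginning of the document instead.
    """
    if len(token_ids) <= max_len:
        return token_ids
    return token_ids[-max_len:] if keep == "tail" else token_ids[:max_len]


ids = list(range(40000))  # stand-in for a tokenized input that is too long
clipped = truncate_to_context(ids)
print(len(clipped))  # 32768
```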
Further details regarding its specific intended uses, limitations, and performance benchmarks are not provided in the available documentation.