DCAgent/b1_top8

Text Generation · Model Size: 8B · Quantization: FP8 · Context Length: 32k · Concurrency Cost: 1 · Published: Apr 7, 2026 · License: other · Architecture: Transformer

DCAgent/b1_top8 is an 8-billion-parameter language model fine-tuned from Qwen/Qwen3-8B. It is fine-tuned on the DCAgent/b1_top8 dataset, indicating a specialization for tasks related to that training data. Its 32,768-token context window makes it suitable for processing long inputs and generating detailed responses.
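A minimal usage sketch with the Transformers pipeline API, assuming the checkpoint is loadable by the repo id DCAgent/b1_top8 and inherits its base model's chat template (both assumptions; the card does not publish a quickstart):

```python
# Minimal inference sketch; the repo id and prompt are illustrative assumptions.
from transformers import pipeline

generator = pipeline(
    "text-generation",
    model="DCAgent/b1_top8",  # assumed Hub repo id, taken from the model name
    torch_dtype="auto",
    device_map="auto",
)

messages = [{"role": "user", "content": "Summarize the trade-offs of long-context inference."}]
result = generator(messages, max_new_tokens=256)

# The pipeline returns the full chat; the last message is the model's reply.
print(result[0]["generated_text"][-1]["content"])
```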

Overview

DCAgent/b1_top8 is an 8-billion-parameter language model fine-tuned from the Qwen/Qwen3-8B base model. It was trained on a thinking-preprocessed snapshot of the DCAgent/b1_top8 dataset (local cache path: /e/scratch/jureap59/raoof1/sft_data/hf_hub/datasets--DCAgent--b1_top8/snapshots/0261a53fb1e70a7ba1767f28710756d33ed1048e_thinking_preprocessed). Fine-tuning used a learning rate of 4e-05, a total training batch size of 16 across 16 GPUs, and 7 epochs, with the AdamW optimizer and a cosine learning-rate scheduler.

Key Characteristics

  • Base Model: Qwen3-8B
  • Parameter Count: 8 billion
  • Context Length: 32,768 tokens
  • Training Data: Fine-tuned on a specific dataset (DCAgent/b1_top8_thinking_preprocessed), suggesting specialized capabilities related to this data.
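To confirm the advertised context window programmatically, the checkpoint's config can be inspected; a small sketch, assuming the repo id above and the Qwen3-style max_position_embeddings field:

```python
# Sketch: read the context-window setting from the model config.
# The field name follows Qwen3's config convention; the card advertises 32,768 tokens.
from transformers import AutoConfig

config = AutoConfig.from_pretrained("DCAgent/b1_top8")
print(config.max_position_embeddings)
```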

Training Details

The model was trained using:

  • Learning Rate: 4e-05
  • Optimizer: AdamW (fused PyTorch implementation, adamw_torch_fused)
  • LR Scheduler: Cosine with 0.1 warmup ratio
  • Epochs: 7.0
  • Frameworks: Transformers 4.57.6, PyTorch 2.9.1+cu130, Datasets 4.7.0, Tokenizers 0.22.2
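These hyperparameters map directly onto transformers.TrainingArguments. The sketch below is a hedged reconstruction, not the authors' published script; the output directory is hypothetical, and the per-device batch size of 1 (1 per GPU x 16 GPUs = total 16) is inferred arithmetic:

```python
from transformers import TrainingArguments

# Hedged reconstruction of the reported setup; not the authors' actual script.
training_args = TrainingArguments(
    output_dir="b1_top8-sft",       # hypothetical output path
    learning_rate=4e-05,
    optim="adamw_torch_fused",      # fused AdamW, as listed above
    lr_scheduler_type="cosine",
    warmup_ratio=0.1,
    num_train_epochs=7.0,
    per_device_train_batch_size=1,  # assumed: 1 per GPU x 16 GPUs = total 16
)
```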

Potential Use Cases

Given its fine-tuning on the DCAgent/b1_top8_thinking_preprocessed dataset, this model is likely best suited for tasks that align with that data's domain and characteristics. Developers should evaluate its performance on representative tasks requiring deep understanding or generation within that specialized domain; a quick spot-check like the sketch below is a reasonable starting point.
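The sketch assumes the fine-tune keeps Qwen3's chat template, whose enable_thinking kwarg toggles the reasoning trace; the repo id, prompts, and preservation of that template behavior are all assumptions:

```python
# Spot-check sketch; repo id, prompts, and enable_thinking support are assumptions.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "DCAgent/b1_top8"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype="auto", device_map="auto")

domain_prompts = [
    "Replace with a representative prompt from the target domain.",
]

for prompt in domain_prompts:
    input_ids = tokenizer.apply_chat_template(
        [{"role": "user", "content": prompt}],
        add_generation_prompt=True,
        enable_thinking=True,  # Qwen3 template kwarg, if preserved by the fine-tune
        return_tensors="pt",
    ).to(model.device)
    output_ids = model.generate(input_ids, max_new_tokens=1024)
    # Decode only the newly generated tokens, skipping the prompt.
    print(tokenizer.decode(output_ids[0][input_ids.shape[-1]:], skip_special_tokens=True))
```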