Model Overview
DCAgent/b1_top1 is an 8-billion-parameter language model fine-tuned from the Qwen/Qwen3-8B base architecture. It was trained on the dataset /e/scratch/jureap59/raoof1/sft_data/hf_hub/datasets--DCAgent--b1_top1/snapshots/b2309d14459711bdc32a92285257bc916445bbdc_thinking_preprocessed, indicating a focus on particular tasks or domains. The model supports a context length of 32768 tokens, allowing it to process and reason over lengthy inputs.
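As a hypothetical usage sketch (not taken from the model card), the model can be loaded with the standard Hugging Face transformers chat workflow; the generation settings below are illustrative assumptions:

```python
# Hypothetical usage sketch for DCAgent/b1_top1, assuming the standard
# transformers causal-LM API. Repo id matches the card; everything else
# (prompt, max_new_tokens) is illustrative.
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_ID = "DCAgent/b1_top1"
MAX_CONTEXT = 32768  # context length stated in the model card

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
model = AutoModelForCausalLM.from_pretrained(
    MODEL_ID, torch_dtype="auto", device_map="auto"
)

messages = [{"role": "user", "content": "Summarize the attention mechanism."}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

# Guard against exceeding the 32768-token window before generating.
assert inputs.shape[-1] < MAX_CONTEXT

outputs = model.generate(inputs, max_new_tokens=512)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```

Loading an 8B model this way typically requires a GPU with sufficient memory (or quantization), which is why `device_map="auto"` is used here.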
Training Details
The fine-tuning process utilized specific hyperparameters to optimize performance:
- Learning Rate: 4e-05
- Batch Sizes: 1 (train), 8 (eval)
- Optimizer: ADAMW_TORCH_FUSED with betas=(0.9, 0.98) and epsilon=1e-08
- LR Scheduler: Cosine with a warmup ratio of 0.1
- Epochs: 7.0
- Distributed Training: Multi-GPU setup across 16 devices.
This configuration suggests a thorough fine-tuning run aimed at adapting the base Qwen3-8B model to the domain of the custom training dataset. Intended uses and limitations are not documented here, but the fine-tuned nature implies stronger performance on tasks aligned with its training data.
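The hyperparameters above can be collected into a single configuration; this is a reconstruction whose keys mirror Hugging Face `TrainingArguments` field names, not the authors' actual training script:

```python
# Reconstruction of the reported fine-tuning hyperparameters as a plain dict
# whose keys mirror Hugging Face TrainingArguments field names. This is a
# sketch from the model card, not the authors' original configuration.
sft_config = {
    "learning_rate": 4e-05,
    "per_device_train_batch_size": 1,
    "per_device_eval_batch_size": 8,
    "optim": "adamw_torch_fused",
    "adam_beta1": 0.9,
    "adam_beta2": 0.98,
    "adam_epsilon": 1e-08,
    "lr_scheduler_type": "cosine",
    "warmup_ratio": 0.1,
    "num_train_epochs": 7.0,
    "world_size": 16,  # multi-GPU setup across 16 devices
}

# Effective global train batch size = per-device batch * number of devices
# (assuming no gradient accumulation, which the card does not mention).
global_batch = (
    sft_config["per_device_train_batch_size"] * sft_config["world_size"]
)
print(global_batch)  # → 16
```

With a per-device batch of 1 across 16 GPUs, the effective global batch size is 16 unless gradient accumulation was also used.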