Name: DCAgent/b1_top2 API
Brand: Featherless.ai
Price: 10.00 USD
Availability: InStock
Author: DCAgent

Overview

DCAgent/b1_top2 is an 8 billion parameter language model derived from the Qwen3-8B architecture. It has been fine-tuned on a specific dataset, /e/scratch/jureap59/raoof1/sft_data/hf_hub/datasets--DCAgent--b1_top2/snapshots/7330ca104c461646ff245d24b334368c45841bf0_thinking_preprocessed, suggesting a specialized application or domain for its training. The model supports a substantial context length of 32768 tokens, which is beneficial for processing longer inputs and maintaining conversational coherence over extended interactions.

Training Details

The fine-tuning process involved several key hyperparameters:

Base Model: Qwen/Qwen3-8B
Learning Rate: 4e-05
Optimizer: ADAMW_TORCH_FUSED with betas=(0.9, 0.98) and epsilon=1e-08
Scheduler: Cosine learning rate scheduler with a warmup ratio of 0.1
Epochs: 7.0
Batch Size: A total training batch size of 16 across 16 devices.

Intended Uses & Limitations

Specific intended uses and limitations for DCAgent/b1_top2 are not detailed in the provided model card. Developers should refer to the base model's documentation for general capabilities and conduct further evaluation to determine its suitability for particular tasks. The specialized training dataset implies potential strengths in areas related to the dataset's content, but without further information, its primary differentiators remain to be fully defined.

Overview

Overview

Training Details

Intended Uses & Limitations

Full Model Card (README)