Name: DCAgent/b1_top16 API
Brand: Featherless.ai
Price: 10.00 USD
Availability: InStock
Author: DCAgent

DCAgent/b1_top16: Fine-tuned Qwen3-8B Model

DCAgent/b1_top16 is an 8 billion parameter language model derived from the Qwen/Qwen3-8B architecture. This model has undergone specific fine-tuning on a unique dataset, /e/scratch/jureap59/raoof1/sft_data/hf_hub/datasets--DCAgent--b1_top16/snapshots/2be82814777f95e38b73694deed12e34f91ca466_thinking_preprocessed, indicating a specialization for tasks aligned with this data.

Key Training Details

Base Model: Qwen/Qwen3-8B
Parameter Count: 8 billion
Context Length: 32768 tokens
Learning Rate: 4e-05
Optimizer: AdamW_Torch_Fused with betas=(0.9, 0.98) and epsilon=1e-08
Scheduler: Cosine learning rate scheduler with 0.1 warmup ratio
Epochs: 7.0
Batch Size: 1 (train), 8 (eval) with a total effective batch size of 16 (train) and 128 (eval) across 16 devices.

Potential Use Cases

Given its fine-tuning on a specific dataset, this model is likely best suited for applications that align with the characteristics and content of the b1_top16_thinking_preprocessed dataset. Developers should investigate the nature of this dataset to determine optimal use cases. Its 32K context window allows for processing longer inputs relevant to its specialized domain.

Overview

DCAgent/b1_top16: Fine-tuned Qwen3-8B Model

Key Training Details

Potential Use Cases

Full Model Card (README)