DCAgent/FourDatasetMixQwen3_8B
Model Overview
DCAgent/FourDatasetMixQwen3_8B is an 8-billion-parameter language model derived from the Qwen/Qwen3-8B base architecture. It was fine-tuned on the otagents_10k dataset, indicating a specialization for the tasks and interaction patterns represented there. The model supports a context length of 32768 tokens, making it suitable for processing and generating long text sequences.
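To illustrate basic usage, here is a minimal inference sketch with Hugging Face transformers. Only the repo id and the 32768-token context length come from this card; the dtype, device placement, prompt, and generation settings are illustrative assumptions, and the chat-template call assumes the fine-tune kept the base model's template.

```python
# Minimal inference sketch using Hugging Face transformers.
# Only the repo id and 32768-token context come from this card; dtype,
# device placement, the prompt, and generation settings are assumptions.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "DCAgent/FourDatasetMixQwen3_8B"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # assumption: bf16 on a recent GPU
    device_map="auto",
)

# Assumes the fine-tune kept the base model's chat template.
messages = [{"role": "user", "content": "Outline a plan to summarize a long report."}]
input_ids = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    return_tensors="pt",
).to(model.device)

# max_new_tokens is illustrative; the context window allows much longer sequences.
output_ids = model.generate(input_ids, max_new_tokens=512)
print(tokenizer.decode(output_ids[0][input_ids.shape[-1]:], skip_special_tokens=True))
```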
Training Details
The fine-tuning process used the following key hyperparameters (a hedged reconstruction as a transformers training configuration follows the list):
- Learning Rate: 4e-05
- Batch Size: Effective batch size of 16 (per-device batch size of 1 × 4 gradient accumulation steps × 4 GPUs).
- Optimizer: AdamW with betas (0.9, 0.98) and epsilon 1e-08.
- Scheduler: Cosine learning rate scheduler with a 0.1 warmup ratio.
- Epochs: Trained for 5.0 epochs.
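For reference, the listed values map onto transformers.TrainingArguments roughly as sketched below. This is a hypothetical reconstruction: the output directory, the optimizer implementation string, and the 4-GPU launch (e.g. via torchrun or accelerate) are assumptions, not details from this card.

```python
# Hypothetical reconstruction of the run's configuration as
# transformers.TrainingArguments. Only the hyperparameter values listed
# above come from this card; output_dir, the optimizer implementation
# string, and the 4-GPU launch (torchrun/accelerate) are assumptions.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="fourdatasetmix-qwen3-8b",  # assumption
    learning_rate=4e-5,
    per_device_train_batch_size=1,  # 1 x 4 accumulation steps x 4 GPUs = 16 effective
    gradient_accumulation_steps=4,
    num_train_epochs=5.0,
    lr_scheduler_type="cosine",
    warmup_ratio=0.1,
    optim="adamw_torch",  # assumption: PyTorch AdamW variant
    adam_beta1=0.9,
    adam_beta2=0.98,
    adam_epsilon=1e-8,
)
```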
Potential Use Cases
Given its fine-tuning on the otagents_10k dataset, this model is likely best suited to applications resembling that dataset's content: agent-style interactions, dialogue systems, and data generation within the same domain. Developers should evaluate its performance on their own tasks before deployment.