DCAgent/g1_clean_hybrid_25k_8b
DCAgent/g1_clean_hybrid_25k_8b is an 8-billion-parameter language model fine-tuned from Qwen/Qwen3-8B on the g1_clean_hybrid_scaffold_25k_glm47_traces dataset. It is intended for tasks that match the content of that training data and supports a context length of 32768 tokens.
Model Overview
DCAgent/g1_clean_hybrid_25k_8b is an 8-billion-parameter language model fine-tuned from the base Qwen/Qwen3-8B architecture. It was trained on the thinking-preprocessed snapshot of the DCAgent/g1_clean_hybrid_scaffold_25k_glm47_traces dataset (the training configuration records this as a local Hugging Face cache path to that snapshot).
Training Details
Fine-tuning used the following hyperparameters:
- Base Model: Qwen/Qwen3-8B
- Learning Rate: 4e-05
- Per-device Batch Size: 1 (train), 8 (eval)
- Gradient Accumulation: 2 steps
- Total Training Batch Size: 96
- Optimizer: ADAMW_TORCH_FUSED with betas=(0.9, 0.98) and epsilon=1e-08
- LR Scheduler: Cosine with 0.1 warmup ratio
- Epochs: 7.0
- Devices: 48 GPUs (multi-GPU training)
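The listed values are internally consistent: the total training batch size of 96 is the product of the per-device train batch size, the gradient-accumulation steps, and the device count. A minimal check, using only the numbers reported above:

```python
# Check that the reported hyperparameters multiply out to the stated
# total training batch size (96).
per_device_train_batch = 1   # per-device train batch size
grad_accum_steps = 2         # gradient accumulation steps
num_devices = 48             # GPUs in the training run

effective_batch = per_device_train_batch * grad_accum_steps * num_devices
print(effective_batch)  # 96, matching the reported total training batch size
```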
Intended Use Cases
Given its fine-tuning on g1_clean_hybrid_scaffold_25k_glm47_traces, this model is best suited for applications that match the characteristics and content of that dataset. Developers should weigh this specialization before applying the model to tasks outside that domain. The model supports a context length of 32768 tokens.
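A minimal usage sketch with the Hugging Face transformers library is shown below. It assumes the checkpoint is published as DCAgent/g1_clean_hybrid_25k_8b, that an installed transformers version supports the Qwen3 architecture, and that the machine has network access plus enough memory for an 8B model; the prompt is purely illustrative.

```python
# Hedged sketch: load the fine-tuned model and generate a reply.
# Assumptions (not confirmed by this card): the repo id below is publicly
# downloadable, and the tokenizer ships a chat template inherited from Qwen3.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "DCAgent/g1_clean_hybrid_25k_8b"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype="auto",   # pick the checkpoint's native precision
    device_map="auto",    # spread layers across available devices
)

# Build a chat-formatted prompt (example content only).
messages = [{"role": "user", "content": "Explain your intended use in one sentence."}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=256)
# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```

Generation parameters (sampling temperature, max tokens) are left at library defaults here; for long inputs, stay within the 32768-token context window.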