DCAgent/g1_weighted_100k_8b_v2
Model Overview
DCAgent/g1_weighted_100k_8b_v2 is an 8 billion parameter language model fine-tuned from the base model Qwen/Qwen3-8B. It was developed by DCAgent and trained on the dataset stored at /e/scratch/jureap59/raoof1/sft_data/hf_hub/datasets--DCAgent--g1_min_episodes_e1_weighted_top4_100k_glm47_traces, a Hugging Face Hub cache path corresponding to the dataset DCAgent/g1_min_episodes_e1_weighted_top4_100k_glm47_traces. The fine-tuning targets the tasks and data patterns represented in that corpus.
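Because the model is derived from Qwen/Qwen3-8B, it should load with the standard transformers causal-LM classes. The sketch below assumes the checkpoint is published on the Hugging Face Hub under the id DCAgent/g1_weighted_100k_8b_v2 and that torch and accelerate are installed; the prompt is purely illustrative.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Assumption: the checkpoint is available on the Hub under this id.
model_id = "DCAgent/g1_weighted_100k_8b_v2"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype="auto",   # use the precision stored in the checkpoint
    device_map="auto",    # requires accelerate; places weights automatically
)

# Qwen3-derived checkpoints ship a chat template, so format the prompt with it.
messages = [{"role": "user", "content": "Summarize what supervised fine-tuning does."}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=256)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```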
Training Details
The model underwent fine-tuning with the following key hyperparameters:
- Learning Rate: 4e-05
- Optimizer: ADAMW_TORCH_FUSED with betas=(0.9, 0.98) and epsilon=1e-08
- Batch Size: effective training batch size of 96 (per-device train batch size 1 × gradient accumulation steps 2 × 48 devices)
- Epochs: 5.0
- Scheduler: Cosine learning rate scheduler with a warmup ratio of 0.1
This configuration adapts the base Qwen3-8B model to the DCAgent-specific dataset over five epochs, with a warmup phase followed by a cosine decay of the learning rate. The model retains a 32768 token context length, allowing it to handle substantial input sequences.
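For reference, here is a minimal sketch of how these hyperparameters might be expressed as Hugging Face TrainingArguments. The output directory is a placeholder, and the 48-device layout would be supplied by the distributed launcher rather than by this object; only the values listed above are taken from the card.

```python
from transformers import TrainingArguments

# Minimal sketch mapping the reported hyperparameters onto TrainingArguments.
# output_dir is a placeholder; the 48 devices come from the launcher, giving
# an effective batch size of 1 * 2 * 48 = 96.
training_args = TrainingArguments(
    output_dir="./g1_weighted_100k_8b_v2",  # hypothetical path
    learning_rate=4e-05,
    optim="adamw_torch_fused",
    adam_beta1=0.9,
    adam_beta2=0.98,
    adam_epsilon=1e-08,
    per_device_train_batch_size=1,
    gradient_accumulation_steps=2,
    num_train_epochs=5.0,
    lr_scheduler_type="cosine",
    warmup_ratio=0.1,
)
```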