Name: DCAgent/d1_hardened_top4_seq_glm47 API
Brand: Featherless.ai
Price: 10.00 USD
Availability: InStock
Author: DCAgent

Model Overview

DCAgent/d1_hardened_top4_seq_glm47 is an 8-billion-parameter language model fine-tuned from Qwen/Qwen3-8B. It was trained on the dataset stored at /e/scratch/jureap59/raoof1/sft_data/hf_hub/datasets--DCAgent--d1_hardened_top4_seq_glm47_traces. The dataset name suggests that it consists of traces, but this overview does not describe its contents or the tasks it targets.

Key Training Details

Fine-tuning used the following hyperparameters:

Learning Rate: 4e-05
Optimizer: ADAMW_TORCH_FUSED with betas=(0.9, 0.98) and epsilon=1e-08
Scheduler: Cosine learning rate scheduler with a 0.1 warmup ratio
Epochs: 7.0
Batch Size: 16 in total across 16 devices, which works out to a per-device batch size of 1 with no gradient accumulation.
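
The scheduler entry above can be illustrated with a short sketch of how a cosine schedule with a 0.1 warmup ratio would shape the learning rate over training. This is a minimal sketch and not the training code itself: it assumes linear warmup from zero to the peak rate of 4e-05 followed by a single cosine decay to zero, which is the common form of this schedule. The function name lr_at_step and the total step count are hypothetical, since the listing does not state how many optimizer steps were run.

```python
import math

PEAK_LR = 4e-5      # learning rate from the hyperparameter list
WARMUP_RATIO = 0.1  # warmup ratio from the hyperparameter list


def lr_at_step(step: int, total_steps: int) -> float:
    """Learning rate at `step`, assuming linear warmup then cosine decay to zero."""
    warmup_steps = math.ceil(total_steps * WARMUP_RATIO)
    if step < warmup_steps:
        # Linear ramp from 0 up to the peak learning rate.
        return PEAK_LR * step / max(1, warmup_steps)
    # Fraction of the post-warmup phase completed, from 0.0 to 1.0.
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    # Half cosine wave: peak rate at progress 0, zero at progress 1.
    return PEAK_LR * 0.5 * (1.0 + math.cos(math.pi * progress))


if __name__ == "__main__":
    total = 1000  # hypothetical step count, not taken from the listing
    for s in (0, 50, 100, 550, 1000):
        print(f"step {s:4d}: lr = {lr_at_step(s, total):.3e}")
```

Under these assumptions the rate climbs during the first tenth of training, peaks at 4e-05, and then falls smoothly to zero by the final step.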

Potential Use Cases

This overview lists no evaluation results or intended tasks. The model is most likely to be useful on inputs that resemble the d1_hardened_top4_seq_glm47_traces training data, so developers should evaluate it on their own task before relying on it.
