DCAgent/d1_original_top4_seq_glm47
DCAgent/d1_original_top4_seq_glm47 is an 8-billion-parameter language model fine-tuned from Qwen/Qwen3-8B. It was trained on a dataset derived from `d1_original_top4_seq_glm47_traces`, indicating a specialization in sequential decision-making or agent-based tasks. The model retains the Qwen3 architecture and its 32768-token context length.
Model Overview
DCAgent/d1_original_top4_seq_glm47 is an 8-billion-parameter language model fine-tuned from the base Qwen/Qwen3-8B. It was trained on the dataset snapshot at `/e/scratch/jureap59/raoof1/sft_data/hf_hub/datasets--DCAgent--d1_original_top4_seq_glm47_traces/snapshots/51e9b1f18d6b9acfb3afe34371782c3ddc5a60c0_thinking_preprocessed`. Fine-tuning used a learning rate of 4e-05 over 7 epochs on a 16-device multi-GPU setup with a total batch size of 16.
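The reported hyperparameters can be summarized as a Hugging Face `TrainingArguments`-style mapping. This is an illustrative sketch, not the actual training script: the key names are hypothetical, only the values come from this card, and the per-device batch size of 1 is inferred by assuming no gradient accumulation (16 total / 16 devices).

```python
# Hypothetical reconstruction of the fine-tuning setup described above.
# Values are taken from the model card; key names are illustrative.
training_config = {
    "learning_rate": 4e-5,
    "num_train_epochs": 7,
    "per_device_train_batch_size": 1,  # assumed: 16 total / 16 devices, no grad accumulation
    "world_size": 16,                  # number of GPUs
    "lr_scheduler_type": "cosine",
    "warmup_ratio": 0.1,
    "optim": "adamw_torch_fused",
}

# The total batch size is the per-device batch times the device count.
total_batch = (training_config["per_device_train_batch_size"]
               * training_config["world_size"])
print(total_batch)  # 16
```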
Key Characteristics
- Base Model: Qwen/Qwen3-8B, with a 32768-token context length.
- Specialized Training: Fine-tuned on a unique dataset, suggesting optimization for tasks related to sequential decision-making or agent traces.
- Training Configuration: AdamW (torch fused) optimizer with a cosine learning-rate scheduler and a warmup ratio of 0.1.
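The schedule named above, cosine decay with a 0.1 linear warmup ratio peaking at the stated learning rate of 4e-05, can be sketched in plain Python. This is a minimal illustration of how such a scheduler typically behaves, not the exact implementation used in training:

```python
import math

PEAK_LR = 4e-5       # learning rate reported in the model card
WARMUP_RATIO = 0.1   # warmup ratio reported in the model card

def lr_at(step, total_steps, peak_lr=PEAK_LR, warmup_ratio=WARMUP_RATIO):
    """Linear warmup to peak_lr, then cosine decay toward zero."""
    warmup_steps = max(1, int(total_steps * warmup_ratio))
    if step < warmup_steps:
        # Linear warmup: 0 -> peak_lr over the first 10% of steps
        return peak_lr * step / warmup_steps
    # Cosine decay: peak_lr -> 0 over the remaining steps
    progress = (step - warmup_steps) / (total_steps - warmup_steps)
    return peak_lr * 0.5 * (1.0 + math.cos(math.pi * progress))

# Example: with 1000 total steps, the peak is reached after 100 warmup steps
print(lr_at(100, 1000))  # 4e-05
```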
Intended Use Cases
Given its specialized training data, this model is likely best suited for:
- Applications requiring understanding or generation based on sequential agent actions or thought processes.
- Research into agent behavior modeling or trace analysis.
- Tasks where the specific patterns within the `d1_original_top4_seq_glm47_traces` dataset are relevant.