laion/dev_set_part1_10k_glm_4_7_traces_jupiter

Text generation · Model size: 8B · Quantization: FP8 · Context length: 32k · Concurrency cost: 1 · Published: Feb 23, 2026 · License: apache-2.0 · Architecture: Transformer · Open weights

The laion/dev_set_part1_10k_glm_4_7_traces_jupiter model is an 8 billion parameter language model fine-tuned from Qwen/Qwen3-8B. It was trained on the /data/cat/ws/befe330h-befe330h-otagent/huggingface/hub/datasets--DCAgent--dev_set_part1_10k_glm_4.7_traces_jupiter/snapshots/f1871d1c1446b3b43cbfe2737d0df56cecf3f420_thinking_preprocessed dataset. This model is designed for tasks related to its specific fine-tuning data, offering a specialized application of the Qwen3-8B architecture.


Overview

This model, laion/dev_set_part1_10k_glm_4_7_traces_jupiter, is an 8 billion parameter language model based on Qwen/Qwen3-8B. It was fine-tuned on the dataset located at /data/cat/ws/befe330h-befe330h-otagent/huggingface/hub/datasets--DCAgent--dev_set_part1_10k_glm_4.7_traces_jupiter/snapshots/f1871d1c1446b3b43cbfe2737d0df56cecf3f420_thinking_preprocessed.
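
A minimal usage sketch for loading the checkpoint with Hugging Face Transformers is shown below. It assumes the model exposes the standard Qwen3-style causal-LM and chat-template interface; the prompt and generation settings are illustrative and are not taken from the model card.

```python
# Minimal inference sketch, assuming the standard Transformers causal-LM interface.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "laion/dev_set_part1_10k_glm_4_7_traces_jupiter"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype="auto",   # honor the checkpoint's stored precision
    device_map="auto",    # place weights on available GPU(s)/CPU
)

# Build a chat-formatted prompt (illustrative content, not from the card).
messages = [{"role": "user", "content": "Summarize the Qwen3-8B architecture in one sentence."}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output_ids = model.generate(input_ids, max_new_tokens=256)
print(tokenizer.decode(output_ids[0][input_ids.shape[-1]:], skip_special_tokens=True))
```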

Training Details

The fine-tuning run used the following key hyperparameters (a configuration sketch follows the list):

  • Learning Rate: 4e-05
  • Batch Size: 1 (train), 8 (eval)
  • Gradient Accumulation: 2 steps, leading to a total effective batch size of 16
  • Optimizer: ADAMW_TORCH_FUSED (beta and epsilon values not listed in the card)
  • Scheduler: Cosine learning rate scheduler with a 0.1 warmup ratio
  • Epochs: 7.0
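
Expressed as Hugging Face TrainingArguments, the listed settings correspond roughly to the sketch below. The output directory is a placeholder, and the beta/epsilon values and device count are not given in the card, so library defaults are left in place; this is illustrative, not the exact training configuration.

```python
# Hyperparameter sketch mirroring the values listed above (assumptions noted inline).
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="glm_4_7_traces_jupiter_ft",  # hypothetical output path
    learning_rate=4e-5,
    per_device_train_batch_size=1,
    per_device_eval_batch_size=8,
    gradient_accumulation_steps=2,           # total effective batch of 16 implies ~8 devices (assumption)
    optim="adamw_torch_fused",               # ADAMW_TORCH_FUSED; betas/epsilon left at library defaults
    lr_scheduler_type="cosine",
    warmup_ratio=0.1,
    num_train_epochs=7.0,
)
```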

Intended Use

Specific intended uses and limitations are not documented. Because the model was fine-tuned on a specialized dataset, it is likely best suited to tasks that resemble that data; developers should review the origin and contents of the training data when evaluating the model's suitability for their applications.