laion/Qwen3-8B_exp_tas_temp_0.25_traces_save-strategy_steps

TEXT GENERATION · Concurrency cost: 1 · Model size: 8B · Quant: FP8 · Context length: 32k · Published: Jan 9, 2026 · License: apache-2.0 · Architecture: Transformer · Open weights

This model is an 8-billion-parameter fine-tune of the Qwen3-8B architecture developed by Qwen, trained on the DCAgent/exp_tas_temp_0.25_traces dataset. Training used a cosine learning rate scheduler with a warmup ratio of 0.005 over 8 epochs on a distributed multi-GPU setup. Its primary differentiation from the base model is this specialized fine-tuning.

Overview

This model, laion/Qwen3-8B_exp_tas_temp_0.25_traces_save-strategy_steps, is an 8-billion-parameter language model based on the Qwen3-8B architecture, fine-tuned on the DCAgent/exp_tas_temp_0.25_traces dataset.
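
A minimal loading sketch, assuming the checkpoint follows the standard Hugging Face transformers layout (the model id is taken from this card; everything else is a generic pattern rather than a published recipe):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "laion/Qwen3-8B_exp_tas_temp_0.25_traces_save-strategy_steps"

# Load tokenizer and weights; device_map="auto" spreads the 8B
# parameters across available GPUs (requires the accelerate package).
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype="auto",  # keep the dtype stored in the checkpoint
    device_map="auto",
)
```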

Training Details

The fine-tuning process involved several key hyperparameters (a configuration sketch follows the list):

  • Learning Rate: 0.0001
  • Optimizer: ADAMW_TORCH_FUSED with betas=(0.87, 0.99) and epsilon=1e-08
  • LR Scheduler: Cosine type with a warmup ratio of 0.005
  • Epochs: 8.0
  • Batch Size: an effective training batch size of 32, achieved with a distributed multi-GPU setup (32 devices × per-device batch size 1)
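
How these values map onto transformers' TrainingArguments, as a hedged reconstruction from the numbers above; the authors' actual training script is not published, and output_dir is a placeholder:

```python
from transformers import TrainingArguments

# Reconstruction of the reported hyperparameters, not the original script.
training_args = TrainingArguments(
    output_dir="qwen3-8b-exp-tas",   # placeholder path
    learning_rate=1e-4,
    optim="adamw_torch_fused",
    adam_beta1=0.87,
    adam_beta2=0.99,
    adam_epsilon=1e-8,
    lr_scheduler_type="cosine",
    warmup_ratio=0.005,
    num_train_epochs=8,
    per_device_train_batch_size=1,   # 32 devices -> total batch size 32
    save_strategy="steps",           # per the model name suffix
)
```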

Framework Versions

The model was trained using:

  • Transformers 4.55.0
  • PyTorch 2.7.1+cu128
  • Datasets 3.6.0
  • Tokenizers 0.21.1

Intended Use

Specific intended uses and limitations have not been documented. The fine-tuning on the DCAgent/exp_tas_temp_0.25_traces dataset suggests applications in the domain that dataset covers.
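
For inference, a sketch using the standard chat-template flow shared by Qwen3 checkpoints (the prompt is illustrative and the generation settings are generic defaults, not values recommended for this model):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "laion/Qwen3-8B_exp_tas_temp_0.25_traces_save-strategy_steps"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype="auto", device_map="auto")

# Format a single-turn prompt with the tokenizer's chat template.
messages = [{"role": "user", "content": "Summarize this model card in two sentences."}]
input_ids = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    return_tensors="pt",
).to(model.device)

# Generate a reply and strip the prompt tokens before decoding.
output_ids = model.generate(input_ids, max_new_tokens=256)
print(tokenizer.decode(output_ids[0][input_ids.shape[-1]:], skip_special_tokens=True))
```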