The laion/GLM-4.6-stackexchange-overflow-sandboxes-32eps-65k-reasoning_num-train-epochs_4.0_Qwen3-32B model is a 32-billion-parameter language model based on the Qwen3 architecture, fine-tuned for 4.0 epochs with a 32768-token context length. Specific details regarding its primary differentiators, training dataset, and intended use cases are not provided in the available documentation.
Model Overview
This model, laion/GLM-4.6-stackexchange-overflow-sandboxes-32eps-65k-reasoning_num-train-epochs_4.0_Qwen3-32B, is a 32-billion-parameter language model built upon the Qwen3 architecture. It was fine-tuned over 4.0 epochs with a context length of 32768 tokens.
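The snippet below is a minimal loading sketch using the Transformers library. The model ID comes from this card; the device placement, dtype, prompt, and generation settings are illustrative assumptions, not documented defaults.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Model ID as listed on this card.
model_id = "laion/GLM-4.6-stackexchange-overflow-sandboxes-32eps-65k-reasoning_num-train-epochs_4.0_Qwen3-32B"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype="auto",  # let the checkpoint decide the precision
    device_map="auto",   # shard across available GPUs; a 32B model needs substantial memory
)

# Illustrative prompt; the chat template ships with the tokenizer.
messages = [{"role": "user", "content": "Explain what a race condition is."}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=256)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```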
Training Details
The training process used the following hyperparameters; a sketch mapping them onto a TrainingArguments configuration follows the list.
- Learning Rate: 4e-05
- Batch Size: 1 (train), 8 (eval)
- Gradient Accumulation Steps: 4, for a total effective batch size of 64 (with a per-device train batch size of 1, this is consistent with training across 16 devices)
- Optimizer: ADAMW_TORCH_FUSED with betas=(0.9, 0.999) and epsilon=1e-08
- LR Scheduler: Cosine type with a warmup ratio of 0.1
- Epochs: 4.0
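As a rough sketch, the hyperparameters above correspond to the following Hugging Face TrainingArguments configuration. The output directory is a hypothetical placeholder, and anything not listed above (seed, precision flags, logging, dataset wiring) is left at defaults as an assumption.

```python
from transformers import TrainingArguments

# Reconstruction of the reported configuration; output_dir is hypothetical.
training_args = TrainingArguments(
    output_dir="qwen3-32b-finetune",   # placeholder path, not from this card
    learning_rate=4e-05,
    per_device_train_batch_size=1,
    per_device_eval_batch_size=8,
    gradient_accumulation_steps=4,     # effective batch size 64 across devices
    optim="adamw_torch_fused",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-08,
    lr_scheduler_type="cosine",
    warmup_ratio=0.1,
    num_train_epochs=4.0,
)
```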
The model was developed using Transformers 4.57.3, PyTorch 2.9.0+cu128, Datasets 4.4.1, and Tokenizers 0.22.1.
Capabilities and Use Cases
Beyond the architecture and training parameters described above, the available documentation does not detail specific capabilities, primary differentiators, or intended use cases. Users should conduct their own evaluation to determine the model's suitability for particular applications.