laion/GLM-4.6-stackexchange-overflow-sandboxes-32eps-65k-reasoning_num-train-epochs_6.0_Qwen3-32B

Text generation | Concurrency cost: 2 | Model size: 32B | Quant: FP8 | Context length: 32k | Published: Jan 9, 2026 | Architecture: Transformer

The laion/GLM-4.6-stackexchange-overflow-sandboxes-32eps-65k-reasoning_num-train-epochs_6.0_Qwen3-32B model is a 32-billion-parameter language model based on the Qwen3 architecture, trained with a context length of 32768 tokens. It was fine-tuned for 6 epochs using a cosine learning rate schedule and the AdamW optimizer. The model targets general language understanding and generation tasks; its specific training data and primary differentiators are not documented in the model card.


Model Overview

This model, laion/GLM-4.6-stackexchange-overflow-sandboxes-32eps-65k-reasoning_num-train-epochs_6.0_Qwen3-32B, is a 32-billion-parameter language model built on the Qwen3 architecture. It was trained with a context length of 32768 tokens, allowing it to handle long textual inputs.
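
For orientation, here is a minimal loading and generation sketch. It assumes the checkpoint is published as a standard transformers causal-LM repository under the id above and that your transformers/accelerate installation handles the stored weight format; adjust dtype and device placement for your hardware.

```python
# Minimal sketch, assuming a standard transformers causal-LM checkpoint.
# Multi-GPU sharding (device_map="auto") requires the accelerate package.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "laion/GLM-4.6-stackexchange-overflow-sandboxes-32eps-65k-reasoning_num-train-epochs_6.0_Qwen3-32B"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype="auto",   # use the dtype stored in the checkpoint
    device_map="auto",    # shard the 32B weights across available GPUs
)

prompt = "Explain the difference between a process and a thread."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=256)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```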

Training Details

The model was trained for 6.0 epochs on a distributed setup of 16 GPUs. Key hyperparameters include a learning rate of 4e-05 and a total training batch size of 64 (a per-device train_batch_size of 1 with gradient_accumulation_steps of 4 across the 16 GPUs), using the AdamW_TORCH_FUSED optimizer. A cosine learning rate scheduler with a warmup ratio of 0.1 governed the learning rate over the course of training.
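
As a point of reference, the listed hyperparameters map onto a transformers TrainingArguments configuration roughly as follows. This is a sketch reconstructed from the values above, not the actual training script; output_dir and any unlisted settings (precision, logging, data handling) are placeholders.

```python
# Sketch of a TrainingArguments config mirroring the hyperparameters above.
# output_dir is a hypothetical placeholder; settings not documented in the
# model card (precision, logging, data pipeline) are omitted.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="qwen3-32b-finetune",    # hypothetical path
    num_train_epochs=6.0,
    learning_rate=4e-05,
    per_device_train_batch_size=1,
    gradient_accumulation_steps=4,      # 16 GPUs x 1 x 4 = effective batch of 64
    lr_scheduler_type="cosine",
    warmup_ratio=0.1,
    optim="adamw_torch_fused",
)
```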

Key Characteristics

  • Architecture: Qwen3-32B
  • Parameter Count: 32 billion
  • Context Length: 32768 tokens
  • Training Epochs: 6.0
  • Optimizer: AdamW_TORCH_FUSED

Good for

  • General language understanding tasks requiring a large context window (see the context-length sketch after this list).
  • Applications benefiting from a 32B parameter model for text generation and analysis.
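
Since the main practical constraint is the 32768-token window, a simple pre-check before generation can avoid silent truncation. This sketch reuses the tokenizer and model from the loading example above; the input file name is hypothetical.

```python
# Sketch: verify a long input fits the 32768-token context before generating.
# Assumes `tokenizer` and `model` are loaded as in the earlier snippet;
# "report.txt" is a hypothetical input file.
long_document = open("report.txt", encoding="utf-8").read()
token_count = len(tokenizer(long_document)["input_ids"])
max_context = 32768
max_new = 512  # leave headroom in the window for generated tokens

if token_count + max_new > max_context:
    print(f"Input is {token_count} tokens; chunk or truncate to fit {max_context}.")
else:
    inputs = tokenizer(long_document, return_tensors="pt").to(model.device)
    outputs = model.generate(**inputs, max_new_tokens=max_new)
    print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```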

Further details regarding the specific training dataset, intended uses, and performance benchmarks are not provided in the current model card.