laion/sft_GLM-4-7-swesmith-sandboxes-with_tests-oracle_verified_120s-maxeps-131k_Qwen3-32B
This model is a 32-billion-parameter fine-tune of Qwen3-32B, developed by laion. It was trained on the GLM-4.7-swesmith-sandboxes-with_tests-oracle_verified_120s-maxeps-131k dataset, giving it a task-specific focus, and its 32768-token context length suits applications that need extended contextual understanding and generation grounded in that specialized training data.
Model Overview
This model, sft_GLM-4-7-swesmith-sandboxes-with_tests-oracle_verified_120s-maxeps-131k_Qwen3-32B, is a fine-tuned variant of the Qwen/Qwen3-32B architecture. It has been specifically adapted using a unique dataset derived from GLM-4.7-swesmith-sandboxes-with_tests-oracle_verified_120s-maxeps-131k.
Key Training Details
The model underwent training with the following hyperparameters:
- Learning Rate: 4e-05
- Batch Size: 1 (train), 8 (eval)
- Gradient Accumulation: 2 steps (with the per-device train batch size of 1, the reported effective batch size of 32 implies a 16-device run)
- Optimizer: ADAMW_TORCH_FUSED
- LR Scheduler: Cosine with 0.1 warmup ratio
- Epochs: 7.0
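The effective batch size reported above follows from the standard Hugging Face Trainer arithmetic (per-device batch size × gradient-accumulation steps × number of devices), so the total of 32 implies a 16-device run. A minimal sketch of that back-calculation, with the device count as the only unknown (the 16-device figure is an inference from the listed numbers, not stated in the card):

```python
# Hyperparameters as reported in the model card.
per_device_train_batch_size = 1
gradient_accumulation_steps = 2
total_effective_batch_size = 32

# Back out the implied device count, assuming the standard
# HF Trainer formula: total = per_device * accum_steps * n_devices.
n_devices = total_effective_batch_size // (
    per_device_train_batch_size * gradient_accumulation_steps
)
print(n_devices)  # 16 under these assumptions
```

In `transformers`, the first two values correspond to the `per_device_train_batch_size` and `gradient_accumulation_steps` fields of `TrainingArguments`; the total is what the auto-generated card reports as the combined train batch size across all devices.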
Technical Stack
The training leveraged:
- Transformers: 4.57.6
- PyTorch: 2.9.0+cu128
- Datasets: 4.4.1
- Tokenizers: 0.22.2
Intended Use
Specific use cases are not detailed in the provided information. However, the fine-tuning dataset's name (sandboxed SWE-style tasks with tests and oracle verification at a 120s timeout) suggests applications in automated software engineering, such as code repair and test-driven problem solving, or other tasks requiring complex reasoning over that domain.