puchuneko/GLM-4-32B-0414-Z1-SLERP

Text Generation | Concurrency Cost: 2 | Model Size: 32B | Quant: FP8 | Ctx Length: 32k | Tool Calling: Supported | Published: Jul 1, 2026 | License: MIT | Architecture: Transformer | Open Weights | Cold

puchuneko/GLM-4-32B-0414-Z1-SLERP is a 32-billion-parameter language model created by puchuneko through a SLERP merge of Zhipu AI's GLM-4-32B-0414 (instruct) and GLM-Z1-32B-0414 (reasoning) models. This experimental merge, weighted roughly 70% instruct and 30% reasoning, scores 94.0% on GSM8K (5-shot), compared with 88.3% for its instruct parent. It is designed to combine concise instruction following with stronger reasoning, for tasks that need both clear answers and multi-step problem-solving. The context length is 32,768 tokens.

Overview

puchuneko/GLM-4-32B-0414-Z1-SLERP is an experimental 32-billion-parameter model created by puchuneko as a SLERP merge of two Zhipu AI models: GLM-4-32B-0414 (instruct-tuned) and GLM-Z1-32B-0414 (reasoning-focused). The merge uses an interpolation parameter of t=0.3, which places the result approximately 70% toward the instruct parent and 30% toward the reasoning parent.
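To make the role of t concrete, here is a minimal sketch of spherical linear interpolation on two flat weight vectors. It is an illustration of the general technique, not the author's merge script: the function name `slerp`, the `eps` threshold, and the convention that t=0 returns the instruct parent and t=1 the reasoning parent are assumptions chosen to match the 70/30 description above.

```python
import math


def slerp(t, v0, v1, eps=1e-8):
    """Spherically interpolate between two flat weight vectors.

    Assumed convention: t=0 returns v0 (instruct parent),
    t=1 returns v1 (reasoning parent).
    """
    norm0 = math.sqrt(sum(x * x for x in v0))
    norm1 = math.sqrt(sum(x * x for x in v1))
    if norm0 < eps or norm1 < eps:
        # A zero vector has no direction; fall back to linear interpolation.
        return [(1 - t) * a + t * b for a, b in zip(v0, v1)]

    # Angle between the two vectors, with the cosine clamped for safety.
    cos_omega = sum(a * b for a, b in zip(v0, v1)) / (norm0 * norm1)
    cos_omega = max(-1.0, min(1.0, cos_omega))
    omega = math.acos(cos_omega)
    sin_omega = math.sin(omega)
    if sin_omega < eps:
        # Nearly (anti-)parallel vectors; fall back to linear interpolation.
        return [(1 - t) * a + t * b for a, b in zip(v0, v1)]

    # Standard SLERP coefficients.
    w0 = math.sin((1 - t) * omega) / sin_omega
    w1 = math.sin(t * omega) / sin_omega
    return [w0 * a + w1 * b for a, b in zip(v0, v1)]


# Toy example with hypothetical 2-element "weights".
instruct = [1.0, 0.0]
reasoning = [0.0, 1.0]
merged = slerp(0.3, instruct, reasoning)
```

In a real merge this interpolation would be applied tensor by tensor across both checkpoints. Note that the SLERP coefficients are not exactly 0.7 and 0.3 unless the two vectors are nearly parallel, which is one reason the 70/30 split is described as approximate.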

Key Capabilities & Performance

  • Enhanced Mathematical Reasoning: The primary differentiator is improved mathematical problem-solving. On the GSM8K (5-shot) benchmark, the merged model scores 94.0%, a gain of 5.7 percentage points over its instruct parent's 88.3%.
  • Balanced Output: It aims to combine the concise answering style of the instruct parent with the improved reasoning skills of the reasoning parent.
  • SLERP Method: This model was created with the SLERP (Spherical Linear Interpolation) merge method because other techniques, such as dare_ties and ties, produced incoherent outputs for this instruct+reasoning pair.
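The GSM8K figures quoted above can be checked with a few lines of arithmetic. The sketch below uses only the two reported scores; the helper name `gsm8k_gain` and the relative error reduction figure are illustrative additions, not numbers stated by the author.

```python
# GSM8K (5-shot) accuracies as reported above, in percent.
INSTRUCT_ACC = 88.3
MERGED_ACC = 94.0


def gsm8k_gain(parent_acc, merged_acc):
    """Return (absolute gain in points, relative error reduction in percent)."""
    abs_gain = merged_acc - parent_acc
    # Share of the parent's remaining errors that the merged model removes.
    rel_error_reduction = abs_gain / (100.0 - parent_acc)
    return round(abs_gain, 1), round(100.0 * rel_error_reduction, 1)


abs_gain, rel_reduction = gsm8k_gain(INSTRUCT_ACC, MERGED_ACC)
print(abs_gain, rel_reduction)  # 5.7 48.7
```

Read this way, the 5.7-point gain corresponds to the merged model making roughly half as many GSM8K errors as the instruct parent, on this one benchmark only.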

Limitations & Considerations

  • Limited Evaluation: Only GSM8K has been evaluated; no general, safety, or Korean benchmark results are available.
  • Potential for Regression: Because about 30% of the interpolation weight comes from the reasoning parent, the instruct parent's capabilities may have regressed in areas not yet tested.
  • Experimental Artifact: This is presented as a transparent, reproducible experimental artifact rather than a production-ready or state-of-the-art (SOTA) model.
  • No Korean Fine-tuning: The model does not include Korean fine-tuning, which is noted as future work.