puchuneko/GLM-4-32B-0414-Z1-SLERP
puchuneko/GLM-4-32B-0414-Z1-SLERP is a 32-billion-parameter language model created by puchuneko through a SLERP merge of Zhipu AI's GLM-4-32B-0414 (instruct) and GLM-Z1-32B-0414 (reasoning) models. This experimental merge, weighted roughly 70% instruct and 30% reasoning, scores 94.0% on GSM8K (5-shot), compared with 88.3% for its instruct parent. It is designed to combine concise instruction following with improved reasoning, making it suitable for tasks that need both clear answers and strong problem-solving. The model keeps a 32,768-token context length.
Overview
puchuneko/GLM-4-32B-0414-Z1-SLERP is an experimental 32 billion parameter model created by puchuneko, leveraging a SLERP merge of two Zhipu AI models: GLM-4-32B-0414 (instruct-tuned) and GLM-Z1-32B-0414 (reasoning-focused). The merge uses a t=0.3 parameter, meaning it's approximately 70% instruct and 30% reasoning.
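To make the role of t concrete, here is a minimal sketch of SLERP (spherical linear interpolation) on two flat weight vectors, using only the Python standard library. It is an illustration of the general technique, not the author's actual merge pipeline: the function name `slerp`, the `eps` threshold, and the fallback to plain linear interpolation for near-parallel vectors are assumptions made for this example. It also assumes the instruct parent sits at t=0 and the reasoning parent at t=1, which matches the 70/30 reading above.

```python
import math


def slerp(t, v0, v1, eps=1e-8):
    """Spherically interpolate between two equal-length vectors.

    t=0 returns v0 (assumed here: instruct weights),
    t=1 returns v1 (assumed here: reasoning weights).
    """
    norm0 = math.sqrt(sum(x * x for x in v0))
    norm1 = math.sqrt(sum(x * x for x in v1))

    # Degenerate case: a zero vector has no direction, so use plain lerp.
    if norm0 < eps or norm1 < eps:
        return [(1 - t) * a + t * b for a, b in zip(v0, v1)]

    # Cosine of the angle between the vectors, clamped for acos.
    cos_omega = sum(a * b for a, b in zip(v0, v1)) / (norm0 * norm1)
    cos_omega = max(-1.0, min(1.0, cos_omega))
    omega = math.acos(cos_omega)
    sin_omega = math.sin(omega)

    # Nearly parallel (or anti-parallel) vectors: fall back to lerp.
    if abs(sin_omega) < eps:
        return [(1 - t) * a + t * b for a, b in zip(v0, v1)]

    # Standard SLERP weights.
    w0 = math.sin((1 - t) * omega) / sin_omega
    w1 = math.sin(t * omega) / sin_omega
    return [w0 * a + w1 * b for a, b in zip(v0, v1)]


# Toy example with two orthogonal unit vectors standing in for one tensor
# from each parent; real merges apply this per tensor across the model.
instruct = [1.0, 0.0]
reasoning = [0.0, 1.0]
merged = slerp(0.3, instruct, reasoning)
```

With t=0.3 the result stays closer to the instruct vector, and for unit-length inputs it keeps unit length, which plain linear averaging would not. The SLERP weights are not exactly 0.7 and 0.3, which is why the 70/30 split is only approximate.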
Key Capabilities & Performance
- Enhanced Mathematical Reasoning: The primary differentiator is improved mathematical problem-solving. On the GSM8K (5-shot) benchmark, the merged model scores 94.0%, a gain of 5.7 percentage points over its instruct parent's 88.3%.
- Balanced Output: It aims to combine the concise answering style of the instruct parent with the improved reasoning skills of the reasoning parent.
- SLERP Method: This model was created with the SLERP (Spherical Linear Interpolation) merge method because other techniques, such as `dare_ties` or `ties`, resulted in incoherent outputs for this instruct+reasoning pair.
Limitations & Considerations
- Limited Evaluation: Only GSM8K has been evaluated; general, safety, or Korean benchmarks are not available.
- Potential for Regression: The merge dilutes the instruct parent's weights by roughly 30%, which may cause regressions in areas that have not yet been tested.
- Experimental Artifact: This is presented as a transparent, reproducible experimental artifact rather than a production-ready or state-of-the-art (SOTA) model.
- No Korean Fine-tuning: The model does not include Korean fine-tuning, which is noted as future work.