hybridfree/GLM-Z1-9B-0414

Text Generation · Concurrency Cost: 1 · Model Size: 9B · Quant: FP8 · Ctx Length: 32k · Published: Apr 19, 2026 · License: MIT · Architecture: Transformer · Open Weights · Cold

hybridfree/GLM-Z1-9B-0414 is a 9-billion-parameter open-weight model from the GLM family, developed by THUDM. It is designed for mathematical reasoning and general complex-task solving, building on the GLM-4-32B-0414 series with extended reinforcement learning. The model balances efficiency and effectiveness, making it well suited to resource-constrained scenarios.


Model Overview

hybridfree/GLM-Z1-9B-0414 is a 9-billion-parameter model developed by THUDM as part of the GLM-4-32B-0414 series. As a smaller-scale variant, it retains strong mathematical reasoning and general-task capabilities while remaining suitable for lightweight deployment. It was trained with the same techniques as its larger counterparts, including cold start and extended reinforcement learning, followed by further training on mathematics, code, and logic tasks.

Key Capabilities

  • Mathematical Reasoning: Significantly improved mathematical abilities compared to base models.
  • Complex Task Solving: Enhanced capability to solve complex problems through deep thinking.
  • General Reinforcement Learning: Benefits from general reinforcement learning based on pairwise ranking feedback, boosting overall capabilities.
  • Efficiency: Offers an excellent balance between efficiency and effectiveness for resource-constrained environments.

Usage Guidelines

  • Enforced Thinking: Supports an enforced thinking mechanism by adding <think>\n to the first line of input, or automatically via chat_template.jinja.
  • Long Contexts: Can handle contexts exceeding 8,192 tokens by enabling YaRN (Rope Scaling) in config.json for up to 32,768 tokens.
  • Sampling Parameters: Recommended temperature of 0.6 for balanced creativity and stability, top_p of 0.95, and max_new_tokens up to 30,000.
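For contexts beyond 8,192 tokens, the YaRN setting is enabled by adding a `rope_scaling` entry to `config.json`. The fragment below is a sketch following the common Hugging Face convention (a factor of 4 over an 8,192-token base reaches the stated 32,768-token limit); check the model's own `config.json` for the exact field names and values.

```json
"rope_scaling": {
  "type": "yarn",
  "factor": 4.0,
  "original_max_position_embeddings": 8192
}
```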
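The enforced-thinking and sampling guidelines above can be sketched in plain Python. This is an illustrative, framework-agnostic sketch: `build_prompt` and `GENERATION_KWARGS` are hypothetical names, and in practice the bundled `chat_template.jinja` would normally prepend the thinking marker for you.

```python
# Illustrative sketch of the usage guidelines; names are not part of
# any official API.

def build_prompt(user_message: str) -> str:
    # Appending "<think>\n" after the user turn nudges the model to
    # open its response with an explicit reasoning block, mirroring
    # what chat_template.jinja does automatically.
    return f"{user_message}\n<think>\n"

# Recommended sampling parameters from the model card.
GENERATION_KWARGS = {
    "temperature": 0.6,      # balanced creativity and stability
    "top_p": 0.95,
    "max_new_tokens": 30000, # leave headroom for long reasoning chains
}

prompt = build_prompt("Prove that the sum of two even numbers is even.")
print(prompt.endswith("<think>\n"))  # True
```

The resulting `prompt` string and `GENERATION_KWARGS` would then be passed to whatever inference stack (e.g. `transformers` or vLLM) serves the model.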

Good For

  • Applications requiring strong mathematical and logical reasoning.
  • Scenarios where efficient deployment and resource optimization are critical.
  • General complex task solving in a smaller model footprint.