zai-org/GLM-4.6

357B parameters · FP8 · Released Sep 29, 2025 · License: MIT

zai-org/GLM-4.6 is a 357-billion-parameter language model developed by zai-org, featuring an expanded 200K-token context window. It demonstrates strong performance on coding benchmarks, advanced reasoning, and enhanced agentic capabilities, including tool use and search-based agents. The model also offers refined writing that aligns with human preferences and performs well in role-playing scenarios. GLM-4.6 is designed for complex agentic tasks, robust coding applications, and sophisticated reasoning challenges.

Overview

GLM-4.6: An Enhanced Large Language Model

GLM-4.6 is an advanced large language model developed by zai-org, building upon its predecessor GLM-4.5 with significant improvements across several key areas. The model comprises 357 billion parameters and offers an expanded 200K-token context window, enabling it to handle more intricate and demanding tasks.

Key Capabilities and Improvements

  • Extended Context Window: The context window has been significantly expanded from 128K to 200K tokens, facilitating more complex agentic workflows and longer interactions.
  • Superior Coding Performance: GLM-4.6 achieves higher scores on various code benchmarks and demonstrates enhanced real-world performance in applications like Claude Code, Cline, Roo Code, and Kilo Code, including generating visually polished front-end pages.
  • Advanced Reasoning: The model shows clear improvements in reasoning capabilities and supports tool use during inference, contributing to stronger overall problem-solving.
  • More Capable Agents: It exhibits stronger performance in tool-using and search-based agents, integrating more effectively within agent frameworks.
  • Refined Writing: GLM-4.6 produces text that better aligns with human preferences in style and readability, performing more naturally in role-playing scenarios.

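To make the tool-using agent capability above concrete, here is a minimal sketch of an OpenAI-compatible chat-completion payload that exposes a function tool to the model. The `get_weather` tool, its schema, and the payload wiring are illustrative assumptions, not part of the model card; check your serving stack (e.g. vLLM or another OpenAI-compatible server) for the exact request shape it accepts.

```python
# Sketch: an OpenAI-compatible chat request exposing one tool to GLM-4.6.
# The `get_weather` function and its schema are hypothetical examples.

def build_tool_request(user_message: str) -> dict:
    """Assemble a chat-completion payload that lets the model call a tool."""
    return {
        "model": "zai-org/GLM-4.6",
        "messages": [{"role": "user", "content": user_message}],
        "tools": [
            {
                "type": "function",
                "function": {
                    "name": "get_weather",  # hypothetical tool name
                    "description": "Look up the current weather for a city.",
                    "parameters": {
                        "type": "object",
                        "properties": {"city": {"type": "string"}},
                        "required": ["city"],
                    },
                },
            }
        ],
    }

request = build_tool_request("What's the weather in Berlin?")
```

During inference the model can then emit a `tool_calls` entry naming `get_weather`; the agent framework executes the call and feeds the result back as a `tool` role message.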
Evaluations across eight public benchmarks covering agents, reasoning, and coding indicate clear gains over GLM-4.5, positioning GLM-4.6 competitively against leading models such as DeepSeek-V3.1-Terminus and Claude Sonnet 4.

Recommended Usage

For general evaluations, a sampling temperature of 1.0 is recommended. For code-related evaluation tasks, top_p = 0.95 and top_k = 40 are suggested for best results.
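The recommended settings above can be captured as two small sampling profiles. The helper below is a sketch: the profile names and dictionary layout are our own convention, not an official API, and only the parameter values come from the card.

```python
# Sketch: sampling profiles matching the card's recommendations.
# Profile names and structure are illustrative, not an official API.

SAMPLING_PROFILES = {
    "general": {"temperature": 1.0},            # general evaluations
    "code": {"top_p": 0.95, "top_k": 40},       # code-related evaluations
}

def sampling_params(task: str) -> dict:
    """Return sampling parameters for a task, defaulting to 'general'."""
    return dict(SAMPLING_PROFILES.get(task, SAMPLING_PROFILES["general"]))
```

A returned dictionary can be passed through to whatever generation API you use, e.g. as keyword arguments to a `generate` call or as fields of a chat-completion request.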