zai-org/GLM-4-32B-0414

Text Generation · Concurrency Cost: 2 · Model Size: 32B · Quantization: FP8 · Context Length: 32k · Published: Apr 7, 2025 · License: MIT · Architecture: Transformer · Open Weights

GLM-4-32B-0414 is a 32 billion parameter model from the GLM family, pre-trained on 15T tokens of high-quality data including synthetic reasoning data. It excels at instruction following, engineering code generation, function calling, and agent tasks, with performance comparable to larger models such as GPT-4o and DeepSeek-V3-0324 on specific benchmarks. The model is particularly well suited to complex reasoning and code-related applications, and it supports user-friendly local deployment.

GLM-4-32B-0414 Overview

The GLM-4-32B-0414 is a 32 billion parameter model from the GLM family, pre-trained on 15 trillion tokens of high-quality data, including substantial reasoning-type synthetic data. It has been enhanced through human preference alignment, rejection sampling, and reinforcement learning to improve instruction following, engineering code generation, and function calling. The model demonstrates strong performance in areas such as Artifact generation, search-based Q&A, and report generation, achieving results comparable to larger models like GPT-4o and DeepSeek-V3-0324 on specific code generation and Q&A benchmarks.
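
Since the model card highlights user-friendly local deployment, a minimal loading sketch with Hugging Face `transformers` follows. It assumes the repository id `zai-org/GLM-4-32B-0414` from the header above, sufficient GPU memory for a 32B checkpoint, and the library's standard chat-template API; it is not an official GLM example.

```python
# Minimal local-inference sketch for GLM-4-32B-0414 (repo id taken from the
# page header; not an official GLM example).
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "zai-org/GLM-4-32B-0414"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype="auto",   # pick the checkpoint's native precision
    device_map="auto",    # shard across available GPUs
)

# Build a chat prompt via the model's bundled chat template.
messages = [{"role": "user", "content": "Write a Python function that checks whether a string is a palindrome."}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=512)
# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```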

Key Capabilities

  • Advanced Reasoning: Features deep thinking capabilities, with specialized variants like GLM-Z1-32B-0414 for mathematics, code, and logic, and GLM-Z1-Rumination-32B-0414 for open-ended, complex problem-solving with search tool integration.
  • Code Generation & Function Calling: Excels at generating engineering code and supports external tool calls via a JSON-based format (see the sketch after this list), demonstrated with examples such as real-time API queries.
  • Multimodal Generation: Demonstrates the ability to generate Python animations, web designs (HTML/CSS), and SVG images from natural-language prompts.
  • Search-Based Writing: Designed to integrate with search results for generating detailed, analytical reports, leveraging RAG or WebSearch techniques.
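
The JSON-based tool-call format can be illustrated as below. The exact fields GLM-4-32B-0414 emits are defined by its chat template, so the OpenAI-style schema and the `get_weather` tool here are assumptions for illustration, not the model's native wire format.

```python
# Illustrative OpenAI-style tool schema for function calling. The concrete
# field names the model emits depend on its chat template, so treat this as
# a sketch rather than the model's native format; get_weather is hypothetical.
import json

tools = [{
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Look up the current weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {
                "city": {"type": "string", "description": "City name"},
            },
            "required": ["city"],
        },
    },
}]

# A model trained for function calling responds with a JSON object naming
# the tool to invoke and its arguments, which the caller then executes:
model_response = {"name": "get_weather", "arguments": {"city": "Berlin"}}
print(json.dumps(model_response))
```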

Good For

  • Developers requiring robust code generation and function calling for agentic workflows.
  • Applications needing deep reasoning and complex problem-solving, especially in mathematical and logical domains.
  • Tasks involving search-augmented content creation and detailed report generation.
  • Scenarios where local deployment and efficient inference at the 32B-parameter scale are critical.

Popular Sampler Settings

The three parameter combinations most commonly used by Featherless users for this model tune the following sampler settings:

  • temperature
  • top_p
  • top_k
  • frequency_penalty
  • presence_penalty
  • repetition_penalty
  • min_p
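
The specific values are not reproduced here, but the parameters above map directly onto a chat-completion request. The sketch below assumes Featherless's OpenAI-compatible endpoint (`https://api.featherless.ai/v1`) and uses placeholder values; non-standard samplers such as `top_k`, `repetition_penalty`, and `min_p` are passed through `extra_body`, which the `openai` Python client forwards verbatim.

```python
# Sketch: passing the sampler parameters above through an OpenAI-compatible
# endpoint. Base URL and all sampler values are placeholders, not the actual
# popular configs; check the provider's docs for supported samplers.
from openai import OpenAI

client = OpenAI(
    base_url="https://api.featherless.ai/v1",  # assumed endpoint
    api_key="YOUR_API_KEY",
)

response = client.chat.completions.create(
    model="zai-org/GLM-4-32B-0414",
    messages=[{"role": "user", "content": "Summarize what function calling is."}],
    temperature=0.7,             # standard OpenAI-schema samplers
    top_p=0.9,
    frequency_penalty=0.0,
    presence_penalty=0.0,
    extra_body={                 # non-standard samplers pass through here
        "top_k": 40,
        "repetition_penalty": 1.05,
        "min_p": 0.05,
    },
)
print(response.choices[0].message.content)
```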