LGAI-EXAONE/EXAONE-4.0.1-32B

TEXT GENERATIONConcurrency Cost:2Model Size:32BQuant:FP8Ctx Length:32kPublished:Jul 29, 2025License:exaoneArchitecture:Transformer0.0K Cold

LGAI-EXAONE/EXAONE-4.0.1-32B is a 32 billion parameter large language model developed by LG AI Research, integrating both non-reasoning and reasoning modes. It features a hybrid attention scheme and QK-Reorder-Norm for enhanced performance, supporting agentic tool use and multilingual capabilities in English, Korean, and Spanish. This model is optimized for high performance across complex reasoning tasks, instruction following, and agentic applications.

Loading preview...

EXAONE 4.0.1-32B Overview

EXAONE 4.0.1-32B is a 32 billion parameter model from LG AI Research, designed to combine the usability of EXAONE 3.5 with the advanced reasoning of EXAONE Deep. This version is a patch to reduce unintended responses. It introduces a unique architecture featuring a Hybrid Attention scheme, which blends local (sliding window) and global (full) attention, and QK-Reorder-Norm for improved performance on downstream tasks. The model supports a context length of 131,072 tokens and a vocabulary size of 102,400.

Key Capabilities

  • Hybrid Reasoning Modes: Seamlessly switches between a general non-reasoning mode and a dedicated reasoning mode for complex problem-solving, activated via enable_thinking=True in the tokenizer.
  • Agentic Tool Use: Capable of functioning as an agent with tool-calling capabilities, allowing integration with external functions (e.g., roll_dice).
  • Multilingual Support: Extends capabilities to English, Korean, and Spanish, with specific benchmarks for Korean (KMMLU-Pro, KSM) and Spanish (MMMLU).
  • Architectural Innovations: Incorporates hybrid attention and QK-Reorder-Norm for enhanced context understanding and task performance.

Performance Highlights

EXAONE 4.0.1-32B demonstrates strong performance in both reasoning and non-reasoning modes across various benchmarks. In reasoning mode, it shows competitive results in World Knowledge (MMLU-Pro, GPQA-Diamond), Math/Coding (AIME 2025, LiveCodeBench v6), Instruction Following (IFEval), and Agentic Tool Use (BFCL-v3, Tau-Bench). For optimal performance, specific sampling parameters are recommended, such as temperature<0.6 for non-reasoning and temperature=0.6, top_p=0.95 for reasoning mode.

Good for

  • Applications requiring robust reasoning and problem-solving.
  • Developing AI agents with tool-use functionalities.
  • Multilingual applications, particularly in English, Korean, and Spanish.
  • High-performance tasks benefiting from a 32B parameter model with a large context window.