katanemo/Arch-Agent-32B

Hugging Face
Text generation · Concurrency cost: 2 · Model size: 32.8B · Quantization: FP8 · Context length: 32K · Published: Jun 20, 2025 · License: katanemo-research · Architecture: Transformer

Arch-Agent-32B by katanemo is a 32.8 billion parameter large language model specifically designed for advanced function calling and agent-based applications, featuring a 131,072 token context length. It excels at multi-turn and multi-step function calling, enabling complex workflows that require intelligent tool selection and adaptive planning. This model is optimized for seamless integration with external APIs and services, delivering leading performance in intricate agentic scenarios.


Overview

Arch-Agent-32B is a 32.8 billion parameter large language model developed by katanemo, specifically engineered for advanced function calling and agent-based applications. It is designed to manage sophisticated multi-step and multi-turn workflows, making it highly effective for tasks requiring intelligent tool selection, adaptive planning, and integration with external APIs.

Key Capabilities

  • Multi-Turn Function Calling: Maintains context across multiple dialogue turns for evolving tool use.
  • Multi-Step Function Calling: Plans and executes sequences of function calls, dynamically adapting based on intermediate results and decomposing complex goals.
  • Agentic Capabilities: Provides advanced decision-making and workflow management for complex agentic tasks, including seamless tool coordination and error recovery.
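The multi-step pattern above can be sketched as a simple agent loop: the model proposes a tool call, the runtime executes it, and the result is fed back so the model can plan the next step. This is a minimal mock, not the model's actual API; the tool names (`search_flights`, `book_flight`) and the `mock_model` planner are illustrative stand-ins for Arch-Agent-32B's function-call output.

```python
import json

# Illustrative tools; names and behavior are hypothetical.
def search_flights(destination):
    return {"flights": [{"id": "AF123", "destination": destination}]}

def book_flight(flight_id):
    return {"status": "booked", "flight_id": flight_id}

TOOLS = {"search_flights": search_flights, "book_flight": book_flight}

def mock_model(history):
    """Stand-in for the model: picks the next tool call from prior results."""
    last = history[-1]
    if last["role"] == "user":
        return {"name": "search_flights", "arguments": {"destination": "Paris"}}
    if last["role"] == "tool" and "flights" in last["content"]:
        flight = json.loads(last["content"])["flights"][0]
        return {"name": "book_flight", "arguments": {"flight_id": flight["id"]}}
    return None  # goal reached, no further calls

def run_agent(user_request):
    """Multi-step loop: call model, execute tool, append result, repeat."""
    history = [{"role": "user", "content": user_request}]
    while (call := mock_model(history)) is not None:
        result = TOOLS[call["name"]](**call["arguments"])
        history.append({"role": "tool", "content": json.dumps(result)})
    return history

history = run_agent("Book me a flight to Paris")
print(history[-1]["content"])  # → {"status": "booked", "flight_id": "AF123"}
```

The key design point is that each tool result re-enters the conversation history, which is what lets the model adapt its plan based on intermediate results rather than emitting all calls up front.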

Performance

Arch-Agent-32B demonstrates leading performance on the Berkeley Function-Calling Leaderboard (BFCL), evaluated with a 64K context length using YaRN scaling for multi-turn scenarios. The model is built to deliver reliability and precision across extended function call sequences.

Usage

The model is compatible with the Hugging Face transformers library (version >=4.51.0) and uses a dedicated prompt format that produces JSON function-call output similar to OpenAI's function-calling API. Example code is provided for quick integration into function calling tasks.
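A rough sketch of consuming that JSON output on the client side: assuming a Qwen-style chat template where each call is wrapped in `<tool_call>` tags (an assumption based on common Qwen-derived models, not confirmed here; check the model's own chat template), the calls can be extracted like this:

```python
import json
import re

# Hypothetical raw completion; the <tool_call> wrapper is an assumption
# based on common Qwen-derived chat templates, not confirmed for this model.
raw = '<tool_call>\n{"name": "get_weather", "arguments": {"city": "Paris"}}\n</tool_call>'

def parse_tool_calls(text):
    """Extract OpenAI-style JSON function calls from <tool_call> blocks."""
    return [
        json.loads(m)
        for m in re.findall(r"<tool_call>\s*(.*?)\s*</tool_call>", text, re.S)
    ]

calls = parse_tool_calls(raw)
print(calls[0])  # → {'name': 'get_weather', 'arguments': {'city': 'Paris'}}
```

Each parsed dict can then be dispatched to the matching local function and the result appended back to the conversation for the next turn.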