uiuc-convai/CoALM-70B

TEXT GENERATIONConcurrency Cost:4Model Size:70BQuant:FP8Ctx Length:32kPublished:Feb 3, 2025License:cc-by-nc-4.0Architecture:Transformer0.0K Open Weights Cold

CoALM-70B is a 70 billion parameter Conversational Agentic Language Model developed by UIUC Conversational AI LAB and Oumi. Fine-tuned from Llama 3.3 70B Instruct, it integrates Task-Oriented Dialogue (TOD) capabilities with Language Agent (LA) functionalities. The model excels at multi-turn reasoning, complex API usage, and function calling, achieving strong performance across TOD and function-calling benchmarks like MultiWOZ 2.4, BFCL V3, and API-Bank.

Loading preview...

CoALM-70B: Conversational Agentic Language Model

CoALM-70B is a 70 billion parameter model developed by the UIUC Conversational AI LAB and Oumi, building upon the Llama 3.3 70B Instruct architecture. It is specifically designed to unify Task-Oriented Dialogue (TOD) and Language Agent (LA) functionalities, enabling advanced conversational AI with tool use.

Key Capabilities & Features

  • Multi-turn Dialogue Mastery: Handles complex, long-running conversations with accurate state tracking.
  • Advanced Function Calling: Dynamically selects and executes API calls for task completion, demonstrating strong zero-shot generalization.
  • Enhanced ReAct-based Reasoning: Integrates structured reasoning (User-Thought-Action-Observation-Thought-Response) for robust multi-turn interactions with API integrations.
  • Benchmark Performance: Achieves strong results on key conversational evaluation benchmarks, including MultiWOZ 2.4 (TOD), BFCL V3 (LA), and API-Bank (LA), surpassing some proprietary models.

Training & Data

The model was fine-tuned using the CoALM-IT dataset, a multi-task dataset interleaving multi-turn ReAct reasoning with complex API usage. The training process involved distinct stages for TOD, function calling, and ReAct-based fine-tuning, utilizing 8 NVIDIA H100 GPUs for approximately 24 hours.

Good For

  • Developing sophisticated conversational agents requiring both dialogue management and external tool/API interaction.
  • Applications needing robust multi-turn reasoning and function-calling capabilities.
  • Research and development in unified conversational AI and language agents.