watt-ai/watt-tool-70B

License: apache-2.0
Parameters: 70B
Quantization: FP8
Context length: 32768 tokens
Overview

watt-tool-70B is a 70-billion-parameter language model developed by watt-ai, built on the Llama-3.3-70B-Instruct base model. It is specifically engineered for advanced tool usage and multi-turn dialogue, with strong capabilities in understanding and executing complex tasks through external tools.

Key Capabilities

  • Enhanced Tool Usage: The model is fine-tuned for precise and efficient selection and execution of tools, crucial for automating workflows.
  • Multi-Turn Dialogue: It maintains context and effectively utilizes tools across multiple conversational turns, enabling the completion of more intricate tasks.
  • State-of-the-Art Performance: Achieves top performance on the Berkeley Function-Calling Leaderboard (BFCL), validating its proficiency in function calling and tool integration.
  • Foundation Model: Inherits robust language understanding and generation from its Llama-3.3-70B-Instruct base.
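The tool-calling workflow these capabilities describe can be sketched in a few lines. This is a minimal illustration, not the model's documented interface: the JSON tool schema, the system prompt, and the bracketed call format `[fn(arg="value")]` are assumptions for the sketch; consult the official model card for the exact prompt template.

```python
import json
import re

# Hypothetical tool definition in an OpenAI-style JSON schema (an assumption;
# the exact schema expected by watt-tool-70B may differ).
tools = [{
    "name": "get_weather",
    "description": "Get the current weather for a city.",
    "parameters": {
        "type": "object",
        "properties": {
            "city": {"type": "string"},
            "unit": {"type": "string", "enum": ["celsius", "fahrenheit"]},
        },
        "required": ["city"],
    },
}]

# Illustrative system prompt advertising the tools to the model.
system_prompt = (
    "You are an AI assistant that can call external tools.\n"
    "Available tools:\n" + json.dumps(tools, indent=2)
)

messages = [
    {"role": "system", "content": system_prompt},
    {"role": "user", "content": "What's the weather in Paris in celsius?"},
]

def parse_tool_call(text: str):
    """Parse a call like get_weather(city="Paris") into (name, kwargs).

    Assumes simple string-valued keyword arguments; a production parser
    would need to handle nested and non-string values.
    """
    match = re.search(r"(\w+)\((.*)\)", text, re.DOTALL)
    if not match:
        return None
    name, arg_str = match.group(1), match.group(2)
    kwargs = dict(re.findall(r'(\w+)\s*=\s*"([^"]*)"', arg_str))
    return name, kwargs

# Example model output (illustrative only, not a real completion):
raw = '[get_weather(city="Paris", unit="celsius")]'
name, kwargs = parse_tool_call(raw)
print(name, kwargs)
```

In practice the `messages` list would be rendered through the model's chat template and sent for generation; the parsed `(name, kwargs)` pair is then dispatched to the matching function, and the result is appended to the conversation for the next turn.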

Training Methodology

The model was trained with supervised fine-tuning (SFT) on a specialized dataset focused on tool usage and multi-turn interactions, using Chain-of-Thought (CoT) techniques to synthesize high-quality multi-turn dialogue data. The methodology draws on the paper "Direct Multi-Turn Preference Optimization for Language Agents", combining SFT with DMPO (Direct Multi-Turn Preference Optimization) to boost performance on multi-turn agent tasks.

Good For

  • AI Workflow Building: Ideal for platforms requiring AI to build and manage complex workflows, such as Lupan.
  • Function Calling Applications: Excellent for scenarios where precise and efficient function calling is critical.
  • Multi-Turn Conversational Agents: Suitable for developing agents that need to interact with users and tools over extended dialogues.