watt-ai/watt-tool-8B

Parameters: 8B
Precision: FP8
Context length: 32768 tokens
Released: Dec 19, 2024
License: apache-2.0
Source: Hugging Face
Overview

watt-tool-8B is an 8-billion-parameter language model fine-tuned from Llama-3.1-8B-Instruct, with a 32768-token context window. It is optimized for complex tool usage and multi-turn dialogue scenarios. The model achieves state-of-the-art performance on the Berkeley Function-Calling Leaderboard (BFCL), demonstrating strong function calling and tool orchestration.
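Function-calling models like this one are typically driven by structured tool schemas embedded in the prompt. A minimal sketch of that setup, assuming an OpenAI-style JSON schema and a generic system-prompt layout (the exact prompt format watt-tool-8B expects may differ; consult its model card):

```python
import json

# Hypothetical tool definition in OpenAI-style JSON schema.
# The schema shape watt-tool-8B was trained on may differ.
get_weather_tool = {
    "name": "get_weather",
    "description": "Get the current weather for a city.",
    "parameters": {
        "type": "object",
        "properties": {
            "city": {"type": "string", "description": "City name"},
        },
        "required": ["city"],
    },
}

def build_messages(user_query, tools):
    """Embed the tool schemas in a system prompt and append the user turn."""
    system = (
        "You are a helpful assistant with access to the following tools. "
        "When a tool is needed, respond with a function call.\n"
        + json.dumps(tools, indent=2)
    )
    return [
        {"role": "system", "content": system},
        {"role": "user", "content": user_query},
    ]

messages = build_messages("What's the weather in Paris?", [get_weather_tool])
```

The resulting `messages` list is what would be passed to the model's chat template at inference time.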

Key Capabilities

  • Enhanced Tool Usage: Precisely selects and executes tools based on user requests.
  • Multi-Turn Dialogue: Maintains context and effectively utilizes tools across multiple conversational turns, enabling the completion of more intricate tasks.
  • High Performance: Demonstrated top-tier results on the BFCL, validating its capabilities in agentic workflows.
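On the application side, "selecting and executing tools" usually reduces to parsing the call the model emits and dispatching it against a registry of real functions. An illustrative sketch, assuming a Python-style call syntax (this is not necessarily watt-tool-8B's exact output format):

```python
import ast

# Hypothetical tool registry; a real application registers its own functions.
TOOLS = {
    "get_weather": lambda city: f"Sunny in {city}",
    "add": lambda a, b: a + b,
}

def dispatch(call_text):
    """Parse a call like get_weather(city='Paris') and execute the matching tool."""
    node = ast.parse(call_text, mode="eval").body
    if not isinstance(node, ast.Call) or node.func.id not in TOOLS:
        raise ValueError(f"Unknown tool call: {call_text}")
    # literal_eval keeps argument parsing safe (no arbitrary code execution).
    args = [ast.literal_eval(a) for a in node.args]
    kwargs = {kw.arg: ast.literal_eval(kw.value) for kw in node.keywords}
    return TOOLS[node.func.id](*args, **kwargs)

print(dispatch("get_weather(city='Paris')"))  # Sunny in Paris
```

Parsing with `ast` rather than `eval` means a malformed or malicious model output raises an error instead of executing arbitrary code.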

Training Methodology

The model was trained with supervised fine-tuning (SFT) on a specialized dataset tailored for tool usage and multi-turn interactions, using Chain-of-Thought (CoT) techniques to synthesize high-quality multi-turn dialogue data. The training process draws on principles from the paper "Direct Multi-Turn Preference Optimization for Language Agents", combining SFT with DMPO (Direct Multi-Turn Preference Optimization) to boost performance on multi-turn agent tasks.
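For intuition, DMPO builds on the DPO family of preference losses, which reward the policy for widening its preference margin between a chosen and a rejected response relative to a frozen reference model; DMPO extends this idea to multi-turn trajectories. A toy computation of the standard single-turn DPO loss on made-up log-probabilities (the numbers are illustrative only; see the DMPO paper for the multi-turn formulation):

```python
import math

def dpo_loss(logp_chosen, logp_rejected, ref_chosen, ref_rejected, beta=0.1):
    """Standard DPO loss: -log sigmoid(beta * (policy margin - reference margin))."""
    margin = (logp_chosen - ref_chosen) - (logp_rejected - ref_rejected)
    return -math.log(1.0 / (1.0 + math.exp(-beta * margin)))

# Illustrative log-probs: the policy prefers the chosen response
# more strongly than the reference model does, so the loss is low.
loss = dpo_loss(logp_chosen=-5.0, logp_rejected=-9.0,
                ref_chosen=-6.0, ref_rejected=-8.0, beta=0.1)
```

When the policy's margin matches the reference's, the loss sits at log 2; it falls as the policy learns to favor the chosen trajectory.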

Good For

  • Developing AI workflow-building tools, similar to platforms such as Lupan and Coze.
  • Applications requiring robust function calling and tool execution.
  • Building conversational agents that need to interact with external tools over extended dialogues.
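A conversational agent built on such a model typically alternates model turns and tool turns, appending each tool result to the history so the model can use it on the next turn. A minimal loop with a mocked model stand-in (replace `mock_model` with real inference against watt-tool-8B; the message roles shown are an assumption, not the model's required format):

```python
def mock_model(history):
    """Stand-in for model inference: requests a tool once, then answers."""
    if not any(m["role"] == "tool" for m in history):
        return {"role": "assistant", "tool_call": "lookup('shipping status')"}
    return {"role": "assistant", "content": "Your package arrives tomorrow."}

def run_tool(call):
    # Hypothetical tool execution; a real agent would dispatch on the call.
    return {"role": "tool", "content": f"result of {call}"}

def agent_loop(user_message, max_turns=4):
    """Alternate model and tool turns until the model emits a final answer."""
    history = [{"role": "user", "content": user_message}]
    for _ in range(max_turns):
        reply = mock_model(history)
        history.append(reply)
        if "tool_call" not in reply:
            break  # final answer reached
        history.append(run_tool(reply["tool_call"]))
    return history

history = agent_loop("Where is my package?")
```

The `max_turns` cap is a simple safeguard against a model that keeps requesting tools indefinitely.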