Overview
watt-tool-8B is an 8-billion-parameter language model fine-tuned from LLaMa-3.1-8B-Instruct, with a context length of 32,768 tokens. It is optimized for complex tool usage and multi-turn dialogue, and achieves state-of-the-art performance on the Berkeley Function-Calling Leaderboard (BFCL), demonstrating strong function calling and tool orchestration.
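A minimal inference sketch is shown below, assuming the model is available under a Hugging Face repo id such as `watt-ai/watt-tool-8B` and follows the standard `transformers` chat-template workflow; the repo id, system prompt, and tool-schema format are assumptions and may differ from the actual model card instructions.

```python
# Minimal sketch: single-turn function calling with transformers.
# The repo id and tool-schema wording are assumptions; adapt them to the model card.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "watt-ai/watt-tool-8B"  # assumed repo id
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype="auto", device_map="auto")

# Hypothetical tool description passed to the model via the system prompt.
tools = """[{"name": "get_weather",
             "description": "Get the current weather for a city",
             "parameters": {"city": {"type": "string", "required": true}}}]"""

messages = [
    {"role": "system", "content": f"You are a helpful assistant with access to these tools:\n{tools}"},
    {"role": "user", "content": "What's the weather in Berlin right now?"},
]

inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=128)
# The model is expected to emit a tool call such as get_weather(city="Berlin").
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```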
Key Capabilities
- Enhanced Tool Usage: Accurately selects the appropriate tools and generates the corresponding calls based on user requests.
- Multi-Turn Dialogue: Maintains context and uses tools effectively across multiple conversational turns, enabling the completion of more intricate tasks (see the sketch after this list).
- High Performance: Demonstrated top-tier results on the BFCL, validating its capabilities in agentic workflows.
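To illustrate the multi-turn flow, here is a hedged sketch of how a conversation with tool results fed back to the model might be structured. The role names, tool names (`find_restaurants`, `make_reservation`), and tool-result format are illustrative assumptions, not the model's documented protocol; check the chat template on the model card.

```python
# Sketch of a multi-turn exchange: the assistant's tool call is executed by the
# application, and the result is appended as a new turn before the next request.
messages = [
    {"role": "system", "content": f"You are a helpful assistant with access to these tools:\n{tools}"},
    {"role": "user", "content": "Book me a table for two in Berlin tonight."},
    # Turn 1: the model decides which tool to call and with what arguments.
    {"role": "assistant", "content": 'find_restaurants(city="Berlin", party_size=2)'},
    # The application executes the call and returns the result to the model.
    {"role": "tool", "content": '[{"name": "Ristorante Roma", "available": true}]'},
    # Turn 2: the model uses the tool output and earlier context to respond,
    # or to chain another call such as make_reservation(...).
    {"role": "user", "content": "Great, book the first one at 7pm."},
]
```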
Training Methodology
The model was trained with supervised fine-tuning (SFT) on a specialized dataset tailored to tool usage and multi-turn interactions, using Chain-of-Thought (CoT) techniques to synthesize high-quality multi-turn dialogue data. The training process draws on principles from the paper "Direct Multi-Turn Preference Optimization for Language Agents," combining SFT with DMPO (Direct Multi-Turn Preference Optimization) to boost performance on multi-turn agent tasks.
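As a rough illustration only (not the authors' training code), a DPO-style preference loss over whole multi-turn trajectories might look like the sketch below. Scoring each trajectory by the summed log-probabilities of its assistant tokens is a simplification of the DMPO paper's formulation, which additionally handles turn-level structure and length normalization.

```python
import torch
import torch.nn.functional as F

def multi_turn_preference_loss(policy_chosen_logps, policy_rejected_logps,
                               ref_chosen_logps, ref_rejected_logps, beta=0.1):
    """DPO-style loss over full multi-turn trajectories (simplified sketch).

    Each *_logps tensor holds, per trajectory, the summed log-probabilities of
    the assistant tokens across all turns, under the policy or reference model.
    """
    chosen_rewards = beta * (policy_chosen_logps - ref_chosen_logps)
    rejected_rewards = beta * (policy_rejected_logps - ref_rejected_logps)
    # Preferred trajectories should out-score dispreferred ones.
    return -F.logsigmoid(chosen_rewards - rejected_rewards).mean()
```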
Good For
- Developing AI workflow-building tools, similar to platforms like Lupan and Coze.
- Applications requiring robust function calling and tool execution.
- Building conversational agents that need to interact with external tools over extended dialogues.