Overview
watt-tool-70B is a 70-billion-parameter language model developed by watt-ai, built on the Llama-3.3-70B-Instruct base model. It is engineered specifically for advanced tool usage and multi-turn dialogue, and excels at understanding and executing complex tasks through external tools.
Key Capabilities
- Enhanced Tool Usage: The model is fine-tuned for precise and efficient tool selection and execution, which is crucial for automating workflows (see the usage sketch after this list).
- Multi-Turn Dialogue: It maintains context and effectively utilizes tools across multiple conversational turns, enabling the completion of more intricate tasks.
- State-of-the-Art Performance: Achieves top performance on the Berkeley Function-Calling Leaderboard (BFCL), validating its proficiency in function calling and tool integration.
- Foundation Model: Inherits robust language understanding and generation from its Llama-3.3-70B-Instruct base.
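The sketch below shows one way to prompt the model for a tool call with the Hugging Face `transformers` library. It is a minimal illustration, not the canonical recipe: the tool-description format embedded in the system message and the expected output format are assumptions here, so consult the official model card for the exact prompt template the model was trained with.

```python
# Hypothetical usage sketch: load watt-tool-70B via transformers and prompt it
# with one tool definition. The tool schema and system-prompt wording below are
# illustrative assumptions, not the model's documented prompt format.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "watt-ai/watt-tool-70B"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, device_map="auto", torch_dtype="auto"
)

# One tool described as a JSON schema (assumed format, for illustration only).
tools = [{
    "name": "get_weather",
    "description": "Get the current weather for a city.",
    "parameters": {
        "type": "object",
        "properties": {"city": {"type": "string", "description": "City name"}},
        "required": ["city"],
    },
}]

messages = [
    {"role": "system", "content": f"You can call these tools: {tools}"},
    {"role": "user", "content": "What's the weather in Berlin right now?"},
]

# Render the chat template and generate; the model should answer with a
# function call when one of the listed tools applies.
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)
outputs = model.generate(inputs, max_new_tokens=256)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```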
Training Methodology
The model was trained with supervised fine-tuning (SFT) on a specialized dataset focused on tool usage and multi-turn interactions, using Chain-of-Thought (CoT) techniques to synthesize high-quality multi-turn dialogue data. The methodology draws on the paper "Direct Multi-Turn Preference Optimization for Language Agents", combining SFT with Direct Multi-Turn Preference Optimization (DMPO) to boost performance on multi-turn agent tasks.
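To make the preference-optimization step concrete, here is a simplified DPO-style loss over whole multi-turn trajectories. This is a hedged sketch, not the paper's exact objective: the full DMPO formulation adds refinements such as state-token masking and length normalization, and the assumption here is that per-trajectory log-probabilities have already been summed over assistant tokens only (tool calls and replies), with user and tool-result tokens masked out.

```python
# Simplified sketch of a DPO-style preference loss over multi-turn trajectories.
# Illustration only: the actual DMPO objective in the cited paper differs in
# its handling of state tokens and trajectory lengths.
import torch
import torch.nn.functional as F

def multi_turn_preference_loss(
    policy_chosen_logps: torch.Tensor,    # log-probs summed over the assistant
    policy_rejected_logps: torch.Tensor,  # tokens of each trajectory, shape (batch,)
    ref_chosen_logps: torch.Tensor,       # the same quantities under the frozen
    ref_rejected_logps: torch.Tensor,     # reference (SFT) model
    beta: float = 0.1,
) -> torch.Tensor:
    # Implicit reward of each trajectory: beta * (log pi - log pi_ref).
    chosen_rewards = beta * (policy_chosen_logps - ref_chosen_logps)
    rejected_rewards = beta * (policy_rejected_logps - ref_rejected_logps)
    # Bradley-Terry preference loss: push preferred trajectories above rejected ones.
    return -F.logsigmoid(chosen_rewards - rejected_rewards).mean()
```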
Good For
- AI Workflow Building: Ideal for platforms requiring AI to build and manage complex workflows, such as Lupan.
- Function Calling Applications: Excellent for scenarios where precise and efficient function calling is critical.
- Multi-Turn Conversational Agents: Suitable for building agents that interact with users and tools over extended dialogues (a minimal agent-loop sketch follows below).
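As referenced above, a minimal multi-turn agent loop might look like the following. The JSON call format, the `generate` wrapper, and the `role: "tool"` message convention are illustrative assumptions rather than the model's documented interface.

```python
# Hypothetical agent loop: the model emits a function call, the host executes
# it, and the result is appended as a new message so the model can use it on
# the next turn. Call format and parsing are illustrative assumptions.
import json

def run_agent(generate, tools, user_message, max_turns=5):
    """`generate(messages) -> str` wraps the model; `tools` maps name -> callable."""
    messages = [{"role": "user", "content": user_message}]
    reply = ""
    for _ in range(max_turns):
        reply = generate(messages)
        messages.append({"role": "assistant", "content": reply})
        try:
            # Assume the model emits {"name": ..., "arguments": {...}} for a call.
            call = json.loads(reply)
        except json.JSONDecodeError:
            return reply  # plain-text answer: the dialogue is done
        result = tools[call["name"]](**call["arguments"])
        messages.append({"role": "tool", "content": json.dumps(result)})
    return reply
```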