watt-ai/watt-tool-70B
watt-tool-70B is a 70 billion parameter language model developed by watt-ai, fine-tuned from LLaMa-3.3-70B-Instruct. It is specifically optimized for complex tool usage and multi-turn dialogue scenarios, achieving state-of-the-art performance on the Berkeley Function-Calling Leaderboard. This model excels at understanding user requests, selecting appropriate tools, and executing them across multiple conversational turns, making it ideal for AI workflow building platforms.
Loading preview...
Overview
watt-tool-70B is a 70 billion parameter language model developed by watt-ai, built upon the LLaMa-3.3-70B-Instruct base model. It is specifically engineered for advanced tool usage and multi-turn dialogue, demonstrating superior capabilities in understanding and executing complex tasks through external tools.
Key Capabilities
- Enhanced Tool Usage: The model is fine-tuned for precise and efficient selection and execution of tools, crucial for automating workflows.
- Multi-Turn Dialogue: It maintains context and effectively utilizes tools across multiple conversational turns, enabling the completion of more intricate tasks.
- State-of-the-Art Performance: Achieves top performance on the Berkeley Function-Calling Leaderboard (BFCL), validating its proficiency in function calling and tool integration.
- Foundation Model: Inherits robust language understanding and generation from its LLaMa-3.3-70B-Instruct base.
Training Methodology
The model was trained using supervised fine-tuning on a specialized dataset focused on tool usage and multi-turn interactions. This process incorporated CoT (Chain-of-Thought) techniques to synthesize high-quality multi-turn dialogue data. The training methodology is inspired by principles from the paper "Direct Multi-Turn Preference Optimization for Language Agents" and utilizes SFT (Supervised Fine-Tuning) and DMPO (Direct Multi-Turn Preference Optimization) to boost performance in multi-turn agent tasks.
Good For
- AI Workflow Building: Ideal for platforms requiring AI to build and manage complex workflows, such as Lupan.
- Function Calling Applications: Excellent for scenarios where precise and efficient function calling is critical.
- Multi-Turn Conversational Agents: Suitable for developing agents that need to interact with users and tools over extended dialogues.