dongguanting/Tool-Star-Qwen-7B

Hugging Face
Text Generation
Concurrency Cost: 1 | Model Size: 7.6B | Quant: FP8 | Ctx Length: 32k | Published: Jun 30, 2025 | License: MIT | Architecture: Transformer | Open Weights

Tool-Star-Qwen-7B is a Qwen2.5-7B-Instruct based model developed by dongguanting and trained with the Tool-Star framework. It is designed for multi-tool collaborative reasoning: the model autonomously invokes multiple external tools during stepwise reasoning, which makes it suitable for complex problem-solving tasks that require tool integration.


Overview

Tool-Star-Qwen-7B is a specialized language model built on Qwen2.5-7B-Instruct and developed by dongguanting. It is the official checkpoint trained with the Tool-Star framework, which targets multi-tool collaborative reasoning in LLMs. The training method is described in the paper "Tool-Star: Empowering LLM-Brained Multi-Tool Reasoner via Reinforcement Learning" (Hugging Face paper page).
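
To make "multi-tool collaborative reasoning" concrete, here is a minimal sketch of the kind of inference loop such a model runs inside. It is not the Tool-Star implementation. The tag names (`<search>`, `<python>`, `<result>`, `<answer>`), the `run_with_tools` helper, and the stubbed model and tools are all assumptions made for illustration; this card does not specify the actual call format.

```python
import re
from typing import Callable, Dict, List

# Hypothetical tag format (an assumption, not taken from this card):
#   <search>query</search>, <python>code</python>, <answer>final</answer>
TOOL_CALL = re.compile(r"<(?P<name>search|python)>(?P<arg>.*?)</(?P=name)>", re.DOTALL)
ANSWER = re.compile(r"<answer>(.*?)</answer>", re.DOTALL)


def run_with_tools(
    generate: Callable[[str], str],
    tools: Dict[str, Callable[[str], str]],
    prompt: str,
    max_steps: int = 4,
) -> str:
    """Alternate between model steps and tool calls until an answer appears."""
    context = prompt
    for _ in range(max_steps):
        step = generate(context)          # one reasoning step from the model
        context += step
        answer = ANSWER.search(step)
        if answer:                        # the model committed to a final answer
            return answer.group(1).strip()
        call = TOOL_CALL.search(step)
        if call is None:                  # no tool call and no answer: stop
            break
        result = tools[call.group("name")](call.group("arg").strip())
        context += f"<result>{result}</result>"  # feed the tool output back
    return ""


def scripted_model(steps: List[str]) -> Callable[[str], str]:
    """Stand-in for the real model: replays a fixed list of steps."""
    it = iter(steps)
    return lambda context: next(it)


def multiply(expr: str) -> str:
    """Toy 'python' tool: evaluates 'a * b' for two integers."""
    a, b = (int(part) for part in expr.split("*"))
    return str(a * b)


demo_tools = {
    "search": lambda query: "21",         # toy search tool with a canned result
    "python": multiply,
}
demo_steps = [
    "I need a base value. <search>base value</search>",
    "Now double it. <python>21 * 2</python>",
    "<answer>42</answer>",
]
final = run_with_tools(scripted_model(demo_steps), demo_tools, "Question: ...")
print(final)
```

In a real deployment the `generate` callable would wrap the model's text-generation call and stop at the closing tag of a tool call, and the tools would be actual search and code-execution backends.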

Key Capabilities

  • Multi-Tool Collaborative Reasoning: Designed to autonomously invoke and integrate multiple external tools during complex, stepwise reasoning processes.
  • Reinforcement Learning (RL) Framework: Utilizes an RL-based approach to empower LLMs with effective tool-use strategies.
  • Data Synthesis Pipeline: Incorporates a novel data synthesis pipeline combining tool-integrated prompting with hint-based sampling to generate high-quality, scalable tool-use trajectories.
  • Two-Stage Training: Employs a cold-start fine-tuning phase to explore reasoning patterns and a multi-tool self-critic RL algorithm with hierarchical reward design for enhanced tool collaboration.
  • Broad Tool Integration: Integrates six distinct types of tools to facilitate diverse problem-solving scenarios.
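
The "hierarchical reward design" mentioned above can be pictured as a reward that checks conditions in order of priority. The sketch below is one plausible reading, not the reward used to train this checkpoint: the `Rollout` fields, the ordering of the checks, and every numeric value (including `multi_tool_bonus`) are hypothetical.

```python
from dataclasses import dataclass
from typing import FrozenSet


@dataclass(frozen=True)
class Rollout:
    """Summary of one sampled trajectory (all fields are illustrative)."""
    format_ok: bool               # output followed the expected tag format
    answer_correct: bool          # final answer matched the reference
    tools_used: FrozenSet[str]    # names of distinct tools that were invoked


def hierarchical_reward(rollout: Rollout, multi_tool_bonus: float = 0.1) -> float:
    """Score a rollout level by level; values are assumptions for illustration."""
    if not rollout.format_ok:
        return -1.0               # level 1: malformed output is penalized outright
    if not rollout.answer_correct:
        return 0.0                # level 2: well-formed but wrong earns nothing
    reward = 1.0                  # level 3: well-formed and correct
    if len(rollout.tools_used) >= 2:
        reward += multi_tool_bonus  # level 4: extra credit for tool collaboration
    return reward


examples = [
    Rollout(False, True, frozenset({"search"})),
    Rollout(True, False, frozenset({"search", "python"})),
    Rollout(True, True, frozenset({"python"})),
    Rollout(True, True, frozenset({"search", "python"})),
]
for rollout in examples:
    print(sorted(rollout.tools_used), hierarchical_reward(rollout))
```

The point of the layering is that the collaboration bonus only applies once format and correctness are satisfied, so a policy cannot earn reward by calling tools gratuitously on wrong answers.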

Good For

  • Complex Reasoning Tasks: Ideal for applications requiring LLMs to break down problems and utilize external tools for solutions.
  • Automated Tool Invocation: Suitable for scenarios where LLMs need to intelligently select and use various tools without explicit human guidance.
  • Research in LLM Tool Use: Provides a strong baseline and framework for further research into multi-tool reasoning and RL-driven LLM enhancements.