Name: dongguanting/Tool-Star-Qwen-3B API
Brand: Featherless.ai
Price: 10.00 USD
Availability: InStock
Author: dongguanting

Overview

Tool-Star-Qwen-3B is a 3.1-billion-parameter language model developed by dongguanting and built on the Qwen2.5-3B-Instruct base model. It is trained with the Tool-Star framework, which targets multi-tool collaborative reasoning. The framework integrates six types of external tools and uses a two-stage training process: cold-start fine-tuning, followed by a multi-tool self-critic reinforcement learning (RL) algorithm with hierarchical reward design.

Key Capabilities

Autonomous Tool Invocation: Designed to autonomously invoke and integrate multiple external tools during complex reasoning tasks.
Reinforcement Learning for Reasoning: Leverages an RL-based framework to empower effective multi-tool collaborative reasoning.
Data Synthesis Pipeline: Utilizes a general tool-integrated reasoning data synthesis pipeline, combining tool-integrated prompting with hint-based sampling to generate tool-use trajectories.
Enhanced Collaboration: Employs a two-stage training framework to improve multi-tool collaboration, guided by tool-invocation feedback and reinforced by a multi-tool self-critic RL algorithm.
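
The tool-invocation behavior described above can be pictured as a loop in which the model emits a tool call, a host program runs the tool, and the result is appended to the context before generation resumes. The sketch below is only an illustration of that general pattern under stated assumptions: the `<tool name="...">` and `<result>` tags, the `calculator` and `lookup` tools, and the `scripted_model` stand-in are all hypothetical and are not taken from the Tool-Star framework or the model card.

```python
import re
from typing import Callable, Dict

# Hypothetical tag format for tool calls; the model's real format may differ.
TOOL_CALL = re.compile(r'<tool name="(\w+)">(.*?)</tool>', re.DOTALL)


def calculator(expression: str) -> str:
    """Toy arithmetic tool: accepts only digits, whitespace, and + - * / ( ) ."""
    if not re.fullmatch(r"[\d\s+\-*/().]+", expression):
        return "error: unsupported expression"
    try:
        return str(eval(expression, {"__builtins__": {}}, {}))
    except Exception as exc:
        return f"error: {exc}"


def lookup(key: str) -> str:
    """Toy lookup tool backed by a fixed dictionary."""
    facts = {"base_model": "Qwen2.5-3B-Instruct"}
    return facts.get(key.strip(), "error: not found")


TOOLS: Dict[str, Callable[[str], str]] = {"calculator": calculator, "lookup": lookup}


def run_with_tools(
    generate: Callable[[str], str],
    prompt: str,
    tools: Dict[str, Callable[[str], str]],
    max_rounds: int = 4,
) -> str:
    """Alternate between generation and tool execution until no tool is requested."""
    transcript = prompt
    for _ in range(max_rounds):
        output = generate(transcript)
        transcript += output
        match = TOOL_CALL.search(output)
        if match is None:
            return transcript  # no tool call: treat the output as the final answer
        name, payload = match.group(1), match.group(2)
        tool = tools.get(name)
        result = tool(payload) if tool else f"error: unknown tool {name}"
        transcript += f"\n<result>{result}</result>\n"
    return transcript


def scripted_model(transcript: str) -> str:
    """Stand-in for a real model call: requests one tool, then answers from its result."""
    results = re.findall(r"<result>(.*?)</result>", transcript)
    if not results:
        return '<tool name="calculator">6 * 7</tool>'
    return f"Final answer: {results[-1]}"


if __name__ == "__main__":
    print(run_with_tools(scripted_model, "What is 6 * 7?\n", TOOLS))
```

In practice `scripted_model` would be replaced by a call to the hosted model, and the tag pattern would need to match whatever format the model was actually trained to emit.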

Good For

Complex Problem Solving: Ideal for applications requiring the model to break down problems and utilize external tools for intermediate steps.
Research in Tool-Augmented LLMs: A valuable resource for researchers exploring reinforcement learning and multi-tool integration in large language models.
Reasoning Benchmarks: Reported as effective on more than 10 challenging reasoning benchmarks, covering tasks that require logical deduction and the use of external resources.
