distillabs/tft-benchmark-s1-tft-Qwen3-1.7B
distillabs/tft-benchmark-s1-tft-Qwen3-1.7B is a 1.7-billion-parameter Qwen3 model fine-tuned by Distil Labs for multi-turn tool calling. It was developed as part of the TFT (Training from Traces) Benchmark, specifically for the S1 Baseline scenario, which trains on clean production traces. The model executes tool-use instructions accurately across conversational turns, with strong performance on restaurant search and reservation tasks. Its training pipeline, which combines trace filtering, committee relabeling, and synthetic data generation, is designed for robust tool calling.
Model Overview
The distillabs/tft-benchmark-s1-tft-Qwen3-1.7B is a 1.7-billion-parameter Qwen3 model developed by Distil Labs. It has been fine-tuned for multi-turn tool calling, a critical capability for conversational AI agents. The model is part of the TFT (Training from Traces) Benchmark, which evaluates approaches to training Small Language Models (SLMs) from production traces.
Key Capabilities
- Multi-turn Tool Calling: Specialized in understanding and executing tool-use instructions across multiple conversational turns.
- TFT Pipeline Training: Trained with a pipeline that combines trace filtering, committee relabeling by multiple LLMs, and synthetic data generation (the relabeling step is sketched after this list).
- Benchmark Performance: Achieved an LLM-as-a-judge score of 0.866 and a `staged_tool_call` score of 0.765 in the S1 Baseline scenario of the TFT benchmark, which uses clean production traces.
- Target Tools: Proficient with the restaurant search (`FindRestaurants`) and reservation (`ReserveRestaurant`) tools from the Schema-Guided Dialogue (SGD) dataset; see the usage sketch after this list.
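The card does not publish the internals of the TFT pipeline, but the committee-relabeling step can be pictured as a majority vote among several judge LLMs over each production trace. The sketch below is a minimal illustration under that assumption; `judges` is a hypothetical stand-in for whatever judge models the pipeline actually uses.

```python
from collections import Counter
from typing import Callable, Iterable, Iterator

def committee_relabel(trace: str, judges: list[Callable[[str], str]]) -> str | None:
    """Return the committee's majority label for a trace, or None if the vote splits."""
    votes = Counter(judge(trace) for judge in judges)
    label, count = votes.most_common(1)[0]
    # Accept the relabel only on a strict majority of judges.
    return label if count > len(judges) // 2 else None

def filter_and_relabel(
    traces: Iterable[str], judges: list[Callable[[str], str]]
) -> Iterator[tuple[str, str]]:
    """Drop traces the committee cannot agree on; keep the rest with new labels."""
    for trace in traces:
        label = committee_relabel(trace, judges)
        if label is not None:
            yield trace, label
```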
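For inference, the model should work with the standard Hugging Face transformers tool-calling flow, passing JSON-schema tool definitions through the chat template. This is a minimal sketch: the `FindRestaurants` parameter names below are illustrative assumptions, not the exact schemas used in training.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "distillabs/tft-benchmark-s1-tft-Qwen3-1.7B"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

# JSON-schema tool definitions in the standard transformers format.
# Parameter names are assumptions for illustration.
tools = [
    {
        "type": "function",
        "function": {
            "name": "FindRestaurants",
            "description": "Search for restaurants by city and cuisine.",
            "parameters": {
                "type": "object",
                "properties": {
                    "city": {"type": "string"},
                    "cuisine": {"type": "string"},
                },
                "required": ["city", "cuisine"],
            },
        },
    },
    # ReserveRestaurant would be declared the same way.
]

messages = [{"role": "user", "content": "Find me an Italian place in San Jose."}]
inputs = tokenizer.apply_chat_template(
    messages, tools=tools, add_generation_prompt=True, return_tensors="pt"
).to(model.device)
outputs = model.generate(inputs, max_new_tokens=256)
# Decode only the newly generated tokens (the model's tool call or reply).
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```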
Good For
- Developing Tool-Calling Agents: Ideal for applications that need an SLM to accurately interpret and execute tool functions in multi-turn dialogues; a multi-turn continuation sketch follows this list.
- Benchmarking Tool-Use Performance: Serves as a strong baseline model for evaluating tool-calling capabilities, particularly when trained with the TFT pipeline.
- Research into SLM Training: Useful for researchers exploring advanced fine-tuning techniques for SLMs using production traces and synthetic data generation.
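To continue a dialogue across turns, the assistant's tool call and the executed tool's result are appended to the message history before generating again. The sketch below extends the usage example above; the message shapes follow transformers' chat-template tool-calling conventions, and the tool result payload is invented purely for illustration.

```python
import json

# Record the assistant's tool call in the transformers tool-calling format,
# then attach the executed tool's result as a "tool" message.
messages.append({
    "role": "assistant",
    "content": "",
    "tool_calls": [{
        "type": "function",
        "function": {
            "name": "FindRestaurants",
            "arguments": {"city": "San Jose", "cuisine": "Italian"},
        },
    }],
})
messages.append({
    "role": "tool",
    "name": "FindRestaurants",
    "content": json.dumps([{"name": "Trattoria Roma", "rating": 4.5}]),  # invented result
})

# Re-render the full history and generate the next assistant turn.
inputs = tokenizer.apply_chat_template(
    messages, tools=tools, add_generation_prompt=True, return_tensors="pt"
).to(model.device)
outputs = model.generate(inputs, max_new_tokens=256)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```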