Tool Calling

How to use tool and function calling

Function calling enables AI models to interact with external systems and execute specific tasks during conversations. Rather than calling functions directly, the model identifies which functions to invoke and provides the necessary parameters. Your application then executes these functions and returns the results to the model, which incorporates them into a comprehensive response.

Featherless.ai offers seamless OpenAI-compatible function calling that works across our supported models, simplifying the integration of external capabilities into your AI-powered applications. This means you can use the same function calling syntax you're familiar with from OpenAI's API, making migration and integration straightforward.

Supported Models

Currently, function calling is natively supported on a limited set of model families. These models have been specifically trained to understand function schemas and to generate well-formed function calls from user queries. For models outside this set, see the workaround approach at the end of this guide.

How Function Calling Works

Function calling with Featherless.ai follows a structured conversation flow that ensures reliable execution of external tools. The process involves three main phases: sending the initial request with available tools, receiving the model's function call decisions, and executing those functions before getting the final response.

Request Structure Overview

The entire function calling process follows a three-step pattern that maintains conversation context while executing external operations. Here's how each phase works:

Step 1: Initial Request with Tools

First, you send your user's message along with definitions of the functions available to the model. The model analyzes the user's intent and determines which functions, if any, would help answer the query.

POST https://api.featherless.ai/v1/chat/completions
Content-Type: application/json
Authorization: Bearer your-api-key

{
  "model": "Qwen/Qwen3-32B",
  "messages": [
    {
      "role": "user", 
      "content": "What's the weather like in San Francisco?"
    }
  ],
  "tools": [
    {
      "type": "function",
      "function": {
        "name": "get_current_weather",
        "description": "Get the current weather in a given location",
        "parameters": {
          "type": "object",
          "properties": {
            "location": {
              "type": "string",
              "description": "The city and state, e.g. San Francisco, CA"
            }
          },
          "required": ["location"]
        }
      }
    }
  ]
}

The tools array contains all functions the model can choose to call. Each function definition includes its name, description, and a JSON Schema defining its parameters. The model uses this information to understand when and how to use each function.

Step 2: Model Response with Tool Calls

When the model determines that function calls are needed, it responds with structured tool call requests instead of a text response. This allows your application to intercept and execute the necessary functions.

HTTP/1.1 200 OK
Content-Type: application/json

{
  "choices": [
    {
      "message": {
        "role": "assistant",
        "tool_calls": [
          {
            "id": "call_abc123",
            "type": "function",
            "function": {
              "name": "get_current_weather",
              "arguments": "{\"location\": \"San Francisco, CA\"}"
            }
          }
        ]
      }
    }
  ]
}

Each tool call includes a unique ID (which you'll need to reference when returning results) and the function details with properly formatted arguments. The arguments are provided as a JSON string that matches the schema you defined.
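Because the arguments field is a JSON-encoded string rather than a nested object, it must be decoded before use. For example:

```python
import json

# Tool-call arguments arrive as a JSON string, not an object,
# so decode them before passing values to your function.
raw_arguments = '{"location": "San Francisco, CA"}'
parsed = json.loads(raw_arguments)
print(parsed["location"])  # San Francisco, CA
```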

Step 3: Execute and Return Results

After executing the functions locally, you send the results back to the model as part of the conversation history. The model then incorporates these results into its final response to the user.
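Continuing the weather example, the follow-up request body appends the assistant's tool calls and one `tool` message per result, keyed by `tool_call_id` (field values are illustrative):

```
POST https://api.featherless.ai/v1/chat/completions
Content-Type: application/json
Authorization: Bearer your-api-key

{
  "model": "Qwen/Qwen3-32B",
  "messages": [
    {
      "role": "user",
      "content": "What's the weather like in San Francisco?"
    },
    {
      "role": "assistant",
      "tool_calls": [
        {
          "id": "call_abc123",
          "type": "function",
          "function": {
            "name": "get_current_weather",
            "arguments": "{\"location\": \"San Francisco, CA\"}"
          }
        }
      ]
    },
    {
      "role": "tool",
      "tool_call_id": "call_abc123",
      "name": "get_current_weather",
      "content": "{\"temperature\": \"72\", \"unit\": \"fahrenheit\"}"
    }
  ]
}
```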

Python Implementation Example

Let's walk through a complete implementation that demonstrates how to integrate function calling into your Python application. This example shows how to handle weather queries and geometric calculations using Featherless.ai's function calling capabilities.

Basic Setup

First, set up the foundation for making API requests to Featherless.ai. This includes configuring your API credentials and creating a reusable function for chat completions:

A chat completion request to the API
import requests
import json
import math

# Featherless API configuration
API_URL = "https://api.featherless.ai/v1/chat/completions"
API_KEY = "your-api-key-here"
MODEL = "Qwen/Qwen3-32B"

def chat_completion_request(messages, tools=None):
    """Make a chat completion request to the API."""
    headers = {
        "Content-Type": "application/json",
        "Authorization": f"Bearer {API_KEY}"
    }
    
    json_data = {
        "model": MODEL,
        "messages": messages,
        "max_tokens": 4096,
    }
    
    if tools is not None:
        json_data["tools"] = tools
    
    response = requests.post(API_URL, headers=headers, json=json_data)
    response.raise_for_status()
    return response.json()

This function handles both regular chat completions and those with function calling. The tools parameter is optional, allowing you to use the same function for different types of requests.

Define Your Functions

Next, implement the actual functions that will be called. These are regular Python functions that perform the operations your AI assistant needs access to:

Functions to call
def get_current_weather(location: str, unit: str = "fahrenheit") -> dict:
    """Get the current weather in a given location."""
    # This is a mock function - in real use, you'd call a weather API
    result = {
        "location": location,
        "temperature": "72",
        "unit": unit,
        "forecast": ["sunny", "windy"]
    }
    return result

def calculate_area(shape: str, **kwargs) -> dict:
    """Calculate the area of a geometric shape."""
    if shape == "circle":
        radius = kwargs.get("radius", 0)
        area = math.pi * radius ** 2
        result = {"shape": shape, "area": round(area, 2)}
    elif shape == "rectangle":
        length = kwargs.get("length", 0)
        width = kwargs.get("width", 0)
        area = length * width
        result = {"shape": shape, "area": area}
    else:
        result = {"error": f"Unknown shape: {shape}"}
    
    return result

# Available functions mapping
AVAILABLE_FUNCTIONS = {
    "get_current_weather": get_current_weather,
    "calculate_area": calculate_area
}

The AVAILABLE_FUNCTIONS dictionary maps function names to their implementations, making it easy to dynamically execute functions based on the model's requests. In production, these functions would connect to real APIs, databases, or other services.
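As a quick sanity check, the dispatch pattern can be exercised directly. The sketch below repeats the circle branch so it runs standalone:

```python
import math

def calculate_area(shape: str, **kwargs) -> dict:
    # Mirror of the circle branch above, repeated so this sketch
    # is self-contained.
    if shape == "circle":
        radius = kwargs.get("radius", 0)
        return {"shape": shape, "area": round(math.pi * radius ** 2, 2)}
    return {"error": f"Unknown shape: {shape}"}

AVAILABLE_FUNCTIONS = {"calculate_area": calculate_area}

# Look the function up by name, exactly as a model tool call would.
result = AVAILABLE_FUNCTIONS["calculate_area"](shape="circle", radius=5)
print(result)  # {'shape': 'circle', 'area': 78.54}
```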

Tool Definitions

Now define the function schemas that tell the model how to use your functions. These definitions follow the OpenAI function calling format:

Tool Definitions
TOOLS = [
    {
        "type": "function",
        "function": {
            "name": "get_current_weather",
            "description": "Get the current weather in a given location",
            "parameters": {
                "type": "object",
                "properties": {
                    "location": {
                        "type": "string",
                        "description": "The city and state, e.g. San Francisco, CA"
                    },
                    "unit": {
                        "type": "string",
                        "enum": ["celsius", "fahrenheit"],
                        "description": "The unit of temperature"
                    }
                },
                "required": ["location"]
            }
        }
    },
    {
        "type": "function",
        "function": {
            "name": "calculate_area", 
            "description": "Calculate the area of a geometric shape",
            "parameters": {
                "type": "object",
                "properties": {
                    "shape": {
                        "type": "string",
                        "enum": ["circle", "rectangle"],
                        "description": "The shape to calculate area for"
                    },
                    "radius": {
                        "type": "number",
                        "description": "Radius for circle"
                    },
                    "length": {
                        "type": "number", 
                        "description": "Length for rectangle"
                    },
                    "width": {
                        "type": "number",
                        "description": "Width for rectangle"
                    }
                },
                "required": ["shape"]
            }
        }
    }
]

Clear, descriptive function definitions help the model understand when and how to use each tool. The enum fields restrict values to valid options, while required fields ensure essential parameters are always provided.

Execute Function Calls

Create a helper function to safely execute the functions requested by the model:

Calling the functions
def execute_function_call(function_call):
    """Execute a function call and return the result."""
    function_name = function_call["name"]
    function_args = json.loads(function_call["arguments"])
    
    if function_name in AVAILABLE_FUNCTIONS:
        function_to_call = AVAILABLE_FUNCTIONS[function_name]
        result = function_to_call(**function_args)
        return result
    else:
        return {"error": f"Function {function_name} not found"}

This function parses the model's JSON arguments and calls the appropriate Python function. The name check keeps unknown functions from crashing your application; note that malformed argument JSON or arguments that don't match the function's signature would still raise, so production code should guard those paths as well.
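A hardened variant (sketch) also guards against malformed JSON and arguments that don't match the target function's signature:

```python
import json

def execute_function_call_safe(function_call, available_functions):
    """Execute a function call, returning an error dict instead of raising."""
    function_name = function_call["name"]

    # Guard against malformed JSON in the arguments string.
    try:
        function_args = json.loads(function_call["arguments"])
    except json.JSONDecodeError as exc:
        return {"error": f"Invalid JSON arguments: {exc}"}

    function_to_call = available_functions.get(function_name)
    if function_to_call is None:
        return {"error": f"Function {function_name} not found"}

    # Guard against arguments that don't match the function signature.
    try:
        return function_to_call(**function_args)
    except TypeError as exc:
        return {"error": f"Bad arguments for {function_name}: {exc}"}
```

Returning error dicts instead of raising lets you feed the failure back to the model as a tool result, so it can recover or rephrase instead of the request failing outright.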

Complete Conversation Flow

Finally, tie everything together in a conversation handler that manages the full function calling lifecycle:

Run a conversation with function calling
def run_conversation():    
    # Step 1: Send initial request with tools
    messages = [
        {
            "role": "user",
            "content": "What's the weather like in San Francisco and what's the area of a circle with radius 5?"
        }
    ]
    
    response = chat_completion_request(messages, tools=TOOLS)
    response_message = response["choices"][0]["message"]
    tool_calls = response_message.get("tool_calls")
    
    # Step 2: Check if the model wanted to call functions
    if tool_calls:
        # Add the assistant's response to the conversation
        messages.append(response_message)
        
        # Step 3: Execute the function calls
        for tool_call in tool_calls:
            function_response = execute_function_call(tool_call["function"])
            
            # Add function response to the conversation
            messages.append({
                "tool_call_id": tool_call["id"],
                "role": "tool",
                "name": tool_call["function"]["name"],
                "content": json.dumps(function_response)
            })
        
        # Step 4: Get final response from model
        second_response = chat_completion_request(messages)
        final_message = second_response["choices"][0]["message"]["content"]
        print(f"Assistant: {final_message}")
    else:
        # Model responded without calling functions
        print(f"Assistant: {response_message['content']}")

if __name__ == "__main__":
    run_conversation()

This function orchestrates the entire process: sending the initial request, executing any requested functions, and obtaining the final response. The conversation history is maintained throughout, ensuring the model has full context when generating its final answer.

Handling Multiple Function Calls

The model can request multiple function calls in a single response, which is particularly useful for compound queries like the weather-plus-geometry question above. The example handles this automatically by iterating through the tool calls and appending one tool message per call. Models that support parallel tool calling may batch related calls this way rather than requesting them one turn at a time.
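For the combined query in the example, a parallel response might look like this (IDs and values are illustrative):

```
{
  "role": "assistant",
  "tool_calls": [
    {
      "id": "call_001",
      "type": "function",
      "function": {
        "name": "get_current_weather",
        "arguments": "{\"location\": \"San Francisco, CA\"}"
      }
    },
    {
      "id": "call_002",
      "type": "function",
      "function": {
        "name": "calculate_area",
        "arguments": "{\"shape\": \"circle\", \"radius\": 5}"
      }
    }
  ]
}
```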

Best Practices

When implementing function calling:

  1. Keep function descriptions clear and concise - The model relies on these to understand when to use each function

  2. Use descriptive parameter names - This helps the model provide correct arguments

  3. Handle errors gracefully - Always validate function arguments and return meaningful error messages

  4. Maintain conversation context - Include all function calls and responses in the message history

  5. Set appropriate timeouts - External API calls might take time; plan accordingly
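On the last point, a timeout can be added to the request helper from earlier (sketch; the (connect, read) values are illustrative and should be tuned to your workload):

```python
import requests

def chat_completion_with_timeout(url, headers, json_data, timeout=(5, 60)):
    """Like the earlier helper, but fails fast instead of hanging.

    timeout is a (connect, read) pair in seconds; a slow model or
    upstream tool call raises requests.Timeout instead of blocking
    the request thread indefinitely.
    """
    response = requests.post(url, headers=headers, json=json_data, timeout=timeout)
    response.raise_for_status()
    return response.json()
```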

Working with Unsupported Models

Not all models natively support function calling, but you can still achieve similar functionality through careful prompting. For models that don't appear in our supported list, we provide alternative approaches in our Featherless cookbook.

These notebooks demonstrate how to simulate function calling using the OpenAI Python SDK with any model that can follow instructions and output JSON. The key technique involves using the response_format={"type": "json_object"} parameter to ensure valid JSON output, combined with system prompts that instruct the model to output the required JSON structure.
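A minimal sketch of the prompting approach follows; the model name, prompt wording, and the simulated reply are illustrative, not a specific model's guaranteed output:

```python
import json

# System prompt instructing the model to emit only JSON in a
# tool-call-like shape.
SYSTEM_PROMPT = (
    "You are a function-calling assistant. "
    "Reply ONLY with a JSON object of the form "
    '{"function": "<name>", "arguments": {...}}.'
)

# Request body for a model without native tool support
# (model name is illustrative).
payload = {
    "model": "any-instruction-following-model",
    "messages": [
        {"role": "system", "content": SYSTEM_PROMPT},
        {"role": "user", "content": "What's the weather in Paris?"},
    ],
    # Constrains the output to syntactically valid JSON.
    "response_format": {"type": "json_object"},
}

# The model's reply is then parsed just like a native tool call
# (simulated here instead of making a network request):
simulated_reply = '{"function": "get_current_weather", "arguments": {"location": "Paris"}}'
call = json.loads(simulated_reply)
print(call["function"])  # get_current_weather
```

From here, dispatch proceeds exactly as with native tool calls: look up `call["function"]` in your function mapping and invoke it with `call["arguments"]`.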

This approach works well for models with strong instruction-following capabilities, though it may require more careful prompt engineering compared to native function calling support.

Last edited: Aug 7, 2025