Tool Calling
How to use tool and function calling
Function calling enables AI models to interact with external systems and execute specific tasks during conversations. Rather than calling functions directly, the model identifies which functions to invoke and provides the necessary parameters. Your application then executes these functions and returns the results to the model, which incorporates them into a comprehensive response.
Featherless.ai offers seamless OpenAI-compatible function calling that works across our supported models, simplifying the integration of external capabilities into your AI-powered applications. This means you can use the same function calling syntax you're familiar with from OpenAI's API, making migration and integration straightforward.
Supported Models
Currently, function calling is natively supported on the following model families:
These models have been specifically trained to understand function schemas and generate appropriate function calls based on user queries. For models not on this list, see our workaround approach at the end of this guide.
How Function Calling Works
Function calling with Featherless.ai follows a structured conversation flow that ensures reliable execution of external tools. The process involves three main phases: sending the initial request with available tools, receiving the model's function call decisions, and executing those functions before getting the final response.
Request Structure Overview
The entire function calling process follows a three-step pattern that maintains conversation context while executing external operations. Here's how each phase works:
Step 1: Initial Request with Tools
First, you send your user's message along with definitions of the functions available to the model. The model analyzes the user's intent and determines which functions, if any, would help answer the query.
POST https://api.featherless.ai/v1/chat/completions
Content-Type: application/json
Authorization: Bearer your-api-key
{
  "model": "Qwen/Qwen3-32B",
  "messages": [
    {
      "role": "user",
      "content": "What's the weather like in San Francisco?"
    }
  ],
  "tools": [
    {
      "type": "function",
      "function": {
        "name": "get_current_weather",
        "description": "Get the current weather in a given location",
        "parameters": {
          "type": "object",
          "properties": {
            "location": {
              "type": "string",
              "description": "The city and state, e.g. San Francisco, CA"
            }
          },
          "required": ["location"]
        }
      }
    }
  ]
}
The tools array contains all functions the model can choose to call. Each function definition includes its name, description, and a JSON Schema defining its parameters. The model uses this information to understand when and how to use each function.
Step 2: Model Response with Tool Calls
When the model determines that function calls are needed, it responds with structured tool call requests instead of a text response. This allows your application to intercept and execute the necessary functions.
HTTP/1.1 200 OK
Content-Type: application/json
{
  "choices": [
    {
      "message": {
        "role": "assistant",
        "tool_calls": [
          {
            "id": "call_abc123",
            "type": "function",
            "function": {
              "name": "get_current_weather",
              "arguments": "{\"location\": \"San Francisco, CA\"}"
            }
          }
        ]
      }
    }
  ]
}
Each tool call includes a unique ID (which you'll need to reference when returning results) and the function details with properly formatted arguments. The arguments are provided as a JSON string that matches the schema you defined.
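In Python, for example, recovering the structured arguments is a single json.loads call (a minimal illustration; the full dispatch helper appears later in this guide):

import json

# The "arguments" field arrives as a JSON string, not a parsed object
raw_arguments = '{"location": "San Francisco, CA"}'
arguments = json.loads(raw_arguments)
print(arguments["location"])  # San Francisco, CA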
Step 3: Execute and Return Results
After executing the functions locally, you send the results back to the model as part of the conversation history. The model then incorporates these results into its final response to the user.
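Continuing the weather example, the follow-up request might look like this (a sketch: the assistant message with its tool calls, plus a tool message referencing the call ID, are appended to the original conversation):

POST https://api.featherless.ai/v1/chat/completions
Content-Type: application/json
Authorization: Bearer your-api-key

{
  "model": "Qwen/Qwen3-32B",
  "messages": [
    {
      "role": "user",
      "content": "What's the weather like in San Francisco?"
    },
    {
      "role": "assistant",
      "tool_calls": [
        {
          "id": "call_abc123",
          "type": "function",
          "function": {
            "name": "get_current_weather",
            "arguments": "{\"location\": \"San Francisco, CA\"}"
          }
        }
      ]
    },
    {
      "role": "tool",
      "tool_call_id": "call_abc123",
      "name": "get_current_weather",
      "content": "{\"location\": \"San Francisco, CA\", \"temperature\": \"72\", \"forecast\": [\"sunny\", \"windy\"]}"
    }
  ]
}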
Python Implementation Example
Let's walk through a complete implementation that demonstrates how to integrate function calling into your Python application. This example shows how to handle weather queries and geometric calculations using Featherless.ai's function calling capabilities.
Basic Setup
First, set up the foundation for making API requests to Featherless.ai. This includes configuring your API credentials and creating a reusable function for chat completions:
import requests
import json
import math
# Featherless API configuration
API_URL = "https://api.featherless.ai/v1/chat/completions"
API_KEY = "your-api-key-here"
MODEL = "Qwen/Qwen3-32B"
def chat_completion_request(messages, tools=None):
    """Make a chat completion request to the API."""
    headers = {
        "Content-Type": "application/json",
        "Authorization": f"Bearer {API_KEY}"
    }
    json_data = {
        "model": MODEL,
        "messages": messages,
        "max_tokens": 4096,
    }
    if tools is not None:
        json_data["tools"] = tools

    response = requests.post(API_URL, headers=headers, json=json_data)
    response.raise_for_status()
    return response.json()
This function handles both regular chat completions and those with function calling. The tools parameter is optional, allowing you to use the same function for different types of requests.
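For instance, a plain completion with no tools is a one-liner with the helper above (a quick usage sketch):

response = chat_completion_request([
    {"role": "user", "content": "Hello!"}
])
print(response["choices"][0]["message"]["content"])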
Define Your Functions
Next, implement the actual functions that will be called. These are regular Python functions that perform the operations your AI assistant needs access to:
def get_current_weather(location: str, unit: str = "fahrenheit") -> dict:
    """Get the current weather in a given location."""
    # This is a mock function - in real use, you'd call a weather API
    result = {
        "location": location,
        "temperature": "72",
        "unit": unit,
        "forecast": ["sunny", "windy"]
    }
    return result

def calculate_area(shape: str, **kwargs) -> dict:
    """Calculate the area of a geometric shape."""
    if shape == "circle":
        radius = kwargs.get("radius", 0)
        area = math.pi * radius ** 2
        result = {"shape": shape, "area": round(area, 2)}
    elif shape == "rectangle":
        length = kwargs.get("length", 0)
        width = kwargs.get("width", 0)
        area = length * width
        result = {"shape": shape, "area": area}
    else:
        result = {"error": f"Unknown shape: {shape}"}
    return result

# Available functions mapping
AVAILABLE_FUNCTIONS = {
    "get_current_weather": get_current_weather,
    "calculate_area": calculate_area
}
The AVAILABLE_FUNCTIONS dictionary maps function names to their implementations, making it easy to dynamically execute functions based on the model's requests. In production, these functions would connect to real APIs, databases, or other services.
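As one illustration, a production get_current_weather might wrap an HTTP call instead of returning canned data. This sketch assumes a hypothetical weather service; the URL and response fields are placeholders, not a real API:

def get_current_weather(location: str, unit: str = "fahrenheit") -> dict:
    """Fetch current weather from an external service."""
    # Hypothetical endpoint - substitute your weather provider's actual API
    response = requests.get(
        "https://weather.example.com/current",
        params={"q": location, "unit": unit},
        timeout=10,
    )
    response.raise_for_status()
    data = response.json()
    return {"location": location, "temperature": data.get("temp"), "unit": unit}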
Tool Definitions
Now define the function schemas that tell the model how to use your functions. These definitions follow the OpenAI function calling format:
TOOLS = [
    {
        "type": "function",
        "function": {
            "name": "get_current_weather",
            "description": "Get the current weather in a given location",
            "parameters": {
                "type": "object",
                "properties": {
                    "location": {
                        "type": "string",
                        "description": "The city and state, e.g. San Francisco, CA"
                    },
                    "unit": {
                        "type": "string",
                        "enum": ["celsius", "fahrenheit"],
                        "description": "The unit of temperature"
                    }
                },
                "required": ["location"]
            }
        }
    },
    {
        "type": "function",
        "function": {
            "name": "calculate_area",
            "description": "Calculate the area of a geometric shape",
            "parameters": {
                "type": "object",
                "properties": {
                    "shape": {
                        "type": "string",
                        "enum": ["circle", "rectangle"],
                        "description": "The shape to calculate area for"
                    },
                    "radius": {
                        "type": "number",
                        "description": "Radius for circle"
                    },
                    "length": {
                        "type": "number",
                        "description": "Length for rectangle"
                    },
                    "width": {
                        "type": "number",
                        "description": "Width for rectangle"
                    }
                },
                "required": ["shape"]
            }
        }
    }
]
Clear, descriptive function definitions help the model understand when and how to use each tool. The enum fields restrict values to valid options, while required fields ensure essential parameters are always provided.
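Because the parameters block is standard JSON Schema, you can also reuse it to validate the model's arguments before executing anything. A sketch using the third-party jsonschema package (an optional addition, not used elsewhere in this guide):

from jsonschema import validate, ValidationError

def validate_arguments(tool_def: dict, arguments: dict) -> bool:
    """Check model-provided arguments against a tool's JSON Schema."""
    try:
        validate(instance=arguments, schema=tool_def["function"]["parameters"])
        return True
    except ValidationError:
        return False

# validate_arguments(TOOLS[0], {"location": "San Francisco, CA"})  -> True
# validate_arguments(TOOLS[0], {})                                 -> False (missing "location")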
Execute Function Calls
Create a helper function to safely execute the functions requested by the model:
def execute_function_call(function_call):
    """Execute a function call and return the result."""
    function_name = function_call["name"]
    function_args = json.loads(function_call["arguments"])

    if function_name in AVAILABLE_FUNCTIONS:
        function_to_call = AVAILABLE_FUNCTIONS[function_name]
        result = function_to_call(**function_args)
        return result
    else:
        return {"error": f"Function {function_name} not found"}
This function parses the model's JSON arguments and calls the appropriate Python function. Error handling ensures that unknown functions don't crash your application.
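If you want the same guarantee for malformed input, a defensive variant can catch bad JSON and unexpected parameters too, returning an error object instead of raising (optional; the rest of this guide uses the simpler version above):

def execute_function_call_safe(function_call):
    """Like execute_function_call, but never raises on bad input."""
    function_name = function_call["name"]
    if function_name not in AVAILABLE_FUNCTIONS:
        return {"error": f"Function {function_name} not found"}
    try:
        function_args = json.loads(function_call["arguments"])
        return AVAILABLE_FUNCTIONS[function_name](**function_args)
    except (json.JSONDecodeError, TypeError) as exc:
        return {"error": f"Invalid arguments for {function_name}: {exc}"}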
Complete Conversation Flow
Finally, tie everything together in a conversation handler that manages the full function calling lifecycle:
def run_conversation():
    # Step 1: Send initial request with tools
    messages = [
        {
            "role": "user",
            "content": "What's the weather like in San Francisco and what's the area of a circle with radius 5?"
        }
    ]
    response = chat_completion_request(messages, tools=TOOLS)
    response_message = response["choices"][0]["message"]
    tool_calls = response_message.get("tool_calls")

    # Step 2: Check if the model wanted to call functions
    if tool_calls:
        # Add the assistant's response to the conversation
        messages.append(response_message)

        # Step 3: Execute the function calls
        for tool_call in tool_calls:
            function_response = execute_function_call(tool_call["function"])

            # Add function response to the conversation
            messages.append({
                "tool_call_id": tool_call["id"],
                "role": "tool",
                "name": tool_call["function"]["name"],
                "content": json.dumps(function_response)
            })

        # Step 4: Get final response from model
        second_response = chat_completion_request(messages)
        final_message = second_response["choices"][0]["message"]["content"]
        print(f"Assistant: {final_message}")
    else:
        # Model responded without calling functions
        print(f"Assistant: {response_message['content']}")

if __name__ == "__main__":
    run_conversation()
This function orchestrates the entire process: sending the initial request, executing any requested functions, and obtaining the final response. The conversation history is maintained throughout, ensuring the model has full context when generating its final answer.
Handling Multiple Function Calls
The model can request multiple function calls in a single response, which is particularly useful for complex queries: rather than returning one call per turn, it can return all the related calls at once, saving round trips. The example above handles this automatically by iterating through the tool_calls array.
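If the requested calls are independent of one another, you can also execute them concurrently before appending the results. A sketch using Python's standard concurrent.futures module (an optional optimization, not part of the main example):

from concurrent.futures import ThreadPoolExecutor

def execute_tool_calls_concurrently(tool_calls):
    """Run independent tool calls in parallel, preserving their order."""
    with ThreadPoolExecutor(max_workers=4) as pool:
        results = list(pool.map(
            lambda tc: execute_function_call(tc["function"]),
            tool_calls,
        ))
    # Pair each result with its originating call ID for the tool messages
    return [
        {"tool_call_id": tc["id"], "result": result}
        for tc, result in zip(tool_calls, results)
    ]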
Best Practices
When implementing function calling:
Keep function descriptions clear and concise - The model relies on these to understand when to use each function
Use descriptive parameter names - This helps the model provide correct arguments
Handle errors gracefully - Always validate function arguments and return meaningful error messages
Maintain conversation context - Include all function calls and responses in the message history
Set appropriate timeouts - External API calls might take time; plan accordingly (see the sketch below)
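For the timeout point in particular, the requests call inside chat_completion_request can be given an explicit limit so a slow upstream call fails fast instead of hanging your application (a small tweak to the earlier helper; 60 seconds is an illustrative value):

response = requests.post(
    API_URL,
    headers=headers,
    json=json_data,
    timeout=60,  # raises requests.exceptions.Timeout instead of waiting forever
)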
Working with Unsupported Models
Not all models natively support function calling, but you can still achieve similar functionality through careful prompting. For models that don't appear in our supported list, we provide alternative approaches in our Featherless cookbook.
These notebooks demonstrate how to simulate function calling using the OpenAI Python SDK with any model that can follow instructions and output JSON. The key technique is the response_format={"type": "json_object"} parameter, which ensures valid JSON output, combined with a system prompt that instructs the model to emit the required JSON structure.
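A minimal sketch of that pattern, pointing the OpenAI SDK at the Featherless endpoint (the system prompt and output shape here are illustrative; adapt both to your own tools):

import json
from openai import OpenAI

client = OpenAI(
    base_url="https://api.featherless.ai/v1",
    api_key="your-api-key-here",
)

SYSTEM_PROMPT = (
    "You can call one tool: get_current_weather(location). "
    "If the user needs weather data, reply ONLY with JSON like "
    '{"function": "get_current_weather", "arguments": {"location": "..."}}.'
)

response = client.chat.completions.create(
    model="Qwen/Qwen3-32B",
    response_format={"type": "json_object"},
    messages=[
        {"role": "system", "content": SYSTEM_PROMPT},
        {"role": "user", "content": "What's the weather like in San Francisco?"},
    ],
)

call = json.loads(response.choices[0].message.content)
# e.g. {"function": "get_current_weather", "arguments": {"location": "San Francisco, CA"}}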
This approach works well for models with strong instruction-following capabilities, though it may require more careful prompt engineering compared to native function calling support.