Chat template kwargs
Pass model-specific chat template parameters to Featherless API requests.
Overview
chat_template_kwargs is an optional request-body field for passing model-specific chat template parameters to Featherless API requests.
Most users do not need this field. It is mainly useful for models whose chat templates expose extra controls, especially reasoning or "thinking" models where you may want to enable, disable, or budget reasoning.
Use chat_template_kwargs when you want to pass options that are not part of the standard OpenAI-compatible request body.
Where It Is Accepted
chat_template_kwargs can be included in request bodies for:
- POST /v1/chat/completions
- POST /v1/completions
- POST /debug/chat-format
- POST /models/{owner}/{model}/debug/chat-format
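As a minimal Python sketch, the field can be attached to a request for any of the endpoints above using only the standard library. The helper name build_request is hypothetical; the base URL and the FEATHERLESS_API_KEY environment variable are taken from the curl example on this page. The helper only constructs the request, so nothing is sent until the request is passed to urllib.request.urlopen.

```python
import json
import os
import urllib.request

BASE_URL = "https://api.featherless.ai"  # base URL as used in this page's curl example


def build_request(path, body, chat_template_kwargs=None, api_key=None):
    """Build (but do not send) a POST request. Hypothetical helper, not part of any SDK."""
    payload = dict(body)  # shallow copy so the caller's dict is not mutated
    if chat_template_kwargs:
        # The field is optional, so it is only added when there is something to pass.
        payload["chat_template_kwargs"] = dict(chat_template_kwargs)
    key = api_key or os.environ.get("FEATHERLESS_API_KEY", "")
    return urllib.request.Request(
        BASE_URL + path,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


request = build_request(
    "/v1/chat/completions",
    {
        "model": "Qwen/Qwen3-32B",
        "messages": [
            {"role": "user", "content": "Answer briefly: what is a Bloom filter?"}
        ],
    },
    chat_template_kwargs={"enable_thinking": False},
)
# To actually send it: urllib.request.urlopen(request)
```

Keeping chat_template_kwargs as a separate argument makes it easy to reuse the same base body with and without template options.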
Example
{
  "model": "Qwen/Qwen3-32B",
  "messages": [
    {
      "role": "user",
      "content": "Answer briefly: what is a Bloom filter?"
    }
  ],
  "chat_template_kwargs": {
    "enable_thinking": false
  }
}
Supported Fields
Use enable_thinking to request thinking or reasoning behavior for models that support it.
{
  "chat_template_kwargs": {
    "enable_thinking": true
  }
}
Set it to false to request non-thinking/chat mode when the model supports that mode:
{
  "chat_template_kwargs": {
    "enable_thinking": false
  }
}
Not every model supports switching thinking on or off. If the model template does not use this option, it may have no effect.
Some model templates use the older name do_reasoning.
{
  "chat_template_kwargs": {
    "do_reasoning": false
  }
}
Use this only when the model's documentation or template expects do_reasoning.
Use thinking_budget to request a reasoning token budget for templates that support it.
{
  "chat_template_kwargs": {
    "thinking_budget": 1024
  }
}
This value is only meaningful for models that support a thinking or reasoning budget.
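The three fields above can be assembled by a small client-side helper. The following is a sketch only: thinking_kwargs and its parameters are hypothetical names, and whether a template honors enable_thinking and thinking_budget together depends on the model.

```python
def thinking_kwargs(enable=None, budget=None, legacy_name=False):
    """Return a chat_template_kwargs dict. Hypothetical helper for illustration."""
    kwargs = {}
    if enable is not None:
        # Some templates read the older name do_reasoning instead of enable_thinking.
        key = "do_reasoning" if legacy_name else "enable_thinking"
        kwargs[key] = bool(enable)
    if budget is not None:
        # Only meaningful for templates that support a reasoning budget.
        kwargs["thinking_budget"] = int(budget)
    return kwargs


print(thinking_kwargs(enable=False))
# {'enable_thinking': False}
print(thinking_kwargs(enable=False, legacy_name=True))
# {'do_reasoning': False}
print(thinking_kwargs(enable=True, budget=1024))
# {'enable_thinking': True, 'thinking_budget': 1024}
```

Leaving both arguments unset returns an empty dict, which matches the common case where the field is not needed at all.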
Custom Template Variables
chat_template_kwargs can also carry model-specific template variables.
For example, if a model template supports a custom variable like date_string, you can pass it like this:
{
  "model": "meta-llama/Llama-3.1-8B-Instruct",
  "messages": [
    {
      "role": "user",
      "content": "What date is shown to the model?"
    }
  ],
  "chat_template_kwargs": {
    "date_string": "25 May 2026"
  }
}
Unknown keys are accepted, but they only affect output if the selected model's chat template actually uses them.
How Featherless Applies These Values
When Featherless applies a chat template, values are applied in this order:
1. Featherless provides default template context values.
2. Your chat_template_kwargs are added.
3. Featherless applies required system values such as generation prompt and tool settings.
Your kwargs can override default context values, but they cannot override required system values.
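The precedence described above behaves like a layered dictionary merge. The sketch below illustrates that ordering and is not Featherless's implementation; the function name, the default value for enable_thinking, and the add_generation_prompt key are assumptions chosen for the example.

```python
def build_template_context(defaults, user_kwargs, system_values):
    """Merge the three layers so that later layers win. Illustrative only."""
    context = dict(defaults)       # 1. default template context values
    context.update(user_kwargs)    # 2. chat_template_kwargs may override defaults
    context.update(system_values)  # 3. required system values always win
    return context


# All keys and values below are hypothetical.
defaults = {"enable_thinking": True}
user_kwargs = {"enable_thinking": False, "add_generation_prompt": False}
system_values = {"add_generation_prompt": True}

context = build_template_context(defaults, user_kwargs, system_values)
print(context)
# {'enable_thinking': False, 'add_generation_prompt': True}
```

Here the user-supplied enable_thinking replaces the default, while the attempt to change add_generation_prompt is overwritten by the system layer.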
Preview The Rendered Prompt
You can preview how a request will be formatted with the debug endpoint:
curl https://api.featherless.ai/models/Qwen/Qwen3-32B/debug/chat-format \
  -H "Authorization: Bearer $FEATHERLESS_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "Qwen/Qwen3-32B",
    "messages": [
      {
        "role": "user",
        "content": "Answer briefly: what is a Bloom filter?"
      }
    ],
    "chat_template_kwargs": {
      "enable_thinking": false
    }
  }'
The response includes:
- formatted_prompt: the rendered prompt text
- token_count: the prompt token count
- template_info: basic formatting metadata
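One use of these fields is checking whether a given key has any effect: render the same messages with and without the key, then compare the two responses. The sketch below assumes formatted_prompt is a string and token_count is an integer; the helper name is hypothetical, and the sample dictionaries are placeholders standing in for parsed responses, not real API output.

```python
def compare_renderings(baseline, modified):
    """Compare two parsed debug responses (dicts). Hypothetical helper."""
    return {
        # True when the kwargs changed the rendered prompt text.
        "prompt_changed": baseline["formatted_prompt"] != modified["formatted_prompt"],
        # Positive when the modified request renders to more prompt tokens.
        "token_delta": modified["token_count"] - baseline["token_count"],
    }


# Placeholder values for illustration only.
baseline = {"formatted_prompt": "<prompt without kwargs>", "token_count": 10}
modified = {"formatted_prompt": "<prompt with kwargs>", "token_count": 12}

print(compare_renderings(baseline, modified))
# {'prompt_changed': True, 'token_delta': 2}
```

If prompt_changed is False, the selected model's template most likely does not use the key you passed.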
Notes
- Most requests do not need chat_template_kwargs.
- Unknown keys are allowed, but unsupported keys may be ignored.
- thinking_budget only works for models whose templates support a reasoning budget.
- Some reasoning models may not support disabling thinking.
- For privacy, Featherless records only the names of chat_template_kwargs keys for operational visibility, not their values.