
/v1/models

Enumerating models

Models

The models endpoint lists every model in the catalogue, along with each model's current state. It can be called with or without authentication.

HTTP request

GET https://api.featherless.ai/v1/models

Response body

If successful, the response body will contain data with the following structure:

{
  "data": [
    {
      "id": "vicgalle/Roleplay-Llama-3-8B",
      "name": "vicgalle/Roleplay-Llama-3-8B",
      "model_class": "llama3-8b-8k",
      "context_length": 8192,
      "max_completion_tokens": 4096
    },
    // ... and more models
  ]
}

Use the id field as the model value when calling /v1/chat/completions or /v1/completions.
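As a minimal sketch of reading the listing from a client, the snippet below pulls the id values out of a parsed response. It uses only the Python standard library; the helper names (model_ids, fetch_models) are illustrative and not part of any official SDK, and the sample body simply mirrors the structure shown above.

```python
import json
import urllib.request
from typing import List, Optional

MODELS_URL = "https://api.featherless.ai/v1/models"


def model_ids(response_body: dict) -> List[str]:
    """Return the id of every model in a parsed /v1/models response."""
    return [model["id"] for model in response_body.get("data", [])]


def fetch_models(url: str = MODELS_URL, api_key: Optional[str] = None) -> dict:
    """GET the models listing; the Authorization header is sent only if a key is given."""
    request = urllib.request.Request(url)
    if api_key:
        request.add_header("Authorization", f"Bearer {api_key}")
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)


# Sample body mirroring the response structure documented above.
sample = {
    "data": [
        {
            "id": "vicgalle/Roleplay-Llama-3-8B",
            "name": "vicgalle/Roleplay-Llama-3-8B",
            "model_class": "llama3-8b-8k",
            "context_length": 8192,
            "max_completion_tokens": 4096,
        }
    ]
}
print(model_ids(sample))  # ['vicgalle/Roleplay-Llama-3-8B']
```

Calling model_ids(fetch_models()) would do the same against the live endpoint; each returned string can then be passed as the model value of a completion request.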

Model Attributes

Each attribute is listed below as name (type): description.

  • id (string): Use this value as the model when calling a completion endpoint.

  • model_class (string)

  • context_length (integer): Maximum context length (prompt plus completion) supported by the model. Clients that call a completion endpoint with this model and a prompt of this length or longer should expect an error.

  • max_completion_tokens (integer): Maximum amount of output that will be generated in a single call. Calls to completion endpoints with a max_tokens value greater than this should expect it to be clamped to this limit.

  • is_gated (boolean): true if the model is gated on Hugging Face. Users must have ungated the model in their connected Hugging Face account before Featherless will accept a completion call for it. Otherwise the completion call returns 403, and an application should direct the user to the model on Featherless for instructions on clearing the gate. Note that this is a property of the model; it does not report the currently authenticated user’s relationship to that model.

  • available_on_current_plan (boolean): Returned if and only if the request was authenticated. If the value is false, expect a 403 if the user attempts to use this model with a completions endpoint.
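The attribute notes above can be turned into a small pre-flight check before sending a completion request. The following is a sketch only: plan_completion is a hypothetical helper, prompt_tokens is assumed to be counted by the caller with a suitable tokenizer, and the actual server behaviour is whatever the API returns.

```python
def plan_completion(model: dict, prompt_tokens: int, max_tokens: int) -> dict:
    """Summarise what to expect from a completion call, per the attribute notes."""
    return {
        # A prompt of context_length tokens or more should be expected to fail.
        "prompt_fits": prompt_tokens < model["context_length"],
        # Larger max_tokens values are described as being clamped to this limit.
        "effective_max_tokens": min(max_tokens, model["max_completion_tokens"]),
        # is_gated describes the model only; the user may already have cleared the gate.
        "may_need_ungating": bool(model.get("is_gated", False)),
        # The field is absent on unauthenticated requests, so only an
        # explicit False predicts a 403.
        "expect_plan_403": model.get("available_on_current_plan") is False,
    }


model = {
    "id": "vicgalle/Roleplay-Llama-3-8B",
    "context_length": 8192,
    "max_completion_tokens": 4096,
}
plan = plan_completion(model, prompt_tokens=1200, max_tokens=8000)
print(plan["effective_max_tokens"])  # 4096
```

Keeping the gating and plan checks separate reflects the distinction drawn above: is_gated says nothing about the current user, whereas available_on_current_plan is specific to the authenticated request.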

Query parameters

All query parameters are optional.

Multi-value filters can be passed as comma-separated values or repeated query parameters:

/v1/models?family=llama3,qwen2

/v1/models?family=llama3&family=qwen2
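Both encodings can be produced with Python's standard urllib.parse.urlencode. This is a sketch of the two forms shown above; the function names are illustrative only.

```python
from typing import Dict, List
from urllib.parse import urlencode


def comma_separated(path: str, filters: Dict[str, List[str]]) -> str:
    """Join each filter's values with commas: family=llama3,qwen2."""
    params = {key: ",".join(values) for key, values in filters.items()}
    # safe="," keeps commas literal instead of percent-encoding them as %2C.
    return f"{path}?{urlencode(params, safe=',')}"


def repeated(path: str, filters: Dict[str, List[str]]) -> str:
    """Repeat the parameter once per value: family=llama3&family=qwen2."""
    # doseq=True emits one key=value pair per list element.
    return f"{path}?{urlencode(filters, doseq=True)}"


filters = {"family": ["llama3", "qwen2"]}
print(comma_separated("/v1/models", filters))  # /v1/models?family=llama3,qwen2
print(repeated("/v1/models", filters))         # /v1/models?family=llama3&family=qwen2
```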

Common parameters

  • q: Search by model name or id.

  • available_on_current_plan: Pass true or 1 to filter to models available on the authenticated user’s current plan.

  • tags: Require all listed tags to match. This is a legacy filter.

  • page: Page number for pagination.

  • per_page: Number of results per page. Maximum is 1000.
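A client that wants the full listing can walk the pages in a loop. The sketch below makes two assumptions that this page does not state: that page numbering starts at 1, and that a page containing fewer than per_page models is the last one. fetch_page is a hypothetical callable standing in for the HTTP request, so the loop can be exercised here without network access.

```python
from typing import Callable, Iterator, List


def iter_models(
    fetch_page: Callable[[int, int], List[dict]], per_page: int = 100
) -> Iterator[dict]:
    """Yield models page by page until a short (or empty) page is returned."""
    if not 1 <= per_page <= 1000:
        raise ValueError("per_page must be between 1 and 1000")
    page = 1  # assumption: the first page is page 1
    while True:
        models = fetch_page(page, per_page)
        yield from models
        # Assumption: fewer results than requested means there are no more pages.
        if len(models) < per_page:
            return
        page += 1


# In-memory stand-in for the endpoint, used only to demonstrate the loop.
catalogue = [{"id": f"model-{n}"} for n in range(5)]


def fake_fetch_page(page: int, per_page: int) -> List[dict]:
    start = (page - 1) * per_page
    return catalogue[start:start + per_page]


print([m["id"] for m in iter_models(fake_fetch_page, per_page=2)])
# ['model-0', 'model-1', 'model-2', 'model-3', 'model-4']
```

In real use, fetch_page would issue a GET with the page and per_page query parameters and return the data array of the response.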

Model filters

You can filter by model metadata:

  • license

  • family

  • model_class

  • status

  • languages

  • tasks

  • architectures

  • training

  • capabilities

  • modalities

  • domains

  • creative

  • content_flags

  • parameter_bucket

  • popularity_level

Examples:

/v1/models?capabilities=chat,tool-use

/v1/models?modalities=vision

/v1/models?family=llama3

/v1/models?languages=en,ja

Boolean filters

  • gated: true or false

  • conversational: true or false

Range filters

Use these to filter by context window:

  • context_length_min: Minimum context length in tokens.

  • context_length_max: Maximum context length in tokens.
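Boolean and range filters combine in a single query string. The helper below is a sketch with a hypothetical name (filter_query): it renders booleans as the lowercase true/false values listed above and rejects an inverted range on the client side before any request is made. Whether the server itself rejects such a range is not stated on this page.

```python
from typing import Optional
from urllib.parse import urlencode


def filter_query(
    gated: Optional[bool] = None,
    conversational: Optional[bool] = None,
    context_length_min: Optional[int] = None,
    context_length_max: Optional[int] = None,
) -> str:
    """Build a query string from the boolean and range filters that are set."""
    if (
        context_length_min is not None
        and context_length_max is not None
        and context_length_min > context_length_max
    ):
        raise ValueError("context_length_min exceeds context_length_max")
    params = {}
    for name, flag in (("gated", gated), ("conversational", conversational)):
        if flag is not None:
            # Render booleans as the literal strings true / false.
            params[name] = "true" if flag else "false"
    for name, bound in (
        ("context_length_min", context_length_min),
        ("context_length_max", context_length_max),
    ):
        if bound is not None:
            params[name] = str(bound)
    return urlencode(params)


print(filter_query(gated=False, context_length_min=32768))
# gated=false&context_length_min=32768
```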

Example requests

List models available on your plan:

curl "https://api.featherless.ai/v1/models?available_on_current_plan=true" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $FEATHERLESS_API_KEY"

Find tool-capable chat models with at least 32k context:

curl -H "Authorization: Bearer $FEATHERLESS_API_KEY" \
  "https://api.featherless.ai/v1/models?capabilities=chat,tool-use&context_length_min=32768"

Search for Llama models:

curl -H "Authorization: Bearer $FEATHERLESS_API_KEY" \
  "https://api.featherless.ai/v1/models?q=llama"

Paginate through models sorted by popularity:

curl -H "Authorization: Bearer $FEATHERLESS_API_KEY" \
  "https://api.featherless.ai/v1/models?sort=-popularity&page=1&per_page=100"

Last edited: May 11, 2026