Embeddings
A brief guide to embedding support in Featherless
Featherless supports OpenAI-compatible embeddings requests for models that produce embedding outputs. You can use embeddings for search, similarity, clustering, classification, and retrieval pipelines.
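As an illustration of the search and similarity use cases, the sketch below ranks documents against a query by cosine similarity. It is a minimal example, not part of the Featherless API: the helper names `cosine_similarity` and `rank_by_similarity` are hypothetical, and the 3-dimensional vectors are made-up stand-ins for real embeddings, which are much longer.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


def rank_by_similarity(query_vec, doc_vecs):
    """Return (doc_index, score) pairs, most similar first."""
    scores = [(i, cosine_similarity(query_vec, v)) for i, v in enumerate(doc_vecs)]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)


# Toy vectors standing in for real embeddings (made up for illustration).
query = [1.0, 0.0, 0.0]
docs = [[0.0, 1.0, 0.0], [2.0, 0.0, 0.0], [1.0, 1.0, 0.0]]
ranking = rank_by_similarity(query, docs)
print(ranking)
```

In a real pipeline the query and document vectors would come from the embeddings endpoint, and a vector database or a NumPy matrix product would usually replace the pure-Python loop.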
Important Notes
Embeddings are available through POST https://api.featherless.ai/v1/embeddings.
The input field can be either a single string or an array of strings. If you send multiple strings in one request, Featherless returns one embedding per input in the same order.
To find supported models, use the model catalog and look for models with embedding support:
https://featherless.ai/models?modalities=embedding
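Returning to the batching note: because one embedding comes back per input, each vector can be matched to its source string. The sketch below does this defensively by sorting on the `index` field rather than relying on list position. The helper name `pair_inputs_with_embeddings` is hypothetical, and the sample response is hand-written with made-up vectors; it assumes the response has already been parsed into a plain dict in the standard OpenAI-style shape.

```python
def pair_inputs_with_embeddings(inputs, response):
    """Match each input string to its vector using the `index` field.

    `response` is a dict shaped like an OpenAI-style embeddings response:
    {"data": [{"index": int, "embedding": [...]}, ...], ...}
    """
    items = sorted(response["data"], key=lambda item: item["index"])
    if len(items) != len(inputs):
        raise ValueError("expected one embedding per input")
    return [(text, item["embedding"]) for text, item in zip(inputs, items)]


# Hypothetical response with made-up vectors, deliberately listed out of order
# to show that the helper restores input order.
sample_inputs = ["first", "second"]
sample_response = {
    "object": "list",
    "data": [
        {"object": "embedding", "index": 1, "embedding": [0.3, 0.4]},
        {"object": "embedding", "index": 0, "embedding": [0.1, 0.2]},
    ],
}
pairs = pair_inputs_with_embeddings(sample_inputs, sample_response)
print(pairs)
```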
Quick Start
The simplest way to generate embeddings is through the OpenAI Python SDK pointed at Featherless.
from openai import OpenAI

# Point the OpenAI client at the Featherless API.
client = OpenAI(
    api_key="your-featherless-api-key",
    base_url="https://api.featherless.ai/v1"
)

# Embed a single string.
response = client.embeddings.create(
    model="Qwen/Qwen3-Embedding-8B",
    input="Featherless makes serverless inference simple."
)

# Show the first five values of the vector, then the token usage.
print(response.data[0].embedding[:5])
print(response.usage)
Batch Inputs
You can also send multiple strings in one request and receive one embedding per item.
from openai import OpenAI

client = OpenAI(
    api_key="your-featherless-api-key",
    base_url="https://api.featherless.ai/v1"
)

response = client.embeddings.create(
    model="Qwen/Qwen3-Embedding-8B",
    input=[
        "The cat sat on the mat.",
        "A kitten is resting on a rug.",
        "Server racks need better airflow."
    ]
)

for item in response.data:
    print(item.index, len(item.embedding))
Request Options
Featherless supports the standard OpenAI-style embeddings request shape.
model: The embedding model to use.
input: A string or array of strings to embed.
encoding_format: Optional. The format of the returned vectors, usually float.
dimensions: Optional. The requested length of each output vector. Use only if your chosen model supports it.
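To show how these options fit together, here is a minimal sketch that builds the raw HTTP request without sending it. The helper name `build_embeddings_request` is hypothetical. The `Authorization: Bearer` header is an assumption based on what the OpenAI SDK sends for its `api_key`. Optional fields are included only when set, so a model that does not support `dimensions` never receives that key.

```python
import json
import urllib.request

EMBEDDINGS_URL = "https://api.featherless.ai/v1/embeddings"


def build_embeddings_request(api_key, model, texts, encoding_format=None, dimensions=None):
    """Build (but do not send) a POST request for the embeddings endpoint."""
    payload = {"model": model, "input": texts}
    # Only include optional fields the caller explicitly set.
    if encoding_format is not None:
        payload["encoding_format"] = encoding_format
    if dimensions is not None:
        payload["dimensions"] = dimensions
    return urllib.request.Request(
        EMBEDDINGS_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            # Assumption: Bearer-token auth, as used by the OpenAI SDK.
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


request = build_embeddings_request(
    "your-featherless-api-key",
    "Qwen/Qwen3-Embedding-8B",
    ["The cat sat on the mat.", "A kitten is resting on a rug."],
    encoding_format="float",
)
print(request.get_method(), request.full_url)
print(request.data.decode("utf-8"))
```

Passing the resulting object to `urllib.request.urlopen` would send it. For most applications the OpenAI SDK shown in Quick Start remains the simpler route.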
Response Format
A successful response returns a list of embeddings and usage information.
{
  "object": "list",
  "model": "Qwen/Qwen3-Embedding-8B",
  "data": [
    {
      "object": "embedding",
      "index": 0,
      "embedding": [0.0123, -0.0456, 0.0789]
    }
  ],
  "usage": {
    "prompt_tokens": 8,
    "total_tokens": 8
  }
}
Choosing A Model
Embedding models are separate from chat models. In Featherless, supported embedding models are surfaced as models whose output modality is embedding.
If you want a quick way to inspect support, start in the model catalog: