Getting Started
The Featherless Serverless AI platform.
What is Featherless?
Featherless is a serverless AI inference platform. Our goal is to make all AI models available for serverless inference, and we’ve started with llama-based text generation models (e.g. Llama, Mistral, Qwen).
We provide inference via API to a continually expanding library of open-weight models, including the most popular models for role-playing, creative writing, coding assistance, and more. See here for details on our model catalog and what makes a model compatible.
Quick start
Our API is OpenAI-compatible, meaning any client program that works with OpenAI as an inference provider can be reconfigured to use Featherless with little effort:
Sign up for an account at Featherless.
Get your API key from the dashboard.
Make your first API call:
from openai import OpenAI

# Point the OpenAI client at the Featherless endpoint.
client = OpenAI(
    base_url="https://api.featherless.ai/v1",
    api_key="FEATHERLESS_API_KEY",  # replace with the key from your dashboard
)

response = client.chat.completions.create(
    model="meta-llama/Meta-Llama-3.1-8B-Instruct",
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Hello!"},
    ],
)
print(response.choices[0].message.content)
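The example above uses a placeholder string for the API key. One common alternative is to read the key from an environment variable so it never lands in source control. The following is a minimal sketch of that approach; the helper name `load_api_key` and the choice of `FEATHERLESS_API_KEY` as the variable name are assumptions made for illustration, not something the Featherless platform requires.

```python
import os


def load_api_key(env_var="FEATHERLESS_API_KEY"):
    """Return the API key stored in the given environment variable.

    The variable name is an illustrative convention, not a platform requirement.
    """
    key = os.environ.get(env_var, "").strip()
    if not key:
        raise RuntimeError(f"Set the {env_var} environment variable to your API key")
    return key
```

The returned value can then be passed as `api_key=load_api_key()` when constructing the client.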
You can also make direct requests to our API endpoints (the most important being /completions and /chat/completions) to integrate Featherless into any software application.
import requests

response = requests.post(
    url="https://api.featherless.ai/v1/chat/completions",
    headers={
        "Content-Type": "application/json",
        "Authorization": "Bearer FEATHERLESS_API_KEY",
    },
    json={
        "model": "meta-llama/Meta-Llama-3.1-8B-Instruct",
        "messages": [
            {"role": "system", "content": "You are a helpful assistant."},
            {"role": "user", "content": "Hello! How are you?"},
        ],
    },
)
print(response.json()["choices"][0]["message"]["content"])
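The /completions endpoint mentioned above can be called in much the same way. The sketch below uses only the Python standard library and separates building the request from sending it. It assumes the endpoint accepts the `prompt` and `max_tokens` fields and returns `choices[0].text`, as in the OpenAI completions format; the helper names `build_completion_request` and `complete` are hypothetical.

```python
import json
import urllib.request

BASE_URL = "https://api.featherless.ai/v1"  # base URL used in the examples above


def build_completion_request(prompt, model, api_key, max_tokens=64):
    """Build (but do not send) a POST request for the /completions endpoint.

    Assumes the OpenAI-style request fields `model`, `prompt`, and `max_tokens`.
    """
    payload = {"model": model, "prompt": prompt, "max_tokens": max_tokens}
    return urllib.request.Request(
        url=f"{BASE_URL}/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )


def complete(prompt, model, api_key):
    """Send the request and return the generated text.

    Assumes an OpenAI-style response body with `choices[0].text`.
    """
    request = build_completion_request(prompt, model, api_key)
    with urllib.request.urlopen(request, timeout=60) as response:
        body = json.load(response)
    return body["choices"][0]["text"]
```

Calling `complete("Once upon a time", "meta-llama/Meta-Llama-3.1-8B-Instruct", api_key)` would perform the actual network request.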
We have guides for using Featherless with the most popular client programs (e.g. SillyTavern, TypingMind, Aider), as well as more in-depth documentation on our API endpoints.
Why Choose Featherless?
Featherless is a serverless provider with unique model loading and GPU orchestration abilities that allow us to keep an exceptionally large catalog of models online.
Other providers offer either low-cost access (e.g. OpenRouter, AWS Bedrock) with a limited set of models, or an unlimited range of models (e.g. RunPod) with users managing servers and paying the associated operating costs (e.g. more than $2/hour for enough GPUs to run a 70B model).
Featherless provides the best of both worlds: unmatched model range and variety with serverless pricing.
Provider | Cost | Speed | Choice |
RunPod | ❌ | ✅ | ✅ (thousands) |
Hugging Face Inference | ❌ | ✅ | ✅ (thousands) |
Anthropic | ✅ | ✅ | ❌ (<10 models) |
OpenRouter | ✅ | ✅ | ❌ (~200 models) |
Featherless | ✅ | ✅ | ✅ (thousands) |
Plans
Featherless provides serverless access to models, eliminating the need to manage infrastructure.
Our plans are subscription- and concurrency-based: each allows unlimited monthly requests with a fixed number of concurrent requests. A paid subscription grants access to all models up to a given size.
Featherless offers two consumer plans:
Featherless Basic ($10/month):
Access to all models up to 15B parameters
Unlimited monthly usage (2 concurrent requests)
Private, secure, and anonymous usage - no logs
Featherless Premium ($25/month):
All the benefits of Featherless Basic
Access to all models up to 72B parameters
And one scalable business plan:
Featherless Scale ($75 per scale unit/month):
All the benefits of Featherless Premium
Per unit 2x Premium models or 6x Basic models
Host your own private models from Hugging Face*
*Scale customers can also run inference against private models from a connected Hugging Face account, provided the model uses one of the compatible architectures.
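Because plans cap concurrent requests rather than monthly volume, a client that issues many requests may want to limit how many are in flight at once. The following is a minimal sketch of client-side limiting with a thread pool. The limit of 2 matches the Basic plan figure above; the helper name `run_with_limit` is hypothetical, and how the API responds when the limit is exceeded is not covered here.

```python
import threading
from concurrent.futures import ThreadPoolExecutor

MAX_CONCURRENT = 2  # Basic plan figure quoted above; adjust to match your plan


def run_with_limit(tasks, limit=MAX_CONCURRENT):
    """Run zero-argument callables with at most `limit` executing at once.

    Results are returned in the same order as the input tasks.
    """
    with ThreadPoolExecutor(max_workers=limit) as pool:
        futures = [pool.submit(task) for task in tasks]
        return [future.result() for future in futures]
```

Each task would typically be a small function or `functools.partial` that wraps one API call, so the pool size becomes the upper bound on simultaneous requests from this process.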
Privacy and Logging
Featherless does not log chats, prompts, or completions. For more details, see our complete privacy policy here.
If you have any questions or need further assistance, please join the Featherless AI Discord.
Welcome to the world of Featherless AI, where you can use more models with ease!