Getting Started

The Featherless Serverless AI platform.

What is Featherless?

Featherless is a serverless AI inference platform. Our goal is to make all AI models available for serverless inference, and we’ve started with llama-based text generation models (e.g. Llama, Mistral, Qwen).

We provide inference via API to a continually expanding library of open-weight models, including the most popular models for role-playing, creative writing, coding assistance, and more. See here for details on our model catalog and what makes a model compatible.


Quick start

Our API is OpenAI-compatible, meaning any client program that works with OpenAI as an inference provider can be reconfigured to use Featherless with little effort:

  1. Sign up for an account at Featherless.

  2. Get your API key from the dashboard.

  3. Make your first API call:

OpenAI SDK - Python
from openai import OpenAI

client = OpenAI(
    base_url="https://api.featherless.ai/v1",
    api_key="FEATHERLESS_API_KEY",  # replace with your actual API key
)

# Send a chat request to a model from the catalog
response = client.chat.completions.create(
    model="meta-llama/Meta-Llama-3.1-8B-Instruct",
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Hello!"},
    ],
)

# Print the assistant's reply
print(response.choices[0].message.content)

You can also make direct requests to our API endpoints (the most important being /completions and /chat/completions) to integrate Featherless into any software application.

Featherless API - Python
import requests

response = requests.post(
    url="https://api.featherless.ai/v1/chat/completions",
    headers={
        "Content-Type": "application/json",
        "Authorization": "Bearer FEATHERLESS_API_KEY",  # replace with your actual API key
    },
    json={
        "model": "meta-llama/Meta-Llama-3.1-8B-Instruct",
        "messages": [
            {"role": "system", "content": "You are a helpful assistant."},
            {"role": "user", "content": "Hello! How are you?"}
        ]
    }
)
print(response.json()["choices"][0]["message"]["content"])
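
The example above targets /chat/completions. For the /completions endpoint, a minimal sketch might look like the following. It assumes the endpoint accepts the same fields as the OpenAI completions API (model, prompt, max_tokens) and returns the generated text under choices[0].text; neither is confirmed on this page. The helper name build_completion_request and the FEATHERLESS_API_KEY environment variable are illustrative choices, and only the Python standard library is used.

```python
import json
import os
import urllib.request

BASE_URL = "https://api.featherless.ai/v1"  # base URL taken from the examples above


def build_completion_request(prompt, api_key,
                             model="meta-llama/Meta-Llama-3.1-8B-Instruct",
                             max_tokens=64):
    """Build (but do not send) a POST request for the /completions endpoint.

    Assumption: the payload fields mirror the OpenAI completions API.
    """
    payload = {"model": model, "prompt": prompt, "max_tokens": max_tokens}
    return urllib.request.Request(
        url=f"{BASE_URL}/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )


if __name__ == "__main__":
    # Assumption: the key is stored in an environment variable of this name.
    api_key = os.environ.get("FEATHERLESS_API_KEY")
    if api_key:
        request = build_completion_request("Once upon a time", api_key)
        with urllib.request.urlopen(request) as response:
            body = json.load(response)
        # Assumption: the response follows the OpenAI completions schema.
        print(body["choices"][0]["text"])
```

Separating request construction from sending keeps the API key out of the source file and makes the payload easy to inspect before anything goes over the network.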

We have guides on using Featherless with the most popular client programs (e.g. SillyTavern, TypingMind, Aider), as well as more in-depth documentation on our API endpoints.

Why Choose Featherless?

Featherless is a serverless provider with unique model loading and GPU orchestration abilities that allow us to keep an exceptionally large catalog of models online.

Other providers offer either low-cost access with a limited set of models (e.g. OpenRouter, AWS Bedrock), or an unlimited range of models with users managing servers and bearing the associated operating costs (e.g. RunPod, where sufficient GPUs to run a 70B model cost more than $2/hour).

Featherless provides the best of both worlds: unmatched model range and variety with serverless pricing.

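| Provider               | Cost | Speed | Choice          |
|------------------------|------|-------|-----------------|
| RunPod                 |      |       | ✅ (thousands)  |
| Hugging Face Inference |      |       | ✅ (thousands)  |
| Anthropic              |      |       | ❌ (<10 models) |
| OpenRouter             |      |       | ❌ (~200 models)|
| Featherless            |      |       | ✅ (thousands)  |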

Plans


Featherless provides serverless access to models, eliminating the need to manage infrastructure.

Our plans are subscription- and concurrency-based: each allows unlimited monthly requests with a fixed number of concurrent requests. A paid subscription can access all models up to a given size.

Featherless offers two consumer plans:

  • Featherless Basic ($10/month):

    • Access to all models up to 15B parameters

    • Unlimited monthly usage (2 concurrent requests)

    • Private, secure, and anonymous usage - no logs

  • Featherless Premium ($25/month):

    • All the benefits of Featherless Basic

    • Access to all models up to 72B parameters

And one scalable business plan:

  • Featherless Scale ($75 per scale unit/month):

    • All the benefits of Featherless Premium

    • Per unit 2x Premium models or 6x Basic models

    • Host your own private models from Hugging Face*

*Scale customers can also run inference against private models from a connected Hugging Face account, provided the model uses one of the compatible architectures.
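
Because plans cap the number of concurrent requests rather than the monthly total, a client that sends requests in parallel may want to cap its own concurrency to match. The sketch below shows one way to do that with a thread pool sized to the plan's limit (2 on Featherless Basic, per the list above). The helper run_with_concurrency_limit and the InFlightCounter stand-in are hypothetical names for illustration; no real API call is made, and how the service responds to requests above the limit is not described on this page.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor


def run_with_concurrency_limit(send, prompts, max_concurrent=2):
    """Call send(prompt) for every prompt with at most max_concurrent calls in flight."""
    # A pool with max_concurrent workers can never run more calls than that at once.
    with ThreadPoolExecutor(max_workers=max_concurrent) as pool:
        # pool.map returns results in the same order as the input prompts.
        return list(pool.map(send, prompts))


class InFlightCounter:
    """Hypothetical stand-in for a real API call; records peak concurrency."""

    def __init__(self):
        self._lock = threading.Lock()
        self._current = 0
        self.peak = 0

    def __call__(self, prompt):
        with self._lock:
            self._current += 1
            self.peak = max(self.peak, self._current)
        time.sleep(0.01)  # simulate network latency
        with self._lock:
            self._current -= 1
        return f"reply to {prompt}"


if __name__ == "__main__":
    fake_send = InFlightCounter()
    replies = run_with_concurrency_limit(fake_send, [f"prompt {i}" for i in range(6)])
    print(replies[0], "| peak concurrency:", fake_send.peak)
```

In real use, send would wrap a call such as client.chat.completions.create from the quick start, and max_concurrent would be set to the limit of your plan.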

Privacy and Logging

Featherless does not log chats, prompts, or completions. For more details, see our complete privacy policy here.

If you have any questions or need further assistance, please join the Featherless AI Discord.

Welcome to the world of Featherless AI, where you can use more models with ease!