Instant, unlimited hosting for any llama model on HuggingFace.

Over 4300+ compatible models to choose from. Starting from $10/month. No server needed.

Hero Image

Our Models

We provide inference via API to a continually expanding library of open-weight models, including the most popular models for role-playing, creative writing, coding assistance, and more.

Leaderboard
1
2
3
4
1
2
3
4
1
2
3
4

Ways to Use Featherless

Coding

Coding

Agents

Agents

Chat & Roleplay

Chat & Roleplay

Assistants

Assistants

Creative Writing

Creative Writing

Custom Applications

Custom Applications

Why Featherless?

Featherless is a serverless AI inference provider with unique model loading and GPU orchestration abilities that makes an exceptionally large catalog of models available for users. Other providers either offer low cost of access (e.g. openrouter, AWS bedrock) but with a limited set of models, or an unlimited range of models (e.g. runpod) but with users managing servers and the associated costs of operation (e.g. > $2/hour to run a 70B).

Featherless provides the best of both worlds offering unmatched model range and variety but with serverless pricing.

ProviderCostSpeedChoice
RunPod(thousands)
HuggingFace(thousands)
Anthropic(<10 models)
OpenRouter(~200 models)
Featherless
Grid Background

Simple Pricing + Unlimited Tokens

Feather Basic

$10.00/month
Feature icon
Access to models up to 15B
Feature icon
Up to 2 concurrent connections
Feature icon
Up to 16K context
Feature icon
Regular speed

Feather Premium

$25.00/month
Feature icon
Access any model - no limit on size!
Feature icon
Up to 4 concurrent connections
Feature icon
Up to 16K context
Feature icon
Regular speed

Frequently Asked Questions

What is Featherless?

Featherless is an LLM hosting provider that offers our subscribers access to a continually expanding library of HuggingFace models.
Featherless: Less hassle, less effort. Start now.

Do you log my chat history?

No. We do not log any of the prompts or completions sent to our API.

Which model architectures are supported?

Our goal is to provide serverless inference for all models on Hugging Face. We currently support a wide range of llama models including Llama 2 and 3, Mistral, Qwen and Deep Seek. For more details see https://featherless.ai/docs/model-compatibility.

How do I get models added?

Business customers can deploy models through their dashboard. Users on individual plans can request either on discord or by emailing support@featherless.ai.