Largest AI inference access to 10,000+ open source models

Instantly deploy at scale for fine-tuning, testing, and production with unlimited tokens.

Trusted by AI teams worldwide

Background Light

Explore and access models instantly

We provide inference via API to a continually expanding library of open-weight models, including the most popular models for coding assistance, deep research, creative writing, and more.

Leaderboard
1
2
3
1
2
3
1
2
3

How people use Featherless

OpenHands

OpenHands

OpenHands is an open source AI software development platform to streamline sofware development by automating coding tasks using intelligent agents. Developers can now focus on more complex challenges teaming up with AI supported by Featherless. See how to get started in this guide.

Novelcrafter

Novelcrafter

NovelCrafter is an AI-powered writing platform designed to assist authors throughout the entire novel-writing process, from initial brainstorming to final edits. You can level up your creative writing with any model from Featherless extensive catalog, from ones that are known for poetic prose to specialized ones in dialogue or vast world knowledge.

WyvernChat

WyvernChat

WyvernChat is a user-first AI chat app with sleek UX and consistent content policy. Finding the right model isn't simply a technical choice; it's giving life to your character within unique identity and personality. Featherless has built-in support into WyvernChat so you can make use of our growing catalog of open source models for your favorite characters and creative writing.

LangChain

LangChain

LangChain is one of the most widely adopted libraries that offer developers powerful tools to manage complex prompts and conversational state. With our OpenAI SDK compatibility you can power your applications with Featherless and our catalog of open models. See the docs for LangChain and LiteLLM.

Why Featherless?

Featherless is a serverless inference provider offering advanced model loading and GPU orchestration capabilities. Access our extensive catalog of thousands of models without the burden of server management or operational overhead. Our transparent billing structure is predictable, ensuring no unexpected costs.

ProviderCostSpeedChoice
RunPod(thousands)
HuggingFace(thousands)
Anthropic(<10 models)
OpenRouter(~200 models)
Featherless
Grid Background

Flat pricing with unlimited tokens

Feather Basic

$10.00/month
Feature icon
Access to models up to 15B
Feature icon
Up to 2 concurrent connections
Feature icon
Up to 16K context
Feature icon
Regular speed

Feather Premium

$25.00/month
Feature icon
Access any model - no limit on size!
Feature icon
Up to 4 concurrent connections
Feature icon
Up to 16K context
Feature icon
Regular speed

Feather Scale

$75.00/month
Business plan that can scale to arbitrarily many concurrent connections
Each scale unit allows for:
8 concurrent requests to models less than or equal to 15B, or
4 concurrent requests to models less than or equal to 34B, or
2 concurrent requests to models less than or equal to 72B, or
a linear combination of the above
Private, secure, and anonymous usage - no logs
Concurrency is scaled based on quantity of the selected plan. DeepSeek R1 and V3 currently excluded

How many concurrencies do you need?

2
50
2

For enterprise, you can run your own catalog on us from your cloud with reduced GPU.

See Details.

Frequently Asked Questions

What is Featherless?

Featherless is an LLM hosting provider that offers our subscribers access to a continually expanding library of HuggingFace models.
Featherless: Less hassle, less effort. Start now.

Do you log my chat history?

No. We do not log any of the prompts or completions sent to our API.

Which model architectures are supported?

Our goal is to provide serverless inference for all models on Hugging Face. We currently support a wide range of llama models including Llama 2 and 3, Mistral, Qwen and Deep Seek. For more details see https://featherless.ai/docs/model-compatibility.

How do I get models added?

Business customers can deploy models through their dashboard. Users on individual plans can request either on discord or by emailing [email protected].