Plans

Explaining how our different subscription tiers work.

Featherless provides serverless access to models as well as agent runtimes, enabling you to power AI applications without needing to manage infrastructure.

Our plans are subscription-based and built around concurrent units: each plan allows unlimited monthly requests with a fixed number of concurrent units. Subscription tiers differ in model size limits, context sizes, and the number of concurrent units offered.
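Because a plan fixes the number of concurrent units rather than the number of requests, a client usually wants to cap how many requests it has in flight at once. The following is a minimal sketch of that idea, not an official client. It assumes each request consumes exactly one concurrent unit (which may not hold for larger models), and `call_model` and `PLAN_CONCURRENCY` are hypothetical names introduced only for illustration.

```python
import asyncio

# Assumed limit for illustration; use the value that applies to your plan.
PLAN_CONCURRENCY = 4


async def call_model(prompt: str) -> str:
    """Hypothetical stand-in for a real inference request."""
    await asyncio.sleep(0.01)  # simulate network latency
    return f"response to {prompt!r}"


async def run_all(prompts, limit=PLAN_CONCURRENCY):
    """Send every prompt while keeping at most `limit` requests in flight."""
    semaphore = asyncio.Semaphore(limit)
    in_flight = 0
    peak = 0  # highest number of simultaneous requests observed

    async def guarded(prompt):
        nonlocal in_flight, peak
        async with semaphore:  # waits here once `limit` requests are active
            in_flight += 1
            peak = max(peak, in_flight)
            try:
                return await call_model(prompt)
            finally:
                in_flight -= 1

    results = await asyncio.gather(*(guarded(p) for p in prompts))
    return results, peak
```

Calling `asyncio.run(run_all(prompts))` returns the responses in input order together with the peak number of simultaneous requests, which stays at or below the chosen limit however many prompts are queued.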

Featherless offers consumer plans in two groups: smaller plans for interactive chat with assistants and role-playing, and two larger plans for agentic inference and coding.

Consumer Plans

*Plan*	*Tier*	*Price (/month)*	*Features*
Featherless Premium	Chat	$25	Private, secure, and anonymous usage (no logs); access to *any* model in the catalogue (including Kimi & GLM); 4 concurrent units*
Featherless Token-Based Business	Agentic	$25+	Private, secure, and anonymous usage (no logs); credit-based monthly billing that scales with usage; 8 concurrent units*; context size up to 256K; one agent sandbox
Featherless GPU-Based Business	Agentic	Custom	Private, secure, and anonymous usage (no logs); access to *any* model in the catalogue (including DeepSeek, Kimi & GLM); 8 concurrent units*; context size up to 256K; one agent sandbox

*Smaller models allow for higher concurrency than larger models. See more below.
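The footnote's point, that the same plan supports more simultaneous requests on smaller models, can be illustrated with simple arithmetic. The sketch below assumes a request to a given model consumes a fixed whole number of concurrent units. The per-model costs shown are hypothetical placeholders, not published figures, so consult the Concurrent Unit Limits page for the real values.

```python
# Hypothetical unit costs per request, for illustration only.
HYPOTHETICAL_UNIT_COST = {"small": 1, "medium": 2, "large": 4}


def max_parallel_requests(plan_units: int, units_per_request: int) -> int:
    """Return how many requests fit in a plan's concurrent-unit budget."""
    if units_per_request <= 0:
        raise ValueError("units_per_request must be positive")
    return plan_units // units_per_request


def capacity_by_model(plan_units: int) -> dict:
    """Map each hypothetical model size to its parallel request capacity."""
    return {
        size: max_parallel_requests(plan_units, cost)
        for size, cost in HYPOTHETICAL_UNIT_COST.items()
    }
```

Under these assumed costs, a plan with 4 concurrent units would allow four parallel requests to a "small" model but only one to a "large" model.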

Business Plans

Business plans are scalable: users can purchase larger amounts of inference to power production applications, whether agent fleets or other AI applications.

*Plan*	*Price (/unit/month)*	*Features*
Feather Token-Based API	$25+	Private, secure, and anonymous usage (no logs); credit-based monthly billing with prepaid tiers; 8 concurrent units*; context size up to 256K; one sandbox per unit
Feather GPU-Based	Custom	Private, secure, and anonymous usage (no logs); access to *any* model in the catalogue (including DeepSeek, Kimi & GLM); 8 concurrent units*; context size up to 256K; one sandbox per unit
Feather Per-Request	$25+	Credit-based API access with monthly prepaid credits; pay per successful request based on model price and token usage; no model size limit; 100 concurrent units; see Request Pricing and Credits for billing details

*For more information on how concurrent unit limits work, see:

Concurrent Unit Limits

Explaining how subscription tiers translate to concurrent inference call maximums.

Last edited: May 7, 2026