Run any Hugging Face model via API
No GPU Setup required




The Fastest way to run Hugging Face models in Production



Access thousands of open-source models from a single API. Every hugging face trending model without setup or hosting.
Copy the Model ID and paste it into the code
You're ready to make your first API call using Featherless
Flat pricing with unlimited tokens
- Access to models up to 15B
- Up to 2 concurrent connections
- Up to 16K context
- Access to DeepSeek, Kimi and GLM
- Access any model - no limit on size!
- Up to 4 concurrent connections
- Up to 32K context
- Access any model up to 229B
- Upto 8 concurrent connections
- Up to 256K context
- 1 agent runtime
- Standard sandbox environment
- Persistent Storage
- Access any model - no limit on size!
- Upto 8 concurrent connections
- Up to 256K context
- 1 agent runtime
- Larger sandbox environment
- Persistent Storage