One API key. Instant access.

Freedom to reliably deploy any open model effortlessly.

Get API Key Browse 40,000+ models

Discover your perfect model

Test any of 40,000+ models effortlessly.

Drag

Top RP & Creative Writing

862B

Deepseek 4

deepseek-ai/DeepSeek-V4-Pro

Flagship RP & writing

2.8M

Top RP & Creative Writing

685B

Deepseek 3.2

deepseek-ai/DeepSeek-V3.2

Immersive roleplay

2.6M

Top RP & Creative Writing

Top RP & Creative Writing

12B

Mistral Nemo

Nitral-AI/Captain-Eris_Violet-V0.420-12B

Roleplay character chat

100

Top RP & Creative Writing

12B

Mistral Nemo

FallenMerick/MN-Violet-Lotus-12B

Roleplay storytelling

456

Top RP & Creative Writing

70B

Llama 3.1

TheDrummer/Anubis-70B-v1.2

Long-form roleplay

Top RP & Creative Writing

70B

Llama 3.3

ReadyArt/L3.3-The-Omega-Directive-70B-Unslop-v2.1

Unfiltered roleplay

198

Top RP & Creative Writing

70B

Llama 3.3

Doctor-Shotgun/L3.3-70B-Magnum-Diamond

Prose-focused writing

Top RP & Creative Writing

31B

Gemma 4

google/gemma-4-31B-it

Creative writing & chat

7.7M

Top RP & Creative Writing

12B

Mistral Nemo

TheDrummer/Rocinante-12B-v1.1

Roleplay & storytelling

deepseek-ai/DeepSeek-V4-Pro

Agentic coding flagship

Qwen/Qwen3-Coder-30B-A3B-Instruct

Repo-scale coding agent

Qwen/Qwen3-Coder-Next

deepseek-ai/DeepSeek-V4-Flash

arcee-ai/Trinity-Large-Thinking

huihui-ai/Qwen2.5-72B-Instruct-abliterated

tachyphylaxis/Meta-Llama-3.1-Chat-Uncensored

Qwen/Qwen2.5-72B-Instruct

Reasoning & agents MoE

deepseek-ai/DeepSeek-V3.2

Open-weight reasoning

Qwen/Qwen3-Next-80B-A3B-Instruct

Qwen/Qwen3-VL-8B-Instruct

Vision-language model

Open-weight reasoning

deepseek-ai/DeepSeek-V3-0324

mistralai/Mistral-Small-3.2-24B-Instruct-2506

Nanbeige/Nanbeige4.1-3B

Qwen/Qwen2.5-7B-Instruct

google/gemma-4-26B-A4B-it

meta-llama/Llama-3.1-8B-Instruct

deepseek-ai/DeepSeek-R1-0528

deepseek-ai/DeepSeek-R1-Distill-Qwen-32B

deepseek-ai/DeepSeek-R1-Distill-Llama-8B

Qwen/Qwen2.5-Math-7B-Instruct

Qwen/Qwen2.5-Math-72B-Instruct

NousResearch/Hermes-4-14B

agentica-org/DeepScaleR-1.5B-Preview

deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B

NousResearch/DeepHermes-3-Llama-3-8B-Preview

Qwen/Qwen2.5-0.5B-Instruct

Ultra-compact instruct

meta-llama/Llama-3.2-1B-Instruct

Compact Gemma instruct

1.6M

Top Small Models

TinyLlama

TinyLlama/TinyLlama-1.1B-Chat-v1.0

Qwen/Qwen2.5-1.5B-Instruct

Qwen/Qwen2-1.5B-Instruct

Qwen/Qwen2.5-Coder-1.5B-Instruct

google/gemma-4-E2B-it

EssentialAI/eai-distill-0.5b

Small distilled model

1.4M

Top Language Specific

Llama 2

elyza/ELYZA-japanese-Llama-2-7b-instruct

Japanese chat model

3,359

Top Language Specific

7.6B

Qwen 2

openthaigpt/openthaigpt1.5-7b-instruct

Thai chat model

4,387

Top Language Specific

Llama 3

MLP-KTLim/llama-3-Korean-Bllossom-8B

Korean-English chat

388

3,888

Top Language Specific

Llama 3

MohamedRashad/Arabic-Orpo-Llama-3-8B-Instruct

Arabic chat model

3,953

Top Language Specific

Mistral

FredrikBL/SwedishBeagle-dare

Swedish NLP model

Top Language Specific

Llama 3

IlyaGusev/saiga_llama3_8b

Russian chat model

137

398k

Top Language Specific

Gemma 3

CraneAILabs/swahili-gemma-1b

Swahili translation

337

Top Language Specific

Gemma

ytu-ce-cosmos/Turkish-Gemma-9b-T1

Turkish chat model

178

774

Top Language Specific

Qwen 2

SeaLLMs/SeaLLMs-v3-1.5B-Chat

SE Asian multilingual

View the model library

One API. Every model. No surprises.

Any model, zero friction

Built for reliability and performance

Costs you can actually predict

Access thousands of open-source models from a single API. Every hugging face trending model without setup or hosting.

Low latency, dependable uptime, and architecture designed for real workloads.

Flat-rate pricing designed for scale with unlimited tokens.

Predictable pricing that scales with you

Chat

Built for interactive chat with unlimited tokens.

$25/month

Context size up to 32K
4 concurrent units

Billed monthly. Cancel anytime.

developer

Build production Al with the fastest usage-based inference.

$50/month

Context size up to 256K
1 agent environment included
Fastest response times
Unused credits roll over
Billed per token

Billed monthly. Cancel anytime.

business

Dedicated GPUs and the team to run them.

Custom

Dedicated H100, MI325, B200 & B300 GPUs
Engineering team included
Gets cheaper over time with fine-tuning
Burst & failover to Public Cloud

Talk to an Engineer

Annual contracts. Volume pricing.

THE FUTURE OF AI IS OPEN

Built by researchers contributing to RWKV a Linux Foundation project.

Thank you! Your submission has been received!

Oops! Something went wrong while submitting the form.

Model Releases

Jul 29, 2026

Kimi K3 is live on Featherless

Benchmarks

Jul 24, 2026

Best Uncensored AI Models in 2026: Run These Top Open LLMs Today

Jul 16, 2026

GLM 5.2 vs Opus for Agent Workloads (Reduce cost from $150K to $7.6K with AMD GPUs)

Benchmarks

Jun 18, 2026

Official GLM-5.2 coding benchmarks.

AI you can count on.
From prototype to production.

Get started

One API key. Instant access.

Test any of 40,000+ models effortlessly.

One API. Every model. No surprises.

Predictable pricing that scales with you

Built by researchers contributing to RWKV a Linux Foundation project.

Kimi K3 is live on Featherless

Best Uncensored AI Models in 2026: Run These Top Open LLMs Today

GLM 5.2 vs Opus for Agent Workloads (Reduce cost from $150K to $7.6K with AMD GPUs)

Official GLM-5.2 coding benchmarks.

AI you can count on. From prototype to production.

AI you can count on.
From prototype to production.