
Concurrency

Understanding Your Model Access: Simple, Flexible, Powerful

Model weights determine your concurrent connections. Learn how to maximize your plan's potential.

Concurrency - Plan Comparison

Model Weights

Each model has an assigned "weight" based on its size:

  • Models ≤15B parameters: Weight 1

  • Models 24-32B parameters: Weight 2

  • Models 70B parameters: Weight 4

  • DeepSeek-R1 model:

    • Premium tier: Weight 4
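
The size tiers above can be sketched as a small lookup. This is only an illustration: `model_weight` is a hypothetical helper, not part of any Featherless SDK or API. It assumes that every model of 70B parameters or more has weight 4, as the plan descriptions suggest, and it does not cover the DeepSeek-R1 special case. Sizes the list does not mention (for example, between 15B and 24B) raise an error rather than guess.

```python
def model_weight(params_billions: float) -> int:
    """Return the concurrency weight for a model of the given size.

    Hypothetical helper based only on the tiers listed in this document.
    """
    if params_billions <= 15:
        return 1
    if 24 <= params_billions <= 32:
        return 2
    if params_billions >= 70:  # assumption: 70B and larger all count as weight 4
        return 4
    # The document does not assign a weight to sizes in the gaps between tiers.
    raise ValueError(f"no weight listed for a {params_billions}B model")


print(model_weight(7))   # Qwen2.5-7B  -> 1
print(model_weight(32))  # Qwen2.5-32B -> 2
print(model_weight(72))  # Qwen2.5-72B -> 4
```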


Feather Basic

  • Concurrency Units: 1

  • What You Can Run:

    • 1 concurrent request to any model ≤15B

Feather Premium

  • Concurrency Units: 4

  • What You Can Run:

    • 4 concurrent requests to models ≤15B, OR

    • 2 concurrent requests to models ≤34B, OR

    • 1 concurrent request to any model ≥70B, OR

    • Any combination that adds up to 4 weight units

Feather Scale

  • Concurrency Units: 8 per unit (the figures below are for a single unit)

  • What You Can Run:

    • 8 concurrent requests to models ≤15B, OR

    • 4 concurrent requests to models ≤34B, OR

    • 2 concurrent requests to models ≥70B, OR

    • Any combination that adds up to 8 weight units


Examples

Feather Premium Examples

  1. Example 1: Run 4 simultaneous requests to Qwen2.5-7B (≤15B)

  2. Example 2: Run 2 simultaneous requests to Qwen2.5-32B (24-32B)

  3. Example 3: Run 1 simultaneous request to Qwen2.5-72B (≥70B)

  4. Example 4 (Mixed): Run 1 Qwen2.5-32B (weight 2) + 2 Qwen2.5-7B (weight 1 each) = Total weight 4

Feather Scale Examples

  1. Example 1: Run 8 simultaneous requests to Qwen2.5-7B (≤15B)

  2. Example 2: Run 4 simultaneous requests to Qwen2.5-32B (24-32B)

  3. Example 3: Run 2 simultaneous requests to Qwen2.5-72B (≥70B)

  4. Example 4 (Mixed): Run 1 Qwen2.5-72B (weight 4) + 2 Qwen2.5-32B (weight 2 each) = Total weight 8


Linear Combinations

Your total concurrency usage is the sum of the model weights of all requests you're running simultaneously; two requests to the same model count twice. For example:

  • 1 model with weight 2 + 2 models with weight 1 = Total weight 4

  • 1 model with weight 4 + 2 models with weight 2 = Total weight 8
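
The arithmetic above can be checked with a few lines of Python. This is a minimal sketch: `PLAN_UNITS`, `total_weight`, and `fits_plan` are hypothetical names chosen for illustration, and Feather Scale is assumed to be a single unit (8 concurrency units).

```python
# Concurrency units per plan, taken from the plan descriptions in this document.
# Assumption: Feather Scale is shown for one unit only.
PLAN_UNITS = {
    "Feather Basic": 1,
    "Feather Premium": 4,
    "Feather Scale": 8,
}


def total_weight(weights: list[int]) -> int:
    """Sum the weights of all requests that run at the same time."""
    return sum(weights)


def fits_plan(plan: str, weights: list[int]) -> bool:
    """Return True if the combined weight stays within the plan's units."""
    return total_weight(weights) <= PLAN_UNITS[plan]


premium_mix = [2, 1, 1]  # 1 model with weight 2 + 2 models with weight 1
scale_mix = [4, 2, 2]    # 1 model with weight 4 + 2 models with weight 2

print(total_weight(premium_mix), fits_plan("Feather Premium", premium_mix))  # 4 True
print(total_weight(scale_mix), fits_plan("Feather Scale", scale_mix))        # 8 True
print(fits_plan("Feather Premium", scale_mix))                               # False
```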

Make sure your total concurrency usage doesn't exceed your plan's limit to avoid request failures.
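
One way to stay under the limit is to track the weight in use on the client side and hold back new requests until capacity frees up. The sketch below is an assumption about how you might do this in your own code, not a description of how the service enforces limits; `WeightBudget` is a hypothetical class, and the actual request call is left out.

```python
import threading


class WeightBudget:
    """Client-side tracker for concurrency units in use (hypothetical helper)."""

    def __init__(self, limit: int) -> None:
        self.limit = limit
        self.in_use = 0
        self._cond = threading.Condition()

    def try_acquire(self, weight: int) -> bool:
        """Reserve `weight` units if they are free; never blocks."""
        with self._cond:
            if self.in_use + weight > self.limit:
                return False
            self.in_use += weight
            return True

    def acquire(self, weight: int) -> None:
        """Block until `weight` units are free, then reserve them."""
        if weight > self.limit:
            raise ValueError("weight exceeds the plan limit and can never fit")
        with self._cond:
            while self.in_use + weight > self.limit:
                self._cond.wait()
            self.in_use += weight

    def release(self, weight: int) -> None:
        """Return `weight` units once the request has finished."""
        with self._cond:
            if weight > self.in_use:
                raise ValueError("releasing more weight than is in use")
            self.in_use -= weight
            self._cond.notify_all()


budget = WeightBudget(limit=4)  # e.g. Feather Premium
budget.acquire(2)               # reserve units for a weight-2 model
try:
    pass                        # send the request here
finally:
    budget.release(2)           # always free the units, even on error
```

Wrapping each request in `acquire` and `release` means a burst of work queues locally instead of exceeding the plan's concurrency units.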