API rate limits

Understand Scalekit API rate limits, tell them apart from upstream provider limits, handle 429 responses, and request a higher limit.

Scalekit applies a rate limit to the API requests from each environment. When a workload exceeds that limit, the API responds with HTTP 429 Too Many Requests.

Scalekit enforces a per-environment request rate, measured in requests per minute. Scalekit tunes the limit per account, so high-throughput workloads can need a higher limit than the default. Routing MCP tool calls through Scalekit on top of authentication traffic is one example of a workload that can need more headroom.

When you exceed the limit, Scalekit returns HTTP 429 Too Many Requests. Do not retry immediately. Retry with exponential backoff, increasing the wait after each failed attempt.
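The backoff advice above can be sketched as a small retry helper. This is a minimal sketch and not part of the Scalekit SDK: `retryOnRateLimit`, `backoffDelayMs`, the `BackoffOptions` fields, and the default values (500 ms base, 30 s cap, 5 retries) are assumptions chosen for illustration. It uses the common "full jitter" variant of exponential backoff.

```typescript
// Hypothetical helper names and defaults; none of this ships with the Scalekit SDK.
interface BackoffOptions {
  baseMs: number;     // ceiling for the delay before the first retry
  capMs: number;      // upper bound on the ceiling for any single delay
  maxRetries: number; // retries allowed after the initial attempt
}

// Full-jitter delay: a random value in [0, min(capMs, baseMs * 2^attempt)).
function backoffDelayMs(
  attempt: number,
  opts: BackoffOptions,
  random: () => number = Math.random,
): number {
  const ceiling = Math.min(opts.capMs, opts.baseMs * 2 ** attempt);
  return Math.floor(random() * ceiling);
}

const sleep = (ms: number): Promise<void> =>
  new Promise<void>((resolve) => setTimeout(resolve, ms));

// Runs `call`, retrying only when `isRateLimited` says the failure was a 429.
async function retryOnRateLimit<T>(
  call: () => Promise<T>,
  isRateLimited: (error: unknown) => boolean,
  opts: BackoffOptions = { baseMs: 500, capMs: 30000, maxRetries: 5 },
): Promise<T> {
  for (let attempt = 0; ; attempt++) {
    try {
      return await call();
    } catch (error) {
      // Give up on non-rate-limit errors or once the retry budget is spent.
      if (!isRateLimited(error) || attempt >= opts.maxRetries) throw error;
      await sleep(backoffDelayMs(attempt, opts));
    }
  }
}
```

A caller would pass its own SDK call plus a predicate that recognizes a 429, for example a check against the SDK's rate-limit exception. The random jitter spreads retries out so that clients throttled at the same moment do not all retry together.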

Tell Scalekit limits apart from provider limits

When you call tools through Scalekit, a 429 can come from either Scalekit or the upstream provider that Scalekit calls on your behalf, such as a CRM or data API. The error_code field on the error identifies the source:

| error_code | Source | What to do |
| --- | --- | --- |
| RATE_LIMITED | Scalekit’s own rate limit | Reduce the overall request frequency and back off before retrying. |
| TOOL_ERROR | The upstream provider rate-limited the tool call | Apply the provider’s recommended backoff. Check the tool call logs for the provider’s message. |

Review the detailed error in your dashboard’s tool call logs to confirm which provider and which tool triggered the limit.

Every Scalekit SDK raises a dedicated exception when the API returns 429. Catch it, read the error code to determine the source, and back off before retrying.

rate-limit-handling.ts
import { ScalekitTooManyRequestsException } from '@scalekit-sdk/node';

try {
  // Your Scalekit SDK call, for example executing a tool
  await scalekit.tools.executeTool(/* ... */);
} catch (error) {
  if (error instanceof ScalekitTooManyRequestsException) {
    // errorCode identifies the source of the 429 so you can back off correctly
    if (error.errorCode === 'TOOL_ERROR') {
      // Upstream provider rate-limited the call: apply provider-specific backoff
      console.error('Provider rate limit:', error.message);
    } else {
      // Scalekit's own rate limit (RATE_LIMITED): reduce overall request frequency
      console.error('Scalekit rate limit:', error.message);
    }
  }
}
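For the RATE_LIMITED case, one way to reduce overall request frequency is to pace requests on the client before they reach Scalekit. The following is a rough sketch of a sliding one-minute window limiter. `MinuteWindowLimiter` and `tryAcquire` are hypothetical names, not SDK features, and the budget you pass in is your own estimate, since this page does not state a numeric default limit.

```typescript
// Hypothetical client-side limiter; Scalekit does not provide this class.
class MinuteWindowLimiter {
  private readonly maxPerMinute: number;
  private readonly now: () => number;
  private timestamps: number[] = []; // send times within the last minute, oldest first

  constructor(maxPerMinute: number, now: () => number = Date.now) {
    this.maxPerMinute = maxPerMinute;
    this.now = now;
  }

  // Returns 0 and records the request if it may be sent now;
  // otherwise returns how many milliseconds to wait before trying again.
  tryAcquire(): number {
    const t = this.now();
    const windowStart = t - 60000;
    // Forget requests that have aged out of the one-minute window.
    while (this.timestamps.length > 0 && this.timestamps[0] <= windowStart) {
      this.timestamps.shift();
    }
    if (this.timestamps.length < this.maxPerMinute) {
      this.timestamps.push(t);
      return 0;
    }
    // The oldest recorded request leaves the window at timestamps[0] + 60000.
    return this.timestamps[0] + 60000 - t;
  }
}
```

Before each SDK call, a worker would call `tryAcquire()` and sleep for the returned number of milliseconds when it is non-zero. The limiter only counts traffic from the process that owns it, so several workers sharing one environment would each need a share of the budget.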

If your workload needs a higher limit, contact Scalekit support with your account details and your expected peak requests per minute. Plan ahead before you route additional traffic, such as MCP tool calls, through Scalekit. Scalekit reviews the request and adjusts the limit for your account.

When you estimate the limit you need, include headroom above your current peak so that normal spikes do not trigger 429 responses.
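The estimate can be worked out from your own request logs. This sketch counts requests per one-minute bucket to find the observed peak, then adds headroom. The function names and the 25% default headroom are illustrative assumptions, not Scalekit recommendations.

```typescript
// Hypothetical helpers for sizing a limit request from logged request times.

// Largest number of requests seen in any fixed one-minute bucket.
function peakRequestsPerMinute(timestampsMs: number[]): number {
  const counts = new Map<number, number>();
  let peak = 0;
  for (const t of timestampsMs) {
    const bucket = Math.floor(t / 60000);
    const n = (counts.get(bucket) ?? 0) + 1;
    counts.set(bucket, n);
    if (n > peak) peak = n;
  }
  return peak;
}

// Peak plus a fractional headroom, rounded up to a whole request count.
// The 0.25 default is an assumption; pick a value that matches your traffic spikes.
function limitWithHeadroom(peakRpm: number, headroom: number = 0.25): number {
  return Math.ceil(peakRpm * (1 + headroom));
}
```

For example, an observed peak of 800 requests per minute with 25% headroom gives a request for 1,000 requests per minute. Fixed buckets can understate a burst that straddles a minute boundary, so treat the result as a lower bound.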