Rate Limits
The API enforces per-key and per-organization request limits to keep the service responsive. Every response includes rate limit headers so your integration always knows where it stands.
Request rate limits
Limits are enforced in a 60-second fixed window, applied both per API key and per organization. If either limit is reached, the request receives a 429 response.
| Route type | Per key | Per organization | Applies to |
|---|---|---|---|
| Read | 120 / min | 360 / min | jobs, credits |
| Create | 60 / min | 180 / min | classify |
| Scan | 20 / min | 60 / min | scan, scan/lite, scan/deep (same numeric policy per path; each path has its own burst bucket) |
These are default launch limits. If you need higher throughput for a production workload, contact support@on-page.ai.
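To stay under these limits without relying on 429s, a client can keep its own count per route type. The following is a minimal sketch, assuming the default per-key numbers from the table; `FixedWindowBudget` and `PER_KEY_LIMITS` are hypothetical names, and the client's window will not line up exactly with the server's, so the X-RateLimit-* headers remain the source of truth.

```python
import time

# Default per-key limits from the table above (requests per 60-second window).
PER_KEY_LIMITS = {"read": 120, "create": 60, "scan": 20}
WINDOW_SECONDS = 60


class FixedWindowBudget:
    """Client-side counter that mirrors a 60-second fixed window (hypothetical helper)."""

    def __init__(self, limit, clock=time.time):
        self.limit = limit
        self.clock = clock          # injectable so the logic can be tested
        self.window_start = None
        self.used = 0

    def try_acquire(self):
        """Return 0.0 if a request may be sent now, else the seconds to wait."""
        now = self.clock()
        # Start a fresh window on first use or once the current one has elapsed.
        if self.window_start is None or now - self.window_start >= WINDOW_SECONDS:
            self.window_start = now
            self.used = 0
        if self.used < self.limit:
            self.used += 1
            return 0.0
        return self.window_start + WINDOW_SECONDS - now
```

A caller would create one budget per route type, for example `FixedWindowBudget(PER_KEY_LIMITS["scan"])`, and sleep for the returned number of seconds whenever it is non-zero.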
Per-org job concurrency
Counts apply to the combined total across classify and scan jobs, so one org can't saturate the worker pool with any single job type.
Concurrent active jobs per org: 5
Up to 5 jobs can run simultaneously for your organization, across any combination of classify / scan. Additional submissions queue and start as slots free up. Hit the cap and you get a 429 with ORG_ACTIVE_LIMIT_REACHED.
Max queued jobs per org: 100
If your organization already has 100 jobs waiting, new submissions are rejected with ORG_QUEUED_LIMIT_REACHED until the queue drains.
What happens under heavy load
When the global scan queue is saturated, new requests may receive a 429 with error code SCAN_QUEUE_SATURATED. The Retry-After header tells you how long to wait. This is rare — it only occurs when the entire system is under exceptional load, not just your organization.
Response headers
X-RateLimit-* appears on every authenticated response so you can track your remaining budget in real time. Retry-After and X-Concurrency-* are conditional — present only on 429 responses (see the per-header notes below).
| Header | Description |
|---|---|
| X-RateLimit-Limit | Maximum requests allowed in the current window. |
| X-RateLimit-Remaining | Requests remaining before the limit resets. |
| X-RateLimit-Reset | Unix timestamp (seconds) when the current window resets. |
| Retry-After | Seconds to wait before retrying. Present on 429 responses when the limiter computed one. |
| X-Concurrency-Limit | The concurrency cap that was hit. Present on 429 responses with ORG_ACTIVE_LIMIT_REACHED or ORG_QUEUED_LIMIT_REACHED. |
| X-Concurrency-Remaining | Remaining in-flight capacity. Always 0 when the caller has just hit the cap. Retry after jobs drain (observe via webhooks or by polling GET /v1/jobs/:id). |
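As an illustration of reading these headers, the sketch below turns one response's headers into a small summary. It assumes the values are plain numeric strings as described above; `read_rate_limit` is a hypothetical helper, not part of any official client, and headers that are absent or unparseable come back as `None`.

```python
import time


def read_rate_limit(headers, now=None):
    """Summarize the rate limit headers of one response.

    `headers` is any mapping of header name to string value.
    """
    now = time.time() if now is None else now
    # HTTP header names are case-insensitive, so normalize before lookup.
    lowered = {name.lower(): value for name, value in headers.items()}

    def as_number(name):
        try:
            return float(lowered[name])
        except (KeyError, ValueError):
            return None

    reset_at = as_number("x-ratelimit-reset")
    return {
        "limit": as_number("x-ratelimit-limit"),
        "remaining": as_number("x-ratelimit-remaining"),
        # X-RateLimit-Reset is a Unix timestamp, so convert it to a delay.
        "seconds_until_reset": None if reset_at is None else max(0.0, reset_at - now),
        # The next three are only expected on 429 responses.
        "retry_after": as_number("retry-after"),
        "concurrency_limit": as_number("x-concurrency-limit"),
        "concurrency_remaining": as_number("x-concurrency-remaining"),
    }
```

A client might check `remaining` after each call and pause for `seconds_until_reset` when it reaches zero, rather than waiting for a 429.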
Handling 429 responses
A 429 means you've hit a limit. Here's what to do:
Always respect Retry-After
The header tells you exactly how long to wait. Don't guess — use the value.
Use idempotency keys on create routes
Pass an Idempotency-Key header so retries are safe and don't create duplicate jobs.
Back off on repeated 429s
If you see multiple 429s in a row, reduce your request rate rather than retrying immediately.
Poll existing jobs instead of resubmitting
If a scan was already submitted, poll its status with GET /v1/jobs/:id instead of submitting a new one.
Scans can take 30 seconds to 3 minutes
Don't treat slow scans as failures. Poll at reasonable intervals (every 3-5 seconds) and let the system work.
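The first three recommendations above can be combined in a small retry loop. This is a sketch under stated assumptions: `send` is a hypothetical callable supplied by the caller that performs the HTTP request and returns `(status, response_headers, body)`, and the exponential backoff with jitter used when Retry-After is absent is a common fallback chosen here, not documented behavior of the API.

```python
import random
import time
import uuid


def retry_delay(headers, attempt, base=1.0, cap=60.0):
    """Seconds to wait before the next attempt.

    Uses Retry-After when present; otherwise falls back to capped
    exponential backoff with full jitter (the fallback is an assumption).
    """
    lowered = {name.lower(): value for name, value in headers.items()}
    value = lowered.get("retry-after")
    if value is not None:
        try:
            return max(0.0, float(value))
        except ValueError:
            pass  # unparseable value: fall through to backoff
    return random.uniform(0, min(cap, base * 2 ** attempt))


def submit_with_retry(send, payload, max_attempts=5, sleep=time.sleep):
    """Call `send(payload, headers)` until it stops returning 429.

    `send` is a hypothetical callable returning (status, response_headers, body).
    """
    # One key for all attempts, so a retried create cannot produce a duplicate job.
    request_headers = {"Idempotency-Key": str(uuid.uuid4())}
    for attempt in range(max_attempts):
        status, response_headers, body = send(payload, request_headers)
        if status != 429:
            return status, body
        if attempt < max_attempts - 1:
            sleep(retry_delay(response_headers, attempt))
    # Still rate limited after the last attempt: hand the 429 back to the caller.
    return status, body
```

Generating the idempotency key once, outside the loop, is the important design choice: every retry then refers to the same logical submission.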
Rate limit error codes
| Code | Meaning |
|---|---|
| RATE_LIMITED | Request rate limit exceeded for this key or organization. |
| ORG_ACTIVE_LIMIT_REACHED | Your org has 5 jobs running at once. Wait for one to finish before submitting more. |
| ORG_QUEUED_LIMIT_REACHED | Your org has 100 jobs waiting in queue. Wait for the queue to drain. |
| SCAN_QUEUE_SATURATED | Global scan capacity is full. Retry after the indicated delay. |
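One way a client might act on these codes is a simple lookup from code to reaction. The sketch below is illustrative only: the action names are hypothetical labels that paraphrase the table, and because the location of the error code in the response body is not specified here, the caller is assumed to extract it and pass it in.

```python
# Hypothetical action names; the meanings paraphrase the table above.
ACTION_BY_CODE = {
    "RATE_LIMITED": "slow_down",
    "ORG_ACTIVE_LIMIT_REACHED": "wait_for_running_job",
    "ORG_QUEUED_LIMIT_REACHED": "wait_for_queue_to_drain",
    "SCAN_QUEUE_SATURATED": "retry_after_delay",
}


def classify_429(error_code, headers):
    """Return (action, wait_seconds) for a 429 response.

    `wait_seconds` is the Retry-After value when one is present and
    numeric, otherwise None. Unknown codes default to a delayed retry.
    """
    action = ACTION_BY_CODE.get(error_code, "retry_after_delay")
    lowered = {name.lower(): value for name, value in headers.items()}
    raw = lowered.get("retry-after")
    try:
        wait = float(raw) if raw is not None else None
    except ValueError:
        wait = None
    return action, wait
```

For the two concurrency codes, the wait is better driven by job completion (webhooks or polling GET /v1/jobs/:id) than by a fixed delay, which is why they map to distinct actions.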