Rate limits
Gyrence rate-limits per workspace, not per key or per IP. Every key on a workspace draws from the same bucket, the same way they draw from the same credit pool.
The goals are:
- Keep one workspace's traffic from degrading another's.
- Surface runaway loops early (a misconfigured agent making thousands of calls per minute).
- Stay out of the way of normal application traffic.
The hard deadline
Independent of any rate limit, every request has a 25-second hard deadline. If the work doesn't complete in 25 seconds — slow origin, stuck headless browser, traverse fanning out further than expected — the request returns:
```json
{ "ok": false, "error": "request exceeded 25s deadline", "code": "timeout" }
```
with HTTP 408. Timeouts are not billed. The deadline applies end-to-end across all tiers and retries Gyrence does internally.
For gyres, the deadline is the most common reason a call returns fewer pages than maxPages — the budget was generous, but the clock ran out first.
When you're rate-limited
Exceeding the per-workspace rate returns:
```json
{ "ok": false, "error": "rate limit exceeded", "code": "rate_limited" }
```
with HTTP 429. Rate-limited requests are not billed.
The current production limits are deliberately quiet — the platform is built around credit metering, not request-count throttling — but they exist as a circuit breaker. If you're seeing 429 in normal operation, you're almost certainly in a runaway loop; check Console → Activity to see what's firing.
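Client code usually wants these error envelopes as thrown errors that carry the code field, so a caller can branch on it. The following is a minimal sketch, not part of any official client: GyrenceError and unwrap are hypothetical names, and the only thing assumed about the response is the { ok: false, error, code } shape shown above.

```typescript
// Hypothetical error type: carries the HTTP status and the envelope's code.
class GyrenceError extends Error {
  readonly status: number;
  readonly code: string;

  constructor(status: number, code: string, message: string) {
    super(message);
    this.name = "GyrenceError";
    this.status = status;
    this.code = code;
  }
}

// Returns the body unchanged unless it is an error envelope (ok === false),
// in which case it throws a GyrenceError built from the envelope's fields.
function unwrap<T>(status: number, body: unknown): T {
  const b = body as { ok?: boolean; error?: string; code?: string } | null;
  if (b !== null && typeof b === "object" && b.ok === false) {
    throw new GyrenceError(status, b.code ?? "unknown", b.error ?? "unknown error");
  }
  return body as T;
}
```

Called with the parsed JSON of a 429 response, unwrap throws an error whose code is "rate_limited", which a caller can then compare against the codes listed in this guide.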
Backoff strategy
When you see 429, 502 upstream_error, 503 unavailable, or 408 timeout, back off and retry. The right shape is exponential backoff with full jitter:
```typescript
// Retries fn on transient failures, up to `max` attempts in total.
async function withRetry<T>(fn: () => Promise<T>, max = 4): Promise<T> {
  for (let attempt = 0; attempt < max; attempt++) {
    try {
      return await fn();
    } catch (err) {
      // Give up on non-retryable errors and on the final attempt.
      if (!isRetryable(err) || attempt === max - 1) throw err;
      const base = Math.min(1000 * 2 ** attempt, 8000); // 1s, 2s, 4s, 8s cap
      const delay = Math.random() * base; // full jitter
      await new Promise((r) => setTimeout(r, delay));
    }
  }
  throw new Error("unreachable"); // only reached if max < 1
}

// True for the transient error codes worth retrying.
function isRetryable(err: unknown): boolean {
  // Optional chaining keeps a thrown null or undefined from crashing the check.
  const code = (err as { code?: string } | null)?.code;
  return code === "rate_limited" || code === "upstream_error"
    || code === "unavailable" || code === "timeout";
}
```
Three guidelines:
- Cap the base. Don't let the delay grow unboundedly — 8 seconds is plenty for a 25-second-deadline API.
- Jitter, always. Without jitter, a fleet of clients backing off in lockstep will pile back on the platform together and trigger the same 429.
- Limit total attempts. Three to four retries is the sweet spot. Beyond that you're typically masking a real problem.
Do not retry 400 bad_request, 401 unauthorized, 402 credits_exhausted, 403 forbidden_url, or 404 not_found. These are caller-side or logical errors — retrying changes nothing and just delays the failure surface.
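When you size a job timeout or a queue visibility window, it helps to know the longest a retry loop can run. The sketch below computes that upper bound for a loop shaped like withRetry above: every attempt is assumed to run to the 25-second deadline, and every jittered delay is assumed to land on its ceiling. The function name worstCaseMs is hypothetical, and client-side network overhead is not counted.

```typescript
// Upper bound, in milliseconds, on the wall-clock time of a retry loop that
// makes `maxAttempts` attempts with capped exponential backoff between them.
function worstCaseMs(maxAttempts: number, deadlineMs = 25_000, capMs = 8_000): number {
  let total = 0;
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    total += deadlineMs; // the attempt itself, run to the deadline
    if (attempt < maxAttempts - 1) {
      // Delay ceiling before the next attempt; no delay follows the last one.
      total += Math.min(1000 * 2 ** attempt, capMs);
    }
  }
  return total;
}
```

With the defaults, worstCaseMs(4) is 107000: four 25-second attempts plus delay ceilings of 1, 2, and 4 seconds. Actual runs are shorter because full jitter draws each delay from below its ceiling.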
Concurrency, not just throughput
A 25-second deadline means a concurrency slot whose calls all run to the deadline turns over only ~2.4 requests per minute (60 ÷ 25), so a workspace running 10 concurrent fetches sustains ~24 requests/minute in that worst case. The throughput ceiling is usually concurrency ÷ turnaround time, not the rate-limit number.
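The arithmetic behind that estimate fits in one line. This is an illustrative sketch with a hypothetical function name: it assumes every slot is always busy and every call takes the same turnaround time.

```typescript
// Requests per minute sustained by `concurrency` always-busy slots
// when each call takes `turnaroundSeconds` to complete.
function requestsPerMinute(concurrency: number, turnaroundSeconds: number): number {
  return (concurrency * 60) / turnaroundSeconds;
}
```

For example, requestsPerMinute(10, 25) is 24 and requestsPerMinute(1, 25) is 2.4, the worst case where every call runs to the deadline. With a 2-second turnaround, the same 10 slots sustain 300 requests per minute.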
For high-volume workloads:
- Bound your own concurrency. A worker pool with a hard cap (e.g. 20 in-flight requests) is more predictable than firing everything at once and reacting to 429.
- Prefer batching primitives. A single /gyre with maxPages: 50 is one HTTP connection and one billed call (per the gyre schedule); 50 individual /fetch calls are 50 round-trips and 50 chances to hit the deadline.
- Use /map when you only need URLs. It's a single 1-credit call and almost never times out.
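One way to bound concurrency is a small worker pool in which a fixed number of workers pull from a shared index. The sketch below is a generic pattern rather than anything Gyrence-specific. The name mapWithConcurrency is hypothetical, and fn stands in for whatever request function you use.

```typescript
// Runs fn over items with at most `limit` calls in flight.
// Results are returned in input order.
async function mapWithConcurrency<T, R>(
  items: readonly T[],
  limit: number,
  fn: (item: T, index: number) => Promise<R>,
): Promise<R[]> {
  if (limit < 1) throw new RangeError("limit must be at least 1");
  const results = new Array<R>(items.length);
  let next = 0; // index of the next unclaimed item

  async function worker(): Promise<void> {
    while (next < items.length) {
      const i = next++; // claimed synchronously, so no two workers share an index
      results[i] = await fn(items[i], i);
    }
  }

  // Start no more workers than there are items.
  const workers = Array.from({ length: Math.min(limit, items.length) }, () => worker());
  await Promise.all(workers); // rejects as soon as any call rejects
  return results;
}
```

Passing a function that wraps each call in the withRetry helper from the backoff section keeps retries inside the same cap, since a retrying call still occupies its slot. Note that if one call rejects, the returned promise rejects immediately while the other workers continue draining the list. Catch errors inside fn if you want partial results.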
Idempotency
All /api/v1/* endpoints are idempotent — retrying the same request produces the same effect on Gyrence's side (no state change beyond the usage event). Safe to retry on transient failures without extra deduplication.
The one thing retries don't reverse is what was billed. If a request succeeded on attempt 1 but the response was lost in the network, the retry on attempt 2 will succeed (and bill) again. In practice this is rare — server-side timeouts are very different from client-side connection failures — but worth knowing.
MCP traffic at /api/mcp shares the same per-workspace rate-limit bucket and the same 25-second deadline as HTTP. Picking MCP doesn't relax (or tighten) any limit.
Raising your limits
If your workload genuinely needs more than the default rate (and the credit pool to back it), contact us. Limit increases are tied to plan, not pay-as-you-go.
Running out of credits returns 402 credits_exhausted — your workspace can't make any billable call, regardless of rate. Running into the rate limit returns 429 rate_limited — your workspace is making calls faster than allowed, even if the credit balance is healthy. Different problems, different remedies.
