Capacity and retries
When to use this page
Use this page to understand throttling, stable sessions, queueing, and retry behavior.
Dynamic capacity
QuotaFlow estimates available capacity continuously and protects customers from overload with bounded queueing and clear errors.
Stable sessions
QuotaFlow uses a stable session id to maintain continuity for related calls. If capacity changes temporarily, clients should retry with the same session id rather than generating a new one.
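As a minimal sketch of the client side of this, the helper below creates a session id once and attaches the same value to every request it builds, including retries. The class name `SessionRequests` and the `session_id` field name are assumptions made for illustration, not part of the documented QuotaFlow API; the real field or header name should be taken from the endpoint schemas.

```python
import uuid


class SessionRequests:
    """Builds request payloads that share one session id across retries."""

    def __init__(self, session_id=None):
        # Generate the id once per logical session, then reuse it.
        self.session_id = session_id or uuid.uuid4().hex

    def build(self, payload):
        # "session_id" is a hypothetical field name used for illustration only.
        return {**payload, "session_id": self.session_id}


session = SessionRequests()
first_try = session.build({"input": "hello"})
retry = session.build({"input": "hello"})  # same session id as first_try
```

Keeping the id on one object makes it hard to regenerate it by accident inside a retry loop.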
Temporary errors
Retry these errors with exponential backoff and jitter:
- 429: rate limit or usage limit
- 503: temporary service unavailable
- Network timeouts
Do not retry authentication or permission errors until the key or request is fixed.
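The retry policy above can be sketched as follows. This is an illustrative client-side loop, not QuotaFlow's own implementation: `send` is a hypothetical callable that returns `(status, body)` and raises `TimeoutError` on a network timeout, and the base delay, cap, and attempt count are assumed values. It uses the "full jitter" variant of exponential backoff, where each delay is drawn uniformly between zero and the exponential ceiling.

```python
import random
import time

# Status codes this page lists as temporary errors.
RETRYABLE_STATUS = {429, 503}


def backoff_delay(attempt, base=0.5, cap=30.0, rng=random.random):
    """Full-jitter delay in seconds: rng() * min(cap, base * 2**attempt)."""
    return rng() * min(cap, base * (2 ** attempt))


def call_with_retries(send, max_attempts=5, sleep=time.sleep):
    """Call send() until it returns a non-retryable status or attempts run out.

    Returns the last (status, body). A result of (None, None) means the
    final attempt ended in a network timeout.
    """
    status, body = None, None
    for attempt in range(max_attempts):
        try:
            status, body = send()
        except TimeoutError:
            status, body = None, None  # network timeout: treat as retryable
        if status is not None and status not in RETRYABLE_STATUS:
            # Success, or an error (such as authentication or permission
            # failures) that retrying will not fix.
            return status, body
        if attempt < max_attempts - 1:
            sleep(backoff_delay(attempt))
    return status, body
```

Any status outside the retryable set is returned immediately, so authentication and permission errors surface to the caller on the first attempt instead of being retried.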
Model contract
QuotaFlow does not silently change unsupported model ids into different models. If a model is not enabled for your key, fix the model id or key permissions.
AI agents: start at /llms.txt, fetch /llms-full.txt for full context, and parse /openapi.yaml for endpoint schemas.