Rate Limits
LVNG enforces rate limits at two levels: per-endpoint limits on HTTP routes and per-plan limits on API key usage. Socket.io connections have their own event-based throttle. Every response includes headers that tell you where you stand.
HTTP API Rate Limits
Each endpoint group has its own rate limit, enforced per authenticated user or per IP address (for public routes). All windows are 1 minute.
Note: The Chat API has two simultaneous limits. A single IP can send up to 100 requests per minute, but each individual user is limited to 30 requests per minute regardless of IP.
API Key Plan-Based Limits
When authenticating with an API key, additional rate limits apply based on the plan tier associated with that key. These are tracked across three time windows.
Need higher limits? Upgrade your plan or contact hello@lvng.ai.
Socket.io Rate Limits
WebSocket connections are rate-limited by event count per socket connection.
These values are configurable server-side via the SOCKET_RATE_LIMIT_WINDOW and SOCKET_RATE_LIMIT_MAX environment variables.
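As an illustrative sketch (the values and the millisecond unit are assumptions, not documented defaults), the variables might be set like this:

```shell
# .env — Socket.io rate limiting (example values, not defaults)
SOCKET_RATE_LIMIT_WINDOW=60000   # window length in milliseconds (assumed unit)
SOCKET_RATE_LIMIT_MAX=100        # max events per socket per window
```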
When a socket connection exceeds the limit, the server emits an error event:
```json
{
  "code": "RATE_LIMIT_EXCEEDED",
  "message": "Too many events. Please wait."
}
```

Rate Limit Headers
Every HTTP API response includes rate limit headers so you can track your usage in real time.
Headers
- `X-RateLimit-Limit` (integer): Maximum number of requests allowed in the current time window.
- `X-RateLimit-Remaining` (integer): Number of requests remaining in the current window.
- `X-RateLimit-Reset` (string, ISO 8601): Timestamp indicating when the current window resets.
- `Retry-After` (integer): Seconds to wait before retrying. Only present on 429 responses.
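These headers can be read from any `fetch()` response. The following helper is a sketch (the `RateLimitInfo` shape is our own, not part of an LVNG SDK; only the header names come from this page):

```typescript
// Sketch: extract rate limit state from a fetch() Response's headers.
interface RateLimitInfo {
  limit: number
  remaining: number
  resetsAt: Date | null
  retryAfterSeconds: number | null // only set on 429 responses
}

function readRateLimit(headers: Headers): RateLimitInfo {
  const reset = headers.get('X-RateLimit-Reset')
  const retryAfter = headers.get('Retry-After')
  return {
    limit: parseInt(headers.get('X-RateLimit-Limit') ?? '0', 10),
    remaining: parseInt(headers.get('X-RateLimit-Remaining') ?? '0', 10),
    resetsAt: reset ? new Date(reset) : null,
    retryAfterSeconds: retryAfter ? parseInt(retryAfter, 10) : null,
  }
}
```

Call it as `readRateLimit(response.headers)` after each request to decide whether to throttle.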
Example Response Headers
```http
HTTP/1.1 200 OK
Content-Type: application/json
X-RateLimit-Limit: 100
X-RateLimit-Remaining: 95
X-RateLimit-Reset: 2026-03-19T14:30:00.000Z
```

Handling 429 Too Many Requests
When you exceed your rate limit, the API responds with HTTP 429. The response body and headers tell you when you can retry.
```http
HTTP/1.1 429 Too Many Requests
Content-Type: application/json
X-RateLimit-Limit: 100
X-RateLimit-Remaining: 0
X-RateLimit-Reset: 2026-03-19T14:31:00.000Z
Retry-After: 60
```

```json
{
  "error": "Too Many Requests",
  "message": "Rate limit exceeded",
  "retryAfter": 60
}
```

Recommended Retry Strategy
- Read the `Retry-After` header from the 429 response.
- Wait that many seconds before retrying.
- If the retry also fails, apply exponential backoff with jitter.
- Avoid tight retry loops. Continued requests while rate-limited may result in longer cooldown periods.
TypeScript Retry Example
```typescript
async function fetchWithRetry(url: string, options: RequestInit, maxRetries = 3) {
  for (let attempt = 0; attempt < maxRetries; attempt++) {
    const response = await fetch(url, options)
    if (response.status !== 429) {
      return response
    }
    const retryAfter = response.headers.get('Retry-After')
    const delaySeconds = retryAfter ? parseInt(retryAfter, 10) : Math.pow(2, attempt)
    // Add jitter to avoid thundering herd
    const jitter = Math.random() * 1000
    const delayMs = delaySeconds * 1000 + jitter
    console.warn(`Rate limited. Retrying in ${(delayMs / 1000).toFixed(1)}s (attempt ${attempt + 1}/${maxRetries})`)
    await new Promise((resolve) => setTimeout(resolve, delayMs))
  }
  throw new Error('Max retries exceeded -- still rate limited.')
}
```

Best Practices
- Monitor headers proactively. Check `X-RateLimit-Remaining` before it hits zero and throttle your request rate accordingly.
- Use WebSockets for real-time data. Instead of polling, connect to the LVNG Socket.io server for live updates like messages, presence, and typing indicators. This avoids burning through HTTP rate limits.
- Cache responses. If data does not change frequently (e.g., agent configs, workspace metadata), cache it client-side to avoid redundant calls.
- Implement exponential backoff. Never retry at a fixed interval. Increase the delay between retries to let the rate window reset.
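The first practice can be sketched as a small client-side guard. This helper is hypothetical (the class, its threshold, and its method names are not part of any LVNG SDK); it simply waits out the window when few requests remain:

```typescript
// Sketch: pause outgoing requests when X-RateLimit-Remaining runs low.
class RateLimitGuard {
  private remaining = Infinity
  private resetsAt = 0 // epoch milliseconds of the next window reset

  // Call after every response to record the latest header values.
  update(headers: Headers): void {
    const remaining = headers.get('X-RateLimit-Remaining')
    const reset = headers.get('X-RateLimit-Reset')
    if (remaining !== null) this.remaining = parseInt(remaining, 10)
    if (reset !== null) this.resetsAt = new Date(reset).getTime()
  }

  // Milliseconds to wait before the next request (0 while headroom remains).
  delayMs(threshold = 1, now: number = Date.now()): number {
    if (this.remaining > threshold) return 0
    return Math.max(0, this.resetsAt - now)
  }
}
```

Usage: call `guard.update(response.headers)` after each request, and `await new Promise((r) => setTimeout(r, guard.delayMs()))` before the next one.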