Rate Limits
LVNG enforces rate limits at two levels: per-endpoint limits on HTTP routes and per-plan limits on API key usage. Socket.io connections have their own event-based throttle. Every response includes headers that tell you where you stand.
HTTP API Rate Limits
Each endpoint group has its own rate limit, enforced per authenticated user or per IP address (for public routes). All windows are 1 minute.
Note: The Chat API has two simultaneous limits. A single IP can send up to 100 requests per minute, but each individual user is limited to 30 requests per minute regardless of IP.
API Key Plan-Based Limits
When authenticating with an API key, additional rate limits apply based on the plan tier associated with that key. These are tracked across three time windows.
Need higher limits? Upgrade your plan or contact hello@lvng.ai.
Socket.io Rate Limits
WebSocket connections are rate-limited by event count per socket connection.
These values are configurable server-side via the SOCKET_RATE_LIMIT_WINDOW and SOCKET_RATE_LIMIT_MAX environment variables.
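As an illustrative sketch (the values and the millisecond unit are assumptions, not documented defaults), the variables might be set like this:

```shell
# .env — Socket.io rate limiting (example values, not defaults)
SOCKET_RATE_LIMIT_WINDOW=60000   # window length in milliseconds (assumed unit)
SOCKET_RATE_LIMIT_MAX=100        # max events per socket per window
```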
When a socket connection exceeds the limit, the server emits an error event:
```json
{
  "code": "RATE_LIMIT_EXCEEDED",
  "message": "Too many events. Please wait."
}
```

Rate Limit Headers
Every HTTP API response includes rate limit headers so you can track your usage in real time.
Headers
- `X-RateLimit-Limit` (integer): Maximum number of requests allowed in the current time window.
- `X-RateLimit-Remaining` (integer): Number of requests remaining in the current window.
- `X-RateLimit-Reset` (string, ISO 8601): Timestamp indicating when the current window resets.
- `Retry-After` (integer): Seconds to wait before retrying. Only present on 429 responses.
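These headers can be read from any `fetch()` response. The following helper is a sketch (the `RateLimitInfo` shape is our own, not part of an LVNG SDK; only the header names come from this page):

```typescript
// Sketch: extract rate limit state from a fetch() Response's headers.
interface RateLimitInfo {
  limit: number
  remaining: number
  resetsAt: Date | null
  retryAfterSeconds: number | null // only set on 429 responses
}

function readRateLimit(headers: Headers): RateLimitInfo {
  const reset = headers.get('X-RateLimit-Reset')
  const retryAfter = headers.get('Retry-After')
  return {
    limit: parseInt(headers.get('X-RateLimit-Limit') ?? '0', 10),
    remaining: parseInt(headers.get('X-RateLimit-Remaining') ?? '0', 10),
    resetsAt: reset ? new Date(reset) : null,
    retryAfterSeconds: retryAfter ? parseInt(retryAfter, 10) : null,
  }
}
```

Call it as `readRateLimit(response.headers)` after each request to decide whether to throttle.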
Example Response Headers
```http
HTTP/1.1 200 OK
Content-Type: application/json
X-RateLimit-Limit: 100
X-RateLimit-Remaining: 95
X-RateLimit-Reset: 2026-03-19T14:30:00.000Z
```

Handling 429 Too Many Requests
When you exceed your rate limit, the API responds with HTTP 429. The response body and headers tell you when you can retry.
```http
HTTP/1.1 429 Too Many Requests
Content-Type: application/json
X-RateLimit-Limit: 100
X-RateLimit-Remaining: 0
X-RateLimit-Reset: 2026-03-19T14:31:00.000Z
Retry-After: 60
```

```json
{
  "error": "Too Many Requests",
  "message": "Rate limit exceeded",
  "retryAfter": 60
}
```

Recommended Retry Strategy
- Read the `Retry-After` header from the 429 response.
- Wait that many seconds before retrying.
- If the retry also fails, apply exponential backoff with jitter.
- Avoid tight retry loops. Continued requests while rate-limited may result in longer cooldown periods.
TypeScript Retry Example
```typescript
async function fetchWithRetry(url: string, options: RequestInit, maxRetries = 3) {
  for (let attempt = 0; attempt < maxRetries; attempt++) {
    const response = await fetch(url, options)
    if (response.status !== 429) {
      return response
    }
    const retryAfter = response.headers.get('Retry-After')
    const delaySeconds = retryAfter ? parseInt(retryAfter, 10) : Math.pow(2, attempt)
    // Add jitter to avoid thundering herd
    const jitter = Math.random() * 1000
    const delayMs = delaySeconds * 1000 + jitter
    console.warn(`Rate limited. Retrying in ${(delayMs / 1000).toFixed(1)}s (attempt ${attempt + 1}/${maxRetries})`)
    await new Promise((resolve) => setTimeout(resolve, delayMs))
  }
  throw new Error('Max retries exceeded -- still rate limited.')
}
```

Best Practices
- Monitor headers proactively. Check `X-RateLimit-Remaining` before it hits zero and throttle your request rate accordingly.
- Use WebSockets for real-time data. Instead of polling, connect to the LVNG Socket.io server for live updates like messages, presence, and typing indicators. This avoids burning through HTTP rate limits.
- Cache responses. If data does not change frequently (e.g., agent configs, workspace metadata), cache it client-side to avoid redundant calls.
- Implement exponential backoff. Never retry at a fixed interval. Increase the delay between retries to let the rate window reset.
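The first practice can be sketched as a small client-side guard. This helper is hypothetical (the class, its threshold, and its method names are not part of any LVNG SDK); it simply waits out the window when few requests remain:

```typescript
// Sketch: pause outgoing requests when X-RateLimit-Remaining runs low.
class RateLimitGuard {
  private remaining = Infinity
  private resetsAt = 0 // epoch milliseconds of the next window reset

  // Call after every response to record the latest header values.
  update(headers: Headers): void {
    const remaining = headers.get('X-RateLimit-Remaining')
    const reset = headers.get('X-RateLimit-Reset')
    if (remaining !== null) this.remaining = parseInt(remaining, 10)
    if (reset !== null) this.resetsAt = new Date(reset).getTime()
  }

  // Milliseconds to wait before the next request (0 while headroom remains).
  delayMs(threshold = 1, now: number = Date.now()): number {
    if (this.remaining > threshold) return 0
    return Math.max(0, this.resetsAt - now)
  }
}
```

Usage: call `guard.update(response.headers)` after each request, and `await new Promise((r) => setTimeout(r, guard.delayMs()))` before the next one.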