Finetuning.ai

Rate Limits

API rate limits, response headers, and retry strategies

The Finetuning.ai API enforces rate limits to ensure fair usage and service stability.

Limits

  • 60 requests per minute per user
  • 5 active API keys per account

Rate limit headers

Every API response includes headers showing your current rate limit status:

Header                  Description
X-RateLimit-Limit       Max requests allowed per window
X-RateLimit-Remaining   Requests remaining in current window
Retry-After             Seconds to wait (only on 429 responses)
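As an illustration, the headers can be read from any fetch response into a small status object. This is a minimal sketch: the helper name readRateLimit is hypothetical, and it assumes the header values are plain integers.

```javascript
// Hypothetical helper: parse the rate limit headers from a Headers object
// (for example, response.headers on a fetch Response).
function readRateLimit(headers) {
  // Missing or non-numeric headers become null rather than NaN.
  const toInt = (value) => {
    const n = Number.parseInt(value ?? '', 10);
    return Number.isNaN(n) ? null : n;
  };
  return {
    limit: toInt(headers.get('X-RateLimit-Limit')),
    remaining: toInt(headers.get('X-RateLimit-Remaining')),
    retryAfterSeconds: toInt(headers.get('Retry-After')),
  };
}

// Example with hand-built headers; a real call would pass response.headers.
const status = readRateLimit(new Headers({
  'X-RateLimit-Limit': '60',
  'X-RateLimit-Remaining': '12',
}));
console.log(status.remaining);
```

Checking remaining before sending a burst of requests lets a client slow down before it hits a 429.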

Handling rate limits

When you exceed the rate limit, the API returns 429 Too Many Requests:

{
  "error": {
    "code": "RATE_LIMITED",
    "message": "Too many requests — wait and retry"
  }
}
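A client might recognize this response along the following lines. This is a sketch, not the official client behavior: rateLimitMessage is a hypothetical helper, and the fallback text is an assumption for cases where the body is missing or malformed.

```javascript
// Hypothetical helper: returns the error message when a response is a
// rate limit rejection, or null for any other status.
// `status` is the HTTP status code, `body` the parsed JSON response body.
function rateLimitMessage(status, body) {
  if (status !== 429) return null;
  const error = body?.error;
  // Fall back to a generic message if the body is missing or malformed.
  return error?.code === 'RATE_LIMITED' && typeof error.message === 'string'
    ? error.message
    : 'Rate limited';
}

console.log(rateLimitMessage(429, {
  error: { code: 'RATE_LIMITED', message: 'Slow down' },
}));
```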

Retry strategy

Use exponential backoff with jitter:

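// Retries a request on 429 responses: one initial attempt plus up to maxRetries retries.
async function fetchWithRetry(url, options, maxRetries = 3) {
  for (let attempt = 0; attempt <= maxRetries; attempt++) {
    const response = await fetch(url, options);

    // Anything other than 429 (success or a different error) goes back to the caller.
    if (response.status !== 429) return response;

    // Out of retries: fail now instead of sleeping one more time.
    if (attempt === maxRetries) break;

    // Prefer the server's Retry-After (seconds). If it is absent or not a number,
    // back off exponentially with up to 1 s of random jitter, capped at 30 s.
    const retryAfter = Number.parseInt(response.headers.get('Retry-After') ?? '', 10);
    const delay = Number.isNaN(retryAfter)
      ? Math.min(1000 * 2 ** attempt + Math.random() * 1000, 30000)
      : retryAfter * 1000;

    await new Promise((resolve) => setTimeout(resolve, delay));
  }

  throw new Error('Max retries exceeded');
}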
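To see what the fallback schedule looks like when no Retry-After header is present, the delay formula can be isolated as a pure function. This is a sketch: backoffDelay and its option names are hypothetical, and the random source is injectable only so the bounds can be printed deterministically.

```javascript
// Hypothetical helper: delay in milliseconds before retry number `attempt` (0-based).
// Exponential growth from baseMs, plus up to jitterMs of random jitter, capped at capMs.
function backoffDelay(
  attempt,
  { baseMs = 1000, jitterMs = 1000, capMs = 30000, random = Math.random } = {},
) {
  const exponential = baseMs * 2 ** attempt;
  return Math.min(exponential + random() * jitterMs, capMs);
}

// Print the lower and upper bound of the delay for the first six attempts.
for (let attempt = 0; attempt < 6; attempt++) {
  const min = backoffDelay(attempt, { random: () => 0 });
  const max = backoffDelay(attempt, { random: () => 1 });
  console.log(`attempt ${attempt}: ${min}-${max} ms`);
}
```

With these defaults the wait falls between 1 and 2 seconds on the first retry, 2 and 3 seconds on the second, and 4 and 5 seconds on the third. The 30 second cap takes effect from attempt 5 onward. The jitter spreads out clients that were rate limited at the same moment so they do not all retry together.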

Best practices

  • Cache responses — Don't re-fetch data that hasn't changed
  • Use polling intervals — When checking generation status, poll every 2–5 seconds, not continuously
  • Batch requests — If listing generations, use larger limit values instead of many small requests
  • Handle 429s gracefully — Always implement retry logic with backoff
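The polling advice above can be sketched as a small loop with a fixed pause and an overall deadline. The names here are hypothetical: pollUntilDone is not part of the API, and check stands in for whatever caller-supplied function fetches the generation status.

```javascript
// Hypothetical helper: polls `check` until it reports completion or the deadline passes.
// `check` is a caller-supplied async function returning { done, value }.
async function pollUntilDone(check, { intervalMs = 3000, timeoutMs = 120000 } = {}) {
  const deadline = Date.now() + timeoutMs;
  while (true) {
    const result = await check();
    if (result.done) return result.value;
    // Stop if the next check would start after the deadline.
    if (Date.now() + intervalMs > deadline) {
      throw new Error('Polling timed out');
    }
    // A fixed pause between checks keeps the request rate well under the limit.
    await new Promise((resolve) => setTimeout(resolve, intervalMs));
  }
}
```

At the default 3 second interval, one polling loop sends about 20 requests per minute, which leaves room under the 60 requests per minute limit for other calls.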
