Error Responses
Complete reference for HTTP status codes, error codes, retry strategy, and troubleshooting
Every AnakinScraper endpoint returns errors in a consistent JSON shape. This page is the canonical reference — what we return, what each code means, when to retry, and how to diagnose the most common failures.
For per-endpoint rate limits and the recommended request pacing, see Rate Limits.
Error response format
All errors return a JSON body with two fields:
{
  "error": "rate_limit_exceeded",
  "message": "Too many requests. Please try again later."
}

| Field | Type | Description |
|---|---|---|
| error | string | A short, machine-readable code. Stable across releases, so it is safe to switch on in code. |
| message | string | A human-readable explanation. Subject to copy changes: log it, but don't parse it. |
Async jobs are different. When a polled job ends in status: "failed", the failure shape lives inside the job response (error field, free-form string). See Async job failures below.
HTTP status codes
| Status | Meaning | Retry? |
|---|---|---|
| 200 | Success: a synchronous endpoint returned a result, or a polled job is complete. | N/A |
| 202 | Accepted: an async job was queued, or a polled job is still in progress. | N/A |
| 400 | Bad Request: your request body or parameters are invalid. | No. Fix the request. |
| 401 | Unauthorized: API key missing, malformed, or revoked. | No. Check the key. |
| 402 | Payment Required: insufficient credits for the operation. | No. Top up credits. |
| 403 | Forbidden: the resource exists but you don't own it. | No. |
| 404 | Not Found: the job ID, session, or scraper doesn't exist. | No. |
| 409 | Conflict: resource is locked or already exists (e.g., session in use, duplicate name). | After resolving the conflict. |
| 422 | Unprocessable Entity: request was valid but the resource isn't ready (e.g., session not yet saved). | After the prerequisite is met. |
| 429 | Too Many Requests: rate limit exceeded. | Yes. See Rate limit handling. |
| 500 | Internal Server Error: unexpected failure on our side. | Yes. Use exponential backoff. |
| 502 | Bad Gateway: a downstream service (browser, CDP proxy) is unreachable. | Yes. Use exponential backoff. |
| 503 | Service Unavailable: a required service is temporarily down or unconfigured. | Yes. Wait 30–60s. |
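
The table above can be folded into a single lookup so that callers branch on the status in one place. The following is a minimal sketch; the `Action` enum and the `action_for_status` helper are illustrative names, not part of any AnakinScraper SDK, and treating other 5xx codes like 500 is an assumption.

```python
from enum import Enum


class Action(Enum):
    OK = "ok"                  # 200: result is ready
    KEEP_POLLING = "poll"      # 202: job queued or still running
    FIX_REQUEST = "fix"        # 4xx that an identical retry will not cure
    RETRY_BACKOFF = "backoff"  # 429/500/502: retry with backoff
    RETRY_LATER = "wait"       # 503: wait 30-60s, then retry


_ACTIONS = {
    200: Action.OK,
    202: Action.KEEP_POLLING,
    429: Action.RETRY_BACKOFF,
    500: Action.RETRY_BACKOFF,
    502: Action.RETRY_BACKOFF,
    503: Action.RETRY_LATER,
}


def action_for_status(status: int) -> Action:
    """Map an HTTP status from the table above to a client-side action."""
    if status in _ACTIONS:
        return _ACTIONS[status]
    if 400 <= status < 500:
        return Action.FIX_REQUEST  # 400, 401, 402, 403, 404, 409, 422
    if 500 <= status < 600:
        return Action.RETRY_BACKOFF  # assumption: other 5xx behave like 500
    raise ValueError(f"status {status} is not covered by the table")
```

Note that 409 and 422 land in `FIX_REQUEST` here because a blind retry will not help; resubmitting makes sense only once the conflict or prerequisite is resolved.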
Error code catalog
The error field uses a fixed set of codes. The table below covers every code returned by the public API.
Validation & input
| Code | HTTP | When you'll see it | What to do |
|---|---|---|---|
| invalid_request | 400 | Body is not valid JSON, a required field is missing, or a value is out of range (e.g., depth > 5, batch with 0 or >10 URLs, prompt >8KB, schema >50KB). | Inspect message for the specific field, fix the request, and resubmit. |
| invalid_url | 400 | A URL in a batch request is malformed. | Fix the URL. The message indicates the index. |
| invalid_job_type | 400 | The job_type field on POST /v1/request doesn't match a registered handler. | Use a supported value (url_scraper, crawl, map, agentic_search, search, web_scraper). |
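
Several of these 400s can be caught before the request leaves the client. The sketch below checks a batch against the 1 to 10 URL range quoted in the `invalid_request` row and flags malformed entries by index. The helper name `find_batch_problems` is hypothetical, and the rule that only absolute http(s) URLs are valid is an assumption rather than the server's documented validation logic.

```python
from urllib.parse import urlsplit

MAX_BATCH_URLS = 10  # from the invalid_request row: a batch needs 1-10 URLs


def find_batch_problems(urls: list[str]) -> list[str]:
    """Return human-readable problems; an empty list means the batch looks submittable."""
    problems = []
    if not 1 <= len(urls) <= MAX_BATCH_URLS:
        problems.append(f"batch has {len(urls)} URLs; expected 1-{MAX_BATCH_URLS}")
    for index, url in enumerate(urls):
        try:
            parts = urlsplit(url)
            # Assumption: only absolute http(s) URLs with a host are accepted.
            ok = parts.scheme in ("http", "https") and bool(parts.netloc)
        except ValueError:
            ok = False  # urlsplit rejects some malformed inputs outright
        if not ok:
            problems.append(f"urls[{index}] is not an absolute http(s) URL: {url!r}")
    return problems
```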
Auth & authorization
| Code | HTTP | When you'll see it | What to do |
|---|---|---|---|
| unauthorized | 401 | No API key was sent, or the key is malformed, revoked, or belongs to a deleted user. | Send a valid key in X-API-Key (or one of the accepted header variants). Generate a new key in the dashboard if needed. |
| forbidden | 403 | The resource (job, session, scraper) exists but belongs to a different user. | Use the ID of a job, session, or scraper that your own account owns. |
Credits
| Code | HTTP | When you'll see it | What to do |
|---|---|---|---|
| insufficient_credits | 402 | Account balance is below the cost of the operation. The message includes the cost and your current balance. | Top up credits in Billing or upgrade your plan. |
Rate limiting
| Code | HTTP | When you'll see it | What to do |
|---|---|---|---|
| rate_limit_exceeded | 429 | You exceeded the per-endpoint rate limit. | See Rate limit handling. |
Resource state
| Code | HTTP | When you'll see it | What to do |
|---|---|---|---|
| not_found | 404 | Job ID, session ID, or scraper ID doesn't exist. | Verify the ID. Job IDs are valid for 30 days. |
| session_not_saved | 422 | You tried to attach a saved browser session before its storage state was uploaded. | Run the manual save flow first (see Browser Sessions). |
| session_in_use | 409 | A saved session is already attached to an active automation. | Wait for the other run to finish, or use a different session. |
| duplicate_name | 409 | A session name is already taken for this user. | Use a unique name. |
Server-side
| Code | HTTP | When you'll see it | What to do |
|---|---|---|---|
| server_error | 500 | Unhandled error in our handler, usually a database or internal service issue. | Retry with backoff. If it persists, contact support with the request ID. |
| queue_error | 500 | Failed to enqueue the job (SQS unavailable or misconfigured). | Retry with backoff. |
| configuration_error | 500 | A required service-side config is missing for this endpoint. | Retry; if persistent, contact support. |
| internal_error | 500 | Generic catch-all for unexpected failures. | Retry with backoff. |
| search_error | 500 | Upstream search provider (Perplexity) returned an error. | Retry with backoff; reword the prompt if persistent. |
| service_unavailable | 503 | A dependent service (browser AI, CDP proxy, scraper generator) is offline. | Retry after 30–60 seconds. |
Note on format consistency. A small number of older endpoints (/v1/browser-connect, /v1/ai/evaluate, and a few scraper-management routes) currently return errors using slight variations of the format above (e.g., omitting message, or using a Fiber default shape {"statusCode": 400, "message": "..."} for validation errors). Treat them as still conforming to the principle: a string error field is always present, and the HTTP status is authoritative.
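
One way to absorb these variations is to normalize every error body into a single client-side shape and let the HTTP status lead. The sketch below is deliberately defensive: it tolerates a missing `message`, a non-JSON body, and even a missing `error` field. `ApiError` and `normalize_error` are hypothetical names introduced for illustration.

```python
from dataclasses import dataclass
from typing import Any, Optional


@dataclass
class ApiError:
    status: int          # HTTP status: always authoritative
    code: Optional[str]  # machine-readable `error` value, if one was sent
    message: str         # human-readable text; log it, don't parse it


def normalize_error(status: int, body: Any) -> ApiError:
    """Coerce canonical and legacy error bodies into one shape."""
    if not isinstance(body, dict):
        # Non-JSON or unexpected payload: fall back to the status alone.
        return ApiError(status, None, "")
    code = body.get("error")
    message = body.get("message")
    return ApiError(
        status=status,
        code=code if isinstance(code, str) else None,
        message=message if isinstance(message, str) else "",
    )
```

Callers can then branch on `status` first and consult `code` only when it is not `None`.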
Accepted API key headers
The API accepts the key under any of the following headers (and a few query params for WebSocket endpoints), in priority order:
X-API-Key, X-Api-Key, Api-Key, API-Key, X-Access-Key, Access-Key, apikey, api_key, Authorization (with a Bearer, API-Key, or ApiKey prefix, or the raw key).
For /v1/browser-connect (WebSocket): ?api_key=, ?apikey=, or ?token= query parameters also work.
Async job failures
For async endpoints (/v1/url-scraper, /v1/agentic-search, /v1/map, /v1/crawl, Wire's /v1/holocron/task), HTTP status 200/202 only confirms that polling is working. The actual outcome lives in the status field of the job response:
| status | Meaning |
|---|---|
| pending | Queued, not yet picked up by a worker. |
| processing | A worker is actively running the job. |
| completed | Finished successfully. Results are in the response. |
| failed | The job ran but could not produce a result. See error. |
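
A polling loop therefore has to look at `status` in the body rather than at the HTTP code. Here is a minimal sketch, assuming the caller supplies a `fetch_job` callable that returns the parsed job response. The helper name, the 2-second interval, and the 300-second timeout are all assumptions, not documented defaults.

```python
import time
from typing import Callable

TERMINAL = {"completed", "failed"}


def wait_for_job(
    fetch_job: Callable[[], dict],
    *,
    interval: float = 2.0,
    timeout: float = 300.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> dict:
    """Poll until the job is completed or failed, or raise TimeoutError."""
    deadline = clock() + timeout
    while True:
        job = fetch_job()
        if job.get("status") in TERMINAL:
            return job  # the caller still has to check for "failed"
        if clock() >= deadline:
            raise TimeoutError(f"job still {job.get('status')!r} after {timeout}s")
        sleep(interval)
```

In practice `fetch_job` could wrap a GET to the job's polling URL and return `response.json()`. Injecting `sleep` and `clock` keeps the loop testable without real waiting.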
A failed job response looks like this:
{
  "id": "job-abc123",
  "status": "failed",
  "error": "Blocked by website (HTTP 403)",
  "createdAt": "2025-04-30T18:12:04Z",
  "completedAt": "2025-04-30T18:12:34Z",
  "durationMs": 30000
}

The error field is a free-form, human-readable string. Common substrings to switch on if you must:
| Substring | Cause | Suggested fix |
|---|---|---|
| Blocked by website, HTTP 403, HTTP 429, bot detection, CAPTCHA | Anti-bot protection. | Set useBrowser: true and/or specify a country. Try a browser session for sites that require login. |
| Connection timeout, timeout | Page didn't finish loading in time. | For SPAs, set useBrowser: true and increase wait time. |
| DNS resolution failed, no such host | The domain can't be resolved. | Verify the URL is reachable. |
| TLS, SSL | Certificate validation failure. | Confirm the target uses a trusted certificate. |
| Invalid URL | Malformed URL passed all the way through. | Pre-validate URLs client-side. |
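
If you do switch on these substrings, keeping them in one ordered list makes the matching easy to audit. This is a rough sketch: the category names (`anti_bot`, `timeout`, and so on) are invented for illustration, matching is case-insensitive, and short needles such as "tls" can produce false positives on unrelated text.

```python
# Ordered (substring, category) pairs taken from the table above.
# The first match wins, so more specific phrases come first.
_PATTERNS = [
    ("blocked by website", "anti_bot"),
    ("http 403", "anti_bot"),
    ("http 429", "anti_bot"),
    ("bot detection", "anti_bot"),
    ("captcha", "anti_bot"),
    ("dns resolution failed", "dns"),
    ("no such host", "dns"),
    ("invalid url", "invalid_url"),
    ("timeout", "timeout"),  # also covers "Connection timeout"
    ("tls", "tls"),
    ("ssl", "tls"),
]


def classify_job_error(error: str) -> str:
    """Return a coarse category for a failed job's free-form error string."""
    text = error.lower()
    for needle, category in _PATTERNS:
        if needle in text:
            return category
    return "unknown"
```

Because the string is free-form, treat `unknown` as a normal outcome and log the raw text.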
Batch jobs. A batch URL scraper job is completed if any child finishes; partial failures don't fail the parent. Iterate results[] and check each child's status and error.
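
Since a `completed` parent can still contain failed children, it helps to split the children before using the results. A minimal sketch, assuming each entry in `results[]` carries its own `status` (and `error` on failure) as described above; `split_batch_results` is a hypothetical helper name.

```python
def split_batch_results(job: dict) -> tuple[list[dict], list[dict]]:
    """Separate a batch job's children into (succeeded, failed)."""
    succeeded, failed = [], []
    for child in job.get("results", []):
        status = child.get("status")
        if status == "completed":
            succeeded.append(child)
        elif status == "failed":
            failed.append(child)  # inspect child["error"] for the cause
    return succeeded, failed
```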
Retry guidance
When to retry
| Status | Retry? | Why |
|---|---|---|
| 400, 401, 402, 403, 404, 409, 422 | No | The request itself is the problem, so an identical retry returns the same error. For 409 and 422, resubmit only after the conflict or prerequisite is resolved. |
| 429 | Yes | Transient: the bucket refills. Read Retry-After if present, otherwise back off. |
| 500, 502, 503 | Yes | Transient server-side issue. Cap retries at 3–5 and use exponential backoff with jitter. |
| Network errors (no response) | Yes | Treat the same as 5xx. |
Recommended pattern: exponential backoff with jitter
Jitter spreads retries from many clients so a thundering herd doesn't synchronize. Cap the total wait so a stuck worker fails fast instead of looping forever.
import random
import time

import requests

RETRYABLE = {429, 500, 502, 503}
MAX_ATTEMPTS = 5
BASE_DELAY = 1.0  # seconds
MAX_DELAY = 30.0  # upper bound for any single backoff delay


def request_with_retry(method, url, *, headers=None, json=None):
    """POST/GET with exponential backoff + jitter on retryable failures."""
    for attempt in range(MAX_ATTEMPTS):
        try:
            response = requests.request(method, url, headers=headers, json=json, timeout=30)
        except requests.RequestException:
            # Network errors (no response) are treated the same as 5xx.
            if attempt == MAX_ATTEMPTS - 1:
                raise
            time.sleep(_backoff(attempt))
            continue
        if response.status_code not in RETRYABLE:
            return response
        # Honor server-sent Retry-After when present
        retry_after = response.headers.get("Retry-After")
        try:
            delay = float(retry_after) if retry_after else _backoff(attempt)
        except ValueError:
            delay = _backoff(attempt)  # Retry-After was not a number of seconds
        if attempt == MAX_ATTEMPTS - 1:
            return response  # caller decides what to do
        time.sleep(delay)
    return response


def _backoff(attempt: int) -> float:
    """Full-jitter exponential backoff, capped at MAX_DELAY."""
    cap = min(MAX_DELAY, BASE_DELAY * (2 ** attempt))
    return random.uniform(0, cap)


# Usage
resp = request_with_retry(
    "POST",
    "https://api.anakin.io/v1/url-scraper",
    headers={"X-API-Key": "ak-your-key-here"},
    json={"url": "https://example.com"},
)
resp.raise_for_status()
print(resp.json()["jobId"])

// JavaScript (fetch) version of the same helper
const RETRYABLE = new Set([429, 500, 502, 503]);
const MAX_ATTEMPTS = 5;
const BASE_DELAY = 1000; // ms
const MAX_DELAY = 30_000;

async function requestWithRetry(url, init = {}) {
  for (let attempt = 0; attempt < MAX_ATTEMPTS; attempt++) {
    let response;
    try {
      response = await fetch(url, init);
    } catch (err) {
      // Network errors (no response) are treated the same as 5xx.
      if (attempt === MAX_ATTEMPTS - 1) throw err;
      await sleep(backoff(attempt));
      continue;
    }
    if (!RETRYABLE.has(response.status)) return response;
    // Honor server-sent Retry-After when present and numeric
    const retryAfter = response.headers.get("Retry-After");
    const seconds = Number(retryAfter);
    const delay = retryAfter && Number.isFinite(seconds) ? seconds * 1000 : backoff(attempt);
    if (attempt === MAX_ATTEMPTS - 1) return response; // caller decides what to do
    await sleep(delay);
  }
}

function backoff(attempt) {
  const cap = Math.min(MAX_DELAY, BASE_DELAY * 2 ** attempt);
  return Math.random() * cap; // full jitter
}

const sleep = (ms) => new Promise((r) => setTimeout(r, ms));

// Usage (top-level await: run as an ES module)
const res = await requestWithRetry("https://api.anakin.io/v1/url-scraper", {
  method: "POST",
  headers: {
    "X-API-Key": "ak-your-key-here",
    "Content-Type": "application/json",
  },
  body: JSON.stringify({ url: "https://example.com" }),
});
const { jobId } = await res.json();
console.log(jobId);

Rate limit handling
Per-endpoint limits are documented on the Rate Limits page. The short version: most submit endpoints allow 60 requests/min per user; AI evaluation is 10/min; most GET polling endpoints are not rate-limited (Wire's GET /v1/holocron/jobs/{id} is an exception, capped at 60/min per user).
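
Pacing submissions on the client avoids most 429s before they happen. The sketch below spaces calls evenly at 60 per minute, which is one simple way to stay near the limit. The `Pacer` class is hypothetical, and how the server actually measures its rate-limit window is not specified here, so treat the spacing as a conservative guess.

```python
import time


class Pacer:
    """Space calls evenly so roughly `per_minute` go out each minute."""

    def __init__(self, per_minute: int = 60, clock=time.monotonic, sleep=time.sleep):
        self.interval = 60.0 / per_minute
        self._clock = clock
        self._sleep = sleep
        self._next_slot = clock()

    def wait(self) -> float:
        """Block until the next free slot and return the time slept."""
        now = self._clock()
        delay = max(0.0, self._next_slot - now)
        if delay:
            self._sleep(delay)
        # Reserve the following slot, measured from whichever is later.
        self._next_slot = max(now, self._next_slot) + self.interval
        return delay
```

Call `pacer.wait()` immediately before each submit request; idle periods are not banked, so a burst after a pause is still spaced out.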
Response headers
When a request is rate-limited, the API returns 429 Too Many Requests with the standard error body. Some 429 responses (notably /v1/browser-connect when the limit on concurrent CDP sessions is exceeded) include a Retry-After header indicating the number of seconds to wait:
HTTP/1.1 429 Too Many Requests
Retry-After: 5
Content-Type: application/json

{"error": "rate_limit_exceeded", "message": "Too many requests. Please try again later."}

Heads up. AnakinScraper does not currently emit the optional X-RateLimit-Limit, X-RateLimit-Remaining, or X-RateLimit-Reset headers. Don't rely on them: drive your retry loop off Retry-After (when present) or your own backoff. Surfacing these headers is on our roadmap.
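
The value shown above is a number of seconds, but the HTTP specification also allows `Retry-After` to carry an HTTP-date, and proxies in between may rewrite it. A defensive parser can accept both forms. This is a sketch using only the standard library; `retry_after_seconds` is a hypothetical helper, and the date branch is a precaution rather than documented AnakinScraper behavior.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime
from typing import Optional


def retry_after_seconds(value: Optional[str], now: Optional[datetime] = None) -> Optional[float]:
    """Return the wait in seconds, or None if the header is absent or unparseable."""
    if not value:
        return None
    value = value.strip()
    if value.isdigit():
        return float(value)  # delay-seconds form, e.g. "5"
    try:
        when = parsedate_to_datetime(value)  # HTTP-date form
    except (TypeError, ValueError):
        return None
    if when.tzinfo is None:
        when = when.replace(tzinfo=timezone.utc)
    now = now or datetime.now(timezone.utc)
    return max(0.0, (when - now).total_seconds())
```

When the function returns `None`, fall back to your own backoff.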
Reading Retry-After
The request_with_retry and requestWithRetry helpers above already honor Retry-After. If you only need to handle 429 specifically:
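import time
import requests


def submit():
    """Send the scrape request; defined once so the retry resends the same payload."""
    return requests.post(
        "https://api.anakin.io/v1/url-scraper",
        headers={"X-API-Key": "ak-your-key-here"},
        json={"url": "https://example.com"},
    )


resp = submit()
if resp.status_code == 429:
    wait = int(resp.headers.get("Retry-After", "5"))  # default to 5s if absent
    time.sleep(wait)
    resp = submit()  # retry

// JavaScript
const submit = () =>
  fetch("https://api.anakin.io/v1/url-scraper", {
    method: "POST",
    headers: { "X-API-Key": "ak-your-key-here", "Content-Type": "application/json" },
    body: JSON.stringify({ url: "https://example.com" }),
  });

let res = await submit();
if (res.status === 429) {
  const wait = Number(res.headers.get("Retry-After") ?? 5);
  await new Promise((r) => setTimeout(r, wait * 1000));
  res = await submit(); // retry
}

Troubleshooting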
The following scenarios cover the failures we see most often in support tickets.
"I'm getting 403 from the target site"
The site has bot detection. Two levers, in order of effectiveness:
- Switch to the browser handler. Add "useBrowser": true to your request. This routes through Camoufox (Firefox-based, fingerprint-masked) instead of plain HTTP.
- Set a country. Add "country": "US" (or another ISO code). The proxy bandit will pick a residential IP from that region.
If both fail, the site likely requires a logged-in session. Use Browser Sessions to capture cookies once, then attach the session by ID.
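
The two levers can be tried in sequence, starting with the cheapest request. The sketch below only builds the request bodies; sending them and deciding when to escalate is left to the caller. `escalation_payloads` is a hypothetical helper, and the session step is omitted because the field used to attach a session is not shown on this page.

```python
from typing import Iterator


def escalation_payloads(url: str, country: str = "US") -> Iterator[dict]:
    """Yield request bodies from the lightest option to the heaviest."""
    yield {"url": url}                                          # plain HTTP
    yield {"url": url, "useBrowser": True}                      # browser handler
    yield {"url": url, "useBrowser": True, "country": country}  # plus regional proxy
```

A caller would submit each body in turn and stop at the first job that does not fail with an anti-bot error.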
"Timeouts on a single-page app"
Plain HTTP can't run JavaScript. Set "useBrowser": true so the scraper executes the page's JS before extracting content. For very slow SPAs, also set "waitForSelector" or increase "waitMs" if your endpoint supports them.
"Schema extraction returns the wrong fields"
Agentic Search and JSON extraction are LLM-driven — better prompts and tighter schemas produce better output:
- Be explicit in the prompt: name each field and describe its expected shape (e.g., "extract price as a number in USD, no currency symbol").
- Provide examples in the prompt for ambiguous fields.
- Tighten the schema. Required JSON Schema fields force the model to produce them; optional fields tend to get omitted.
- Cap the schema at 50KB. Larger schemas are rejected with invalid_request.
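
To make the advice above concrete, here is a small example of a tight schema along with a size check against the documented limits. The product fields are invented for illustration, `check_limits` is a hypothetical helper, and reading "8KB" and "50KB" as multiples of 1024 bytes is an assumption.

```python
import json

MAX_PROMPT_BYTES = 8 * 1024   # assumption: "8KB" means 8 * 1024 bytes
MAX_SCHEMA_BYTES = 50 * 1024  # assumption: "50KB" means 50 * 1024 bytes

# Illustrative schema: every field is described and marked required.
product_schema = {
    "type": "object",
    "properties": {
        "name": {"type": "string", "description": "Product title as shown on the page"},
        "price": {"type": "number", "description": "Price in USD, no currency symbol"},
    },
    "required": ["name", "price"],
    "additionalProperties": False,
}


def check_limits(prompt: str, schema: dict) -> list[str]:
    """Return the limits this prompt/schema pair would exceed (empty if none)."""
    problems = []
    if len(prompt.encode("utf-8")) > MAX_PROMPT_BYTES:
        problems.append("prompt exceeds 8KB")
    if len(json.dumps(schema).encode("utf-8")) > MAX_SCHEMA_BYTES:
        problems.append("schema exceeds 50KB")
    return problems
```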
"Job stuck in pending"
A few possibilities:
- You're polling the wrong endpoint. POST /v1/url-scraper returns a jobId you poll at GET /v1/url-scraper/{id}. The list is in Polling Jobs.
- Worker fleet is saturated. Pending → processing usually takes <5s. If it's been >60s, retry or contact support.
- The job died silently. Stale jobs are auto-marked failed after 1 hour. If you see this, check the error field for the cause.
"402 insufficient_credits when I just topped up"
Credits are deducted on completion, but checked upfront. If you submitted a batch of 10 URLs and have 8 credits, the batch is rejected immediately even though some URLs would have come from cache (which costs 0). Top up enough for the worst case.
"Got a 401 with a brand-new key"
API keys take a few seconds to propagate. If a freshly-created key returns 401, wait 5–10 seconds and retry. If it persists, regenerate the key in the dashboard.
"Different services return slightly different error shapes"
A small number of older endpoints (notably /v1/browser-connect, /v1/ai/evaluate, and some scraper-management routes) use minor variations on the canonical format. The HTTP status is always authoritative; the body always contains a string error field. Plan your error handling around the status code first, then the error code.
"Wire job returned a 429 even though I've only made a few requests"
GET /v1/holocron/jobs/{id} is capped at 60/min per user — unlike URL Scraper, which is unlimited. If you're polling many Wire jobs in parallel, stagger them or reduce poll frequency. See Rate Limits for the per-endpoint table.
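
The cap is shared across all of a user's polls, so the safe per-job interval grows with the number of jobs polled in parallel. A back-of-the-envelope helper (hypothetical name, simple arithmetic on the 60/min figure above):

```python
def min_poll_interval(parallel_jobs: int, limit_per_minute: int = 60) -> float:
    """Seconds each job should wait between polls to stay within the shared cap."""
    if parallel_jobs < 1:
        raise ValueError("parallel_jobs must be at least 1")
    return 60.0 * parallel_jobs / limit_per_minute
```

For example, ten Wire jobs polled in parallel should each poll no more often than once every 10 seconds; leaving some headroom above that is sensible.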
"Browser Connect closed unexpectedly"
The CDP proxy returns 429 once a single API instance has 50 concurrent CDP sessions. Pool clients across multiple instances, close sessions when done, and wait for the Retry-After interval before reconnecting. If you saved a session, it must finish uploading to S3 before another connection can attach to it (otherwise you'll see session_not_saved).
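
On the client side, a semaphore is one way to keep the number of simultaneously open sessions bounded. The sketch below is generic asyncio code, not an AnakinScraper client: `run_capped` is a hypothetical helper, and each task is assumed to be a zero-argument coroutine function that opens, uses, and closes one session. Because the 50-session cap applies per API instance rather than per client, a lower client-side limit may be appropriate.

```python
import asyncio

MAX_CDP_SESSIONS = 50  # per-instance cap quoted above


async def run_capped(tasks, limit: int = MAX_CDP_SESSIONS):
    """Run coroutine factories with at most `limit` in flight at once."""
    gate = asyncio.Semaphore(limit)

    async def guarded(factory):
        async with gate:  # released even if the task raises
            return await factory()

    # gather preserves the input order in its result list
    return await asyncio.gather(*(guarded(f) for f in tasks))
```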
Reporting unexpected errors
If you hit a 500, an unfamiliar error code, or behavior that contradicts this page:
- Capture the request: method, URL, headers (redact the API key), body.
- Capture the response: status, headers, body.
- Note the time (UTC, to the second) — this lets us correlate against server logs.
- Email support@anakin.io with the above. For Enterprise customers, see your dedicated channel.
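
When capturing headers for a report, the API key can appear under any of the accepted header names, so redaction should cover all of them. A minimal sketch; `redact_headers` is a hypothetical helper, and the name list mirrors the accepted headers documented earlier on this page, compared case-insensitively.

```python
# Lowercased names of every header that may carry the API key.
SENSITIVE = {
    "x-api-key", "api-key", "x-access-key", "access-key",
    "apikey", "api_key", "authorization",
}


def redact_headers(headers: dict) -> dict:
    """Copy headers, masking any that can carry the API key."""
    return {
        name: "[REDACTED]" if name.lower() in SENSITIVE else value
        for name, value in headers.items()
    }
```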