Error Responses

Complete reference for HTTP status codes, error codes, retry strategy, and troubleshooting

Every AnakinScraper endpoint returns errors in a consistent JSON shape. This page is the canonical reference — what we return, what each code means, when to retry, and how to diagnose the most common failures.

For per-endpoint rate limits and the recommended request pacing, see Rate Limits.

Error response format

All errors return a JSON body with two fields:

{
  "error": "rate_limit_exceeded",
  "message": "Too many requests. Please try again later."
}

Field	Type	Description
`error`	string	A short, machine-readable code. Stable across releases — safe to switch on in code.
`message`	string	A human-readable explanation. Subject to copy changes — log it but don't parse it.
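As a minimal sketch of consuming this shape, the helper below branches on the stable `error` code and passes `message` through for logging only. The function name `describe_error` and the action strings are illustrative assumptions, not part of the API.

```python
import json

# Hypothetical mapping from documented error codes to a client-side action.
ACTIONS = {
    "rate_limit_exceeded": "retry_later",
    "insufficient_credits": "top_up",
    "unauthorized": "check_api_key",
    "invalid_request": "fix_request",
}


def describe_error(body_text: str) -> tuple[str, str]:
    """Return (action, message) for an error body without parsing `message`."""
    try:
        body = json.loads(body_text)
    except json.JSONDecodeError:
        return "unknown", body_text
    if not isinstance(body, dict):
        return "unknown", body_text
    code = body.get("error", "")
    message = body.get("message", "")  # keep for logs only
    return ACTIONS.get(code, "unknown"), message
```

Unknown codes fall through to `"unknown"` so that a newly added code does not crash the client.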

Async jobs are different. When a polled job ends in status: "failed", the failure shape lives inside the job response (error field, free-form string). See Async job failures below.

HTTP status codes

Status	Meaning	Retry?
`200`	Success — synchronous endpoint returned a result, or polled job is complete.	—
`202`	Accepted — async job was queued, or polled job is still in progress.	—
`400`	Bad Request — your request body or parameters are invalid.	No — fix the request.
`401`	Unauthorized — API key missing, malformed, or revoked.	No — check the key.
`402`	Payment Required — insufficient credits for the operation.	No — top up credits.
`403`	Forbidden — the resource exists but you don't own it.	No.
`404`	Not Found — the job ID, session, or scraper doesn't exist.	No.
`409`	Conflict — resource is locked or already exists (e.g., session in use, duplicate name).	After resolving the conflict.
`422`	Unprocessable Entity — request was valid but the resource isn't ready (e.g., session not yet saved).	After the prerequisite is met.
`429`	Too Many Requests — rate limit exceeded.	Yes — see Rate limit handling.
`500`	Internal Server Error — unexpected failure on our side.	Yes — exponential backoff.
`502`	Bad Gateway — a downstream service (browser, CDP proxy) is unreachable.	Yes — exponential backoff.
`503`	Service Unavailable — a required service is temporarily down or unconfigured.	Yes — wait 30–60s.
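One way to encode the Retry? column in client code is a small lookup. This is a sketch only; the `Retry` enum and `retry_policy` function are hypothetical names introduced for illustration.

```python
from enum import Enum


class Retry(Enum):
    NO = "no"                # fix the request, key, or balance first
    AFTER_FIX = "after_fix"  # 409/422: retry once the conflict or prerequisite is resolved
    BACKOFF = "backoff"      # 429/500/502: back off (honor Retry-After on 429 if present)
    WAIT = "wait_30_60s"     # 503: wait 30-60 seconds


def retry_policy(status: int) -> Retry:
    """Map an HTTP status from the table above to a retry decision."""
    if status in (409, 422):
        return Retry.AFTER_FIX
    if status in (429, 500, 502):
        return Retry.BACKOFF
    if status == 503:
        return Retry.WAIT
    return Retry.NO  # 2xx needs no retry; other 4xx will fail again unchanged
```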

Error code catalog

The error field uses a fixed set of codes. The tables below cover every code returned by the public API.

Validation & input

Code	HTTP	When you'll see it	What to do
`invalid_request`	400	Body is not valid JSON, a required field is missing, or a value is out of range (e.g., `depth > 5`, batch with 0 or >10 URLs, prompt >8KB, schema >50KB).	Inspect `message` for the specific field, fix the request, and resubmit.
`invalid_url`	400	A URL in a batch request is malformed.	Fix the URL. The `message` indicates the index.
`invalid_job_type`	400	The `job_type` field on `POST /v1/request` doesn't match a registered handler.	Use a supported value (`url_scraper`, `crawl`, `map`, `agentic_search`, `search`, `web_scraper`).
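Several of these limits can be checked before the request is sent. The sketch below covers the batch-size, URL, and prompt-size rules from the table; the function name `validate_batch` is hypothetical, and it assumes "8KB" means 8 * 1024 bytes of UTF-8, which this page does not specify.

```python
from urllib.parse import urlparse

MAX_BATCH_URLS = 10
MAX_PROMPT_BYTES = 8 * 1024  # assumption: "8KB" = 8 * 1024 bytes


def _is_http_url(url: str) -> bool:
    try:
        parsed = urlparse(url)
    except ValueError:
        return False
    return parsed.scheme in ("http", "https") and bool(parsed.netloc)


def validate_batch(urls: list[str], prompt: str = "") -> list[str]:
    """Return a list of problems; an empty list means the request looks valid."""
    problems = []
    if not 1 <= len(urls) <= MAX_BATCH_URLS:
        problems.append(f"batch must contain 1-{MAX_BATCH_URLS} URLs, got {len(urls)}")
    for index, url in enumerate(urls):
        if not _is_http_url(url):
            problems.append(f"urls[{index}] is not a valid http(s) URL: {url!r}")
    if len(prompt.encode("utf-8")) > MAX_PROMPT_BYTES:
        problems.append("prompt exceeds 8KB")
    return problems
```

The server remains the source of truth; a client-side check like this only saves a round trip for obviously bad input.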

Auth & authorization

Code	HTTP	When you'll see it	What to do
`unauthorized`	401	No API key was sent, or the key is malformed, revoked, or belongs to a deleted user.	Send a valid key in `X-API-Key` (or one of the accepted header variants). Generate a new key in the dashboard if needed.
`forbidden`	403	The resource (job, session, scraper) exists but belongs to a different user.	Use a job, session, or scraper ID from your own account.

Credits

Code	HTTP	When you'll see it	What to do
`insufficient_credits`	402	Account balance is below the cost of the operation. The `message` includes the cost and your current balance.	Top up credits in Billing or upgrade your plan.

Rate limiting

Code	HTTP	When you'll see it	What to do
`rate_limit_exceeded`	429	You exceeded the per-endpoint rate limit.	See Rate limit handling.

Resource state

Code	HTTP	When you'll see it	What to do
`not_found`	404	Job ID, session ID, or scraper ID doesn't exist.	Verify the ID. Job IDs are valid for 30 days.
`session_not_saved`	422	You tried to attach a saved browser session before its storage state was uploaded.	Run the manual save flow first (see Browser Sessions).
`session_in_use`	409	A saved session is already attached to an active automation.	Wait for the other run to finish, or use a different session.
`duplicate_name`	409	A session name is already taken for this user.	Use a unique name.

Server-side

Code	HTTP	When you'll see it	What to do
`server_error`	500	Unhandled error in our handler — usually a database or internal service issue.	Retry with backoff. If it persists, contact support with the request ID.
`queue_error`	500	Failed to enqueue the job (SQS unavailable or misconfigured).	Retry with backoff.
`configuration_error`	500	A required service-side config is missing for this endpoint.	Retry; if persistent, contact support.
`internal_error`	500	Generic catch-all for unexpected failures.	Retry with backoff.
`search_error`	500	Upstream search provider (Perplexity) returned an error.	Retry with backoff; reword the prompt if persistent.
`service_unavailable`	503	A dependent service (browser AI, CDP proxy, scraper generator) is offline.	Retry after 30–60 seconds.

Note on format consistency. A small number of older endpoints — /v1/browser-connect, /v1/ai/evaluate, and a few scraper-management routes — currently return errors using slight variations of the format above (e.g., omitting message, or using a Fiber default shape {"statusCode": 400, "message": "..."} for validation errors). Treat them as still conforming to the principle: a string error field is always present, and the HTTP status is authoritative.

Accepted API key headers

The API accepts the key under any of the following headers (and a few query params for WebSocket endpoints), in priority order:

X-API-Key, X-Api-Key, Api-Key, API-Key, X-Access-Key, Access-Key, apikey, api_key, Authorization (with a Bearer, API-Key, or ApiKey prefix, or the raw key).

For /v1/browser-connect (WebSocket): ?api_key=, ?apikey=, or ?token= query parameters also work.
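For illustration, the snippet below builds the highest-priority header for REST calls and a query-string URL for the WebSocket endpoint. The `wss://` scheme and the `api.anakin.io` host for Browser Connect are assumptions; check the Browser Connect page for the actual connection URL.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

API_KEY = "ak-your-key-here"

# REST: send the key in the first header of the priority list.
rest_headers = {"X-API-Key": API_KEY}

# WebSocket: pass the key as a query parameter instead of a header.
# Assumption: the endpoint is reachable at wss://api.anakin.io.
ws_url = "wss://api.anakin.io/v1/browser-connect?" + urlencode({"api_key": API_KEY})

# Round-trip check that the key survives URL encoding.
assert parse_qs(urlsplit(ws_url).query)["api_key"] == [API_KEY]
```

Query parameters tend to end up in access logs, so prefer a header wherever the client allows one.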

Async job failures

For async endpoints (/v1/url-scraper, /v1/agentic-search, /v1/map, /v1/crawl, Wire's /v1/holocron/task), HTTP status 200/202 only confirms that polling is working. The actual outcome lives in the status field of the job response:

`status`	Meaning
`pending`	Queued, not yet picked up by a worker.
`processing`	A worker is actively running the job.
`completed`	Finished successfully — results are in the response.
`failed`	The job ran but could not produce a result. See `error`.
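The status values above can drive a simple polling loop. This is a sketch under stated assumptions: `poll_job`, its 2-second interval, and its 300-second timeout are illustrative choices, and `fetch_job` stands for any callable that returns the parsed job response (for example, a wrapper around a GET to the job's polling endpoint).

```python
import time
from typing import Callable

TERMINAL = {"completed", "failed"}


def poll_job(
    fetch_job: Callable[[], dict],
    *,
    interval: float = 2.0,
    timeout: float = 300.0,
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Call fetch_job() until the job is completed or failed, or the timeout elapses."""
    deadline = time.monotonic() + timeout
    while True:
        job = fetch_job()
        if job.get("status") in TERMINAL:
            return job  # caller must still check for "failed"
        if time.monotonic() >= deadline:
            raise TimeoutError(f"job still {job.get('status')!r} after {timeout}s")
        sleep(interval)
```

Returning the failed job rather than raising keeps the free-form `error` string available to the caller.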

A failed job response looks like this:

{
  "id": "job-abc123",
  "status": "failed",
  "error": "Blocked by website (HTTP 403)",
  "createdAt": "2025-04-30T18:12:04Z",
  "completedAt": "2025-04-30T18:12:34Z",
  "durationMs": 30000
}

The error field is a free-form, human-readable string. Common substrings to switch on if you must:

Substring	Cause	Suggested fix
`Blocked by website`, `HTTP 403`, `HTTP 429`, `bot detection`, `CAPTCHA`	Anti-bot protection.	Set `useBrowser: true` and/or specify a `country`. Try a browser session for sites that require login.
`Connection timeout`, `timeout`	Page didn't finish loading in time.	For SPAs, set `useBrowser: true` and increase wait time.
`DNS resolution failed`, `no such host`	The domain can't be resolved.	Verify the URL is reachable.
`TLS`, `SSL`	Certificate validation failure.	Confirm the target uses a trusted certificate.
`Invalid URL`	Malformed URL passed all the way through.	Pre-validate URLs client-side.
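If you do need to branch on these strings, a case-insensitive substring match keeps the logic in one place. The sketch below uses the substrings from the table; the category labels and the `classify_failure` name are hypothetical, and because the field is free-form the match is best-effort.

```python
# Checked in order; the first matching category wins.
FAILURE_PATTERNS = [
    ("anti_bot", ("blocked by website", "http 403", "http 429", "bot detection", "captcha")),
    ("dns", ("dns resolution failed", "no such host")),
    ("tls", ("tls", "ssl")),
    ("invalid_url", ("invalid url",)),
    ("timeout", ("timeout",)),
]


def classify_failure(error: str) -> str:
    """Map a job's free-form error string to a coarse category."""
    text = error.lower()
    for label, needles in FAILURE_PATTERNS:
        if any(needle in text for needle in needles):
            return label
    return "unknown"
```

Always keep an `"unknown"` branch: wording can change between releases, so treat an unmatched string as a generic failure instead of an exception.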

Batch jobs. A batch URL scraper job is completed if any child finishes — partial failures don't fail the parent. Iterate results[] and check each child's status and error.
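A sketch of that iteration is below. It relies only on the `results`, `status`, and `error` fields named above; the `url` field in the sample data and the `split_batch` helper are assumptions for illustration.

```python
def split_batch(job: dict) -> tuple[list[dict], list[dict]]:
    """Partition a batch job's children into (succeeded, failed)."""
    succeeded, failed = [], []
    for child in job.get("results", []):
        # Anything other than "completed" is treated as a failure here.
        (succeeded if child.get("status") == "completed" else failed).append(child)
    return succeeded, failed


# Hypothetical parent job: completed overall, with one failed child.
job = {
    "status": "completed",
    "results": [
        {"url": "https://example.com/a", "status": "completed"},
        {"url": "https://example.com/b", "status": "failed", "error": "Connection timeout"},
    ],
}
ok, bad = split_batch(job)
for child in bad:
    print(child.get("url"), "->", child.get("error"))
```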

Retry guidance

When to retry

Status	Retry?	Why
`400`, `401`, `402`, `403`, `404`, `409`, `422`	No	The request itself is the problem. Retrying it unchanged will return the same error; for `409` and `422`, resubmit only after the conflict or prerequisite is resolved.
`429`	Yes	Transient — the bucket refills. Read `Retry-After` if present, otherwise back off.
`500`, `502`, `503`	Yes	Transient server-side issue. Cap retries at 3–5 and use exponential backoff with jitter.
Network errors (no response)	Yes	Treat the same as `5xx`.

Recommended pattern: exponential backoff with jitter

Jitter spreads retries from many clients so a thundering herd doesn't synchronize. Cap the total wait so a stuck worker fails fast instead of looping forever.

import random
import time
import requests

RETRYABLE = {429, 500, 502, 503}
MAX_ATTEMPTS = 5
BASE_DELAY = 1.0  # seconds
MAX_DELAY = 30.0

def request_with_retry(method, url, *, headers=None, json=None):
    """POST/GET with exponential backoff + jitter on retryable failures."""
    for attempt in range(MAX_ATTEMPTS):
        try:
            response = requests.request(method, url, headers=headers, json=json, timeout=30)
        except requests.RequestException:
            if attempt == MAX_ATTEMPTS - 1:
                raise
            time.sleep(_backoff(attempt))
            continue

        if response.status_code not in RETRYABLE:
            return response

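        # Honor server-sent Retry-After when it is a plain number of seconds;
        # if it is missing or non-numeric, fall back to backoff
        retry_after = response.headers.get("Retry-After", "").strip()
        delay = float(retry_after) if retry_after.isdigit() else _backoff(attempt)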

        if attempt == MAX_ATTEMPTS - 1:
            return response  # caller decides what to do

        time.sleep(delay)

    return response


def _backoff(attempt: int) -> float:
    """Full-jitter exponential backoff, capped at MAX_DELAY."""
    cap = min(MAX_DELAY, BASE_DELAY * (2 ** attempt))
    return random.uniform(0, cap)


# Usage
resp = request_with_retry(
    "POST",
    "https://api.anakin.io/v1/url-scraper",
    headers={"X-API-Key": "ak-your-key-here"},
    json={"url": "https://example.com"},
)
resp.raise_for_status()
print(resp.json()["jobId"])

const RETRYABLE = new Set([429, 500, 502, 503]);
const MAX_ATTEMPTS = 5;
const BASE_DELAY = 1000;   // ms
const MAX_DELAY = 30_000;

async function requestWithRetry(url, init = {}) {
  for (let attempt = 0; attempt < MAX_ATTEMPTS; attempt++) {
    let response;
    try {
      response = await fetch(url, init);
    } catch (err) {
      if (attempt === MAX_ATTEMPTS - 1) throw err;
      await sleep(backoff(attempt));
      continue;
    }

    if (!RETRYABLE.has(response.status)) return response;

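    // Honor server-sent Retry-After when it is a plain number of seconds;
    // if it is missing or non-numeric, fall back to backoff
    const retryAfter = Number(response.headers.get("Retry-After") ?? NaN);
    const delay = Number.isFinite(retryAfter) ? retryAfter * 1000 : backoff(attempt);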

    if (attempt === MAX_ATTEMPTS - 1) return response;
    await sleep(delay);
  }
}

function backoff(attempt) {
  const cap = Math.min(MAX_DELAY, BASE_DELAY * 2 ** attempt);
  return Math.random() * cap;  // full jitter
}

const sleep = (ms) => new Promise((r) => setTimeout(r, ms));

// Usage
const res = await requestWithRetry("https://api.anakin.io/v1/url-scraper", {
  method: "POST",
  headers: {
    "X-API-Key": "ak-your-key-here",
    "Content-Type": "application/json",
  },
  body: JSON.stringify({ url: "https://example.com" }),
});
const { jobId } = await res.json();
console.log(jobId);

Rate limit handling

Per-endpoint limits are documented on the Rate Limits page. The short version: most submit endpoints allow 60 requests/min per user; AI evaluation is 10/min; most GET polling endpoints are not rate-limited (Wire's GET /v1/holocron/jobs/{id} is an exception, at 60/min per user).

Response headers

When a request is rate-limited, the API returns 429 Too Many Requests with the standard error body. Some 429 responses (notably from /v1/browser-connect when the limit on concurrent CDP sessions is exceeded) include a Retry-After header indicating how many seconds to wait:

HTTP/1.1 429 Too Many Requests
Retry-After: 5
Content-Type: application/json

{"error": "rate_limit_exceeded", "message": "Too many requests. Please try again later."}

Heads up. AnakinScraper does not currently emit the optional X-RateLimit-Limit, X-RateLimit-Remaining, or X-RateLimit-Reset headers. Don't rely on them — drive your retry loop off Retry-After (when present) or your own backoff. Surfacing these headers is on our roadmap.
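The example above shows Retry-After as a number of seconds. The HTTP specification also allows an HTTP-date in this header; whether AnakinScraper ever sends that form is not stated here, so the sketch below accepts both and falls back to a default. The `retry_after_seconds` name and the 5-second default are assumptions.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def retry_after_seconds(value, default=5.0, now=None):
    """Parse Retry-After as delta-seconds or an HTTP-date; fall back to default."""
    if not value:
        return default
    value = value.strip()
    if value.isdigit():
        return float(value)
    try:
        when = parsedate_to_datetime(value)
    except (TypeError, ValueError):
        return default  # unparseable header: use the default wait
    if when.tzinfo is None:
        when = when.replace(tzinfo=timezone.utc)
    now = now or datetime.now(timezone.utc)
    return max(0.0, (when - now).total_seconds())
```

A date in the past yields 0 rather than a negative sleep.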

Reading `Retry-After`

The request_with_retry (Python) and requestWithRetry (JavaScript) helpers above already honor Retry-After. If you only need to handle 429 specifically:

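import time
import requests

URL = "https://api.anakin.io/v1/url-scraper"
HEADERS = {"X-API-Key": "ak-your-key-here"}
PAYLOAD = {"url": "https://example.com"}

resp = requests.post(URL, headers=HEADERS, json=PAYLOAD, timeout=30)

if resp.status_code == 429:
    # Retry-After is in seconds; default to 5 if the header is absent
    wait = int(resp.headers.get("Retry-After", "5"))
    time.sleep(wait)
    resp = requests.post(URL, headers=HEADERS, json=PAYLOAD, timeout=30)  # retry once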

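const url = "https://api.anakin.io/v1/url-scraper";
const init = {
  method: "POST",
  headers: { "X-API-Key": "ak-your-key-here", "Content-Type": "application/json" },
  body: JSON.stringify({ url: "https://example.com" }),
};

let res = await fetch(url, init);

if (res.status === 429) {
  const wait = Number(res.headers.get("Retry-After") ?? 5);
  await new Promise((r) => setTimeout(r, wait * 1000));
  res = await fetch(url, init);  // retry once
}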

Troubleshooting

The following scenarios cover the failures we see most often in support tickets.

"I'm getting 403 from the target site"

The site has bot detection. Two levers, in order of effectiveness:

Switch to the browser handler. Add "useBrowser": true to your request — this routes through Camoufox (Firefox-based, fingerprint-masked) instead of plain HTTP.
Set a country. Add "country": "US" (or another ISO code) — the proxy bandit will pick a residential IP from that region.

If both fail, the site likely requires a logged-in session. Use Browser Sessions to capture cookies once, then attach the session by ID.

"Timeouts on a single-page app"

Plain HTTP can't run JavaScript. Set "useBrowser": true so the scraper executes the page's JS before extracting content. For very slow SPAs, also set "waitForSelector" or increase "waitMs" if your endpoint supports them.

"Schema extraction returns the wrong fields"

Agentic Search and JSON extraction are LLM-driven — better prompts and tighter schemas produce better output:

Be explicit in the prompt: name each field and describe its expected shape (e.g., "extract price as a number in USD, no currency symbol").
Provide examples in the prompt for ambiguous fields.
Tighten the schema. Required JSON Schema fields force the model to produce them; optional fields tend to get omitted.
Cap the schema at 50KB. Larger schemas are rejected with invalid_request.
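As an illustration of these tips, the sketch below defines a tight schema in which every field is required and described, then checks its serialized size. The field names are invented for the example, and it assumes "50KB" means 50 * 1024 bytes of UTF-8 JSON, which this page does not specify.

```python
import json

MAX_SCHEMA_BYTES = 50 * 1024  # assumption: "50KB" = 50 * 1024 bytes

# Hypothetical product schema: every property is required and described.
schema = {
    "type": "object",
    "properties": {
        "title": {"type": "string", "description": "Product name as shown on the page"},
        "price": {"type": "number", "description": "Price in USD, no currency symbol"},
    },
    "required": ["title", "price"],
    "additionalProperties": False,
}

schema_bytes = len(json.dumps(schema).encode("utf-8"))
if schema_bytes > MAX_SCHEMA_BYTES:
    raise ValueError(f"schema is {schema_bytes} bytes; limit is {MAX_SCHEMA_BYTES}")
```

Putting the shape hint in each field's `description` keeps it next to the field it applies to, in addition to whatever the prompt says.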

"Job stuck in pending"

A few possibilities:

You're polling the wrong endpoint. POST /v1/url-scraper returns a jobId you poll at GET /v1/url-scraper/{id}. The list is in Polling Jobs.
Worker fleet is saturated. Pending → processing usually takes <5s. If it's been >60s, retry or contact support.
The job died silently. Stale jobs are auto-marked failed after 1 hour. If you see this, check the error field for the cause.

"402 insufficient_credits when I just topped up"

Credits are deducted on completion, but checked upfront. If you submitted a batch of 10 URLs and have 8 credits, the batch is rejected immediately even though some URLs would have come from cache (which costs 0). Top up enough for the worst case.
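The upfront check can be mirrored client-side with simple arithmetic. This sketch assumes a flat cost of 1 credit per URL, as the example above implies; actual per-operation costs are not listed on this page, and `can_submit_batch` is a hypothetical helper.

```python
def can_submit_batch(url_count: int, balance: int, cost_per_url: int = 1) -> bool:
    """Mirror the upfront check: the balance must cover every URL at full price."""
    worst_case_cost = url_count * cost_per_url  # cache hits are not discounted upfront
    return balance >= worst_case_cost


# The scenario above: 10 URLs with 8 credits is rejected.
print(can_submit_batch(url_count=10, balance=8))   # False
print(can_submit_batch(url_count=10, balance=10))  # True
```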

"Got a 401 with a brand-new key"

API keys take a few seconds to propagate. If a freshly-created key returns 401, wait 5–10 seconds and retry. If it persists, regenerate the key in the dashboard.

"Different services return slightly different error shapes"

A small number of older endpoints (notably /v1/browser-connect, /v1/ai/evaluate, and some scraper-management routes) use minor variations on the canonical format. The HTTP status is always authoritative; the body always contains a string error field. Plan your error handling around the status code first, then the error code.

"Wire job returned a `429` even though I've only made a few requests"

GET /v1/holocron/jobs/{id} is capped at 60/min per user, unlike URL Scraper polling, which is unlimited. If you're polling many Wire jobs in parallel, stagger them or reduce poll frequency. See Rate Limits for the per-endpoint table.
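To stay under a 60/min cap while polling N jobs, each job can be polled at most once every N seconds. The sketch below computes that interval and evenly spaced start offsets; both function names are hypothetical.

```python
def min_poll_interval(job_count: int, limit_per_min: int = 60) -> float:
    """Smallest per-job poll interval (seconds) that keeps total polls within the cap."""
    if job_count <= 0:
        return 0.0
    return 60.0 * job_count / limit_per_min


def stagger_offsets(job_count: int, limit_per_min: int = 60) -> list[float]:
    """Start delays (seconds) that spread the jobs evenly across one interval."""
    interval = min_poll_interval(job_count, limit_per_min)
    return [i * interval / job_count for i in range(job_count)]
```

For example, 20 parallel jobs should each be polled no more than once every 20 seconds, with start times spaced one second apart. Leave some headroom in practice, since other requests from the same user may count toward the same limit.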

"Browser Connect closed unexpectedly"

The CDP proxy returns 429 once a single API instance has 50 concurrent CDP sessions. Pool clients across multiple instances, close sessions when done, and wait for the Retry-After interval before reconnecting. If you saved a session, it must finish uploading to S3 before another connection can attach to it (otherwise you'll see session_not_saved).

Reporting unexpected errors

If you hit a 500, an unfamiliar error code, or behavior that contradicts this page:

Capture the request: method, URL, headers (redact the API key), body.
Capture the response: status, headers, body.
Note the time (UTC, to the second) — this lets us correlate against server logs.
Email support@anakin.io with the above. For Enterprise customers, see your dedicated channel.
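The checklist above can be captured with a small helper that redacts key-bearing headers and stamps the time in UTC. This is a sketch only: `build_report` and its field names are hypothetical, and the redaction list is derived from the accepted header names on this page.

```python
from datetime import datetime, timezone

# Lowercased names of headers that may carry the API key.
SENSITIVE_HEADERS = {
    "x-api-key", "api-key", "x-access-key", "access-key",
    "apikey", "api_key", "authorization",
}


def redact_headers(headers: dict) -> dict:
    """Replace the value of any key-bearing header with a placeholder."""
    return {
        name: "[REDACTED]" if name.lower() in SENSITIVE_HEADERS else value
        for name, value in headers.items()
    }


def build_report(method, url, request_headers, request_body,
                 status, response_headers, response_body) -> dict:
    """Bundle the request, response, and a UTC timestamp for a support email."""
    return {
        "observed_at_utc": datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ"),
        "request": {"method": method, "url": url,
                    "headers": redact_headers(request_headers), "body": request_body},
        "response": {"status": status, "headers": dict(response_headers),
                     "body": response_body},
    }
```

If the key was sent as a query parameter, strip it from the URL as well before sharing.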