Submit Crawl Job

Submit a multi-page crawl job.

POST https://api.anakin.io/v1/crawl

Submit a website for multi-page crawling. The job discovers URLs starting from the given page and scrapes each one. Use the returned jobId to poll for status and results.


Request Body

{
  "url": "https://example.com",
  "maxPages": 10,
  "includePatterns": ["/blog/*"],
  "excludePatterns": ["/admin/*"],
  "country": "us",
  "useBrowser": false
}
Parameters

url (string, required): The starting URL to crawl from. Must be a valid HTTP/HTTPS URL.
maxPages (number): Maximum number of pages to crawl. Default 10, maximum 100.
includePatterns (string[]): URL glob patterns to include. Only URLs matching at least one pattern are crawled.
excludePatterns (string[]): URL glob patterns to exclude. URLs matching any pattern are skipped.
country (string): Country code for proxy routing. Default "us". See Supported Countries.
useBrowser (boolean): Use headless Chrome for rendering. Default false. Best for JS-heavy sites.
sessionId (string): Browser session ID for authenticated crawling. See Browser Sessions.
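The include/exclude behavior described above can be pictured with a small client-side sketch. This is only an illustration using Python's fnmatch globbing; the API's server-side pattern semantics may differ, and the should_crawl helper is not part of the API:

```python
from fnmatch import fnmatch
from urllib.parse import urlparse

def should_crawl(url, include=None, exclude=None):
    """Illustrative filter: a URL's path must match no exclude pattern,
    and at least one include pattern if any include patterns are given."""
    path = urlparse(url).path
    if exclude and any(fnmatch(path, pat) for pat in exclude):
        return False  # excludes win over includes
    if include:
        return any(fnmatch(path, pat) for pat in include)
    return True  # no include patterns: everything not excluded is crawled
```

For example, with includePatterns ["/blog/*"], a URL like https://example.com/blog/post-1 is crawled, while https://example.com/about is skipped.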

Response

202 Accepted
{
  "jobId": "job_abc123xyz",
  "status": "pending"
}

The job is processed asynchronously. Poll GET /v1/crawl/{id} with the returned jobId to check status and retrieve results once the crawl completes.
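A typical client polls GET /v1/crawl/{id} until the job leaves the "pending" state. A minimal stdlib sketch of that loop follows; note that the terminal status values beyond "pending" ("running", "completed", "failed") are assumptions here, not confirmed by this page, so check the job-status endpoint's reference for the exact schema:

```python
import json
import time
import urllib.request

API_KEY = "your_api_key"

def is_terminal(status):
    """Assumed: any status other than pending/running means the job is done."""
    return status not in ("pending", "running")

def poll_crawl_job(job_id, interval=5.0, max_attempts=60):
    """Poll GET /v1/crawl/{id} until the job reports a terminal status."""
    url = f"https://api.anakin.io/v1/crawl/{job_id}"
    for _ in range(max_attempts):
        req = urllib.request.Request(url, headers={"X-API-Key": API_KEY})
        with urllib.request.urlopen(req, timeout=30) as resp:
            job = json.load(resp)
        if is_terminal(job["status"]):
            return job
        time.sleep(interval)
    raise TimeoutError(f"job {job_id} did not finish after {max_attempts} polls")
```

A fixed interval is the simplest approach; for long crawls, exponential backoff reduces unnecessary requests.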


Code Examples

curl -X POST https://api.anakin.io/v1/crawl \
  -H "X-API-Key: your_api_key" \
  -H "Content-Type: application/json" \
  -d '{
    "url": "https://example.com",
    "maxPages": 20,
    "includePatterns": ["/blog/*"],
    "country": "us"
  }'
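The same request can be made from Python with the standard library alone. This is a sketch of the submit step; the build_crawl_payload and submit_crawl helper names are illustrative, not part of any official client:

```python
import json
import urllib.request

API_KEY = "your_api_key"
BASE_URL = "https://api.anakin.io/v1"

def build_crawl_payload(url, max_pages=10, include=None, exclude=None,
                        country="us", use_browser=False):
    """Assemble the JSON body for POST /v1/crawl, omitting empty pattern lists."""
    payload = {"url": url, "maxPages": max_pages,
               "country": country, "useBrowser": use_browser}
    if include:
        payload["includePatterns"] = include
    if exclude:
        payload["excludePatterns"] = exclude
    return payload

def submit_crawl(payload):
    """POST the payload and return the jobId from the 202 response."""
    req = urllib.request.Request(
        f"{BASE_URL}/crawl",
        data=json.dumps(payload).encode("utf-8"),
        headers={"X-API-Key": API_KEY, "Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req, timeout=30) as resp:
        return json.load(resp)["jobId"]

# Example usage (performs a real request):
#   job_id = submit_crawl(build_crawl_payload(
#       "https://example.com", max_pages=20, include=["/blog/*"]))
```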