GET Get Crawl Result

Poll for crawl job status and retrieve scraped page content

GET https://api.anakin.io/v1/crawl/{id}

Retrieve the status and results of a crawl job. Use this to poll for completion after submitting a crawl request.


Path Parameters

| Parameter | Type | Description |
| --- | --- | --- |
| id (required) | string | The job ID returned from the submit endpoint |
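
As a rough illustration of how the path parameter fits into the request, the sketch below builds (but does not send) the GET request using only the Python standard library. The helper name `build_crawl_result_request` is hypothetical, and percent-encoding the ID is a defensive assumption rather than a documented requirement; the URL and the `X-API-Key` header are taken from this page.

```python
from urllib.parse import quote
from urllib.request import Request

# Endpoint as documented on this page; the job ID is appended as a path segment.
BASE_URL = "https://api.anakin.io/v1/crawl"


def build_crawl_result_request(job_id: str, api_key: str) -> Request:
    """Build (but do not send) the GET request for a crawl job's result.

    Hypothetical helper for illustration; not part of any official SDK.
    """
    if not job_id:
        raise ValueError("job_id must be a non-empty string")
    # Assumption: percent-encode the ID so unexpected characters cannot alter the path.
    url = f"{BASE_URL}/{quote(job_id, safe='')}"
    return Request(url, headers={"X-API-Key": api_key}, method="GET")
```

Passing the returned object to `urllib.request.urlopen` would perform the actual call.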

Response

200 OK
{
  "id": "job_abc123xyz",
  "status": "completed",
  "url": "https://example.com",
  "totalPages": 3,
  "completedPages": 3,
  "results": [
    {
      "url": "https://example.com",
      "status": "completed",
      "html": "<html>...</html>",
      "markdown": "# Home page content...",
      "durationMs": 2000
    },
    {
      "url": "https://example.com/blog",
      "status": "completed",
      "html": "<html>...</html>",
      "markdown": "# Blog index...",
      "durationMs": 1500
    },
    {
      "url": "https://example.com/blog/post-1",
      "status": "failed",
      "error": "Connection timeout",
      "durationMs": 5000
    }
  ],
  "createdAt": "2024-01-01T12:00:00Z",
  "completedAt": "2024-01-01T12:00:15Z",
  "durationMs": 15000
}

Response Fields

| Field | Type | Description |
| --- | --- | --- |
| status | string | pending, processing, completed, or failed |
| url | string | The starting URL submitted for crawling |
| totalPages | number | Total pages discovered and attempted |
| completedPages | number | Pages successfully scraped |
| results | array | Per-page results. Only present when completed. |
| error | string | Error message. Only present when the entire job failed. |
| durationMs | number | Total processing time in milliseconds. |
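
To show how these fields might be read together, here is a minimal sketch that reduces a decoded response to a small summary. The function name `summarize_job` and the shape of its return value are hypothetical. It assumes that `totalPages` and `completedPages` may be missing or zero while a job is still pending, which this page does not state, so both are read defensively.

```python
# Statuses after which further polling is pointless, per the status field above.
TERMINAL_STATUSES = {"completed", "failed"}


def summarize_job(job: dict) -> dict:
    """Reduce a decoded crawl response to the values a caller usually checks."""
    total = job.get("totalPages", 0)      # assumption: may be absent early on
    done = job.get("completedPages", 0)   # assumption: may be absent early on
    return {
        "finished": job["status"] in TERMINAL_STATUSES,
        # Fraction of attempted pages scraped successfully; 0.0 if nothing is known yet.
        "progress": done / total if total else 0.0,
        # results is only present when the job completed, error only when it failed.
        "results": job.get("results", []),
        "error": job.get("error"),
    }
```

A caller could keep polling while `finished` is false and inspect `error` once it turns true.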

Per-Page Result Fields

| Field | Type | Description |
| --- | --- | --- |
| url | string | The page URL |
| status | string | completed or failed |
| html | string | Raw HTML content. Only when page completed. |
| markdown | string | Markdown version of the content. Only when page completed. |
| error | string | Error message. Only when page failed. |
| durationMs | number | Per-page processing time in milliseconds. |
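
Because a job's results can mix completed and failed pages, a caller will often want to separate the two. The sketch below is one way to do that; `split_results` is a hypothetical helper, and the `"unknown error"` fallback is an assumption for pages that report a failure without an `error` field.

```python
def split_results(results: list) -> tuple:
    """Separate per-page results into scraped pages and a url -> error map."""
    # Pages that completed carry html and markdown content.
    succeeded = [page for page in results if page.get("status") == "completed"]
    # Pages that failed carry an error message instead of content.
    failed = {
        page["url"]: page.get("error", "unknown error")  # fallback is an assumption
        for page in results
        if page.get("status") == "failed"
    }
    return succeeded, failed


# Trimmed version of the sample response shown earlier on this page.
sample = [
    {"url": "https://example.com", "status": "completed", "markdown": "# Home page content..."},
    {"url": "https://example.com/blog/post-1", "status": "failed", "error": "Connection timeout"},
]
pages, errors = split_results(sample)
```

With the sample above, `pages` holds the one completed page and `errors` maps the failed URL to its error message.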

Code Examples

curl -X GET https://api.anakin.io/v1/crawl/job_abc123xyz \
  -H "X-API-Key: your_api_key"

For polling patterns, see the Polling Jobs reference.
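
As a rough illustration only, a polling loop over this endpoint might look like the sketch below. `wait_for_crawl` and its `fetch_job` argument are hypothetical: `fetch_job` stands for any callable that performs the GET request above and returns the decoded JSON body. The 2-second interval and 300-second timeout are placeholder assumptions, not documented recommendations.

```python
import time
from typing import Callable


def wait_for_crawl(
    fetch_job: Callable[[], dict],
    interval_s: float = 2.0,    # assumption: not a documented recommendation
    timeout_s: float = 300.0,   # assumption: not a documented limit
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Call fetch_job until the job status is completed or failed."""
    deadline = time.monotonic() + timeout_s
    while True:
        job = fetch_job()
        # completed and failed are the two terminal statuses listed above.
        if job["status"] in ("completed", "failed"):
            return job
        # Give up if the next wait would run past the deadline.
        if time.monotonic() + interval_s > deadline:
            raise TimeoutError("crawl job did not finish before the timeout")
        sleep(interval_s)
```

Injecting `fetch_job` and `sleep` keeps the loop independent of any particular HTTP client and makes it easy to exercise without network access.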