Get Crawl Result

Poll for crawl job status and retrieve scraped page content.

GET https://api.anakin.io/v1/crawl/{id}

Retrieve the status and results of a crawl job. Use this to poll for completion after submitting a crawl request.
Path Parameters
| Parameter | Type | Description |
|---|---|---|
| `id` (required) | string | The job ID returned from the submit endpoint |
Response
200 OK

```json
{
  "id": "job_abc123xyz",
  "status": "completed",
  "url": "https://example.com",
  "totalPages": 3,
  "completedPages": 2,
  "results": [
    {
      "url": "https://example.com",
      "status": "completed",
      "html": "<html>...</html>",
      "markdown": "# Home page content...",
      "durationMs": 2000
    },
    {
      "url": "https://example.com/blog",
      "status": "completed",
      "html": "<html>...</html>",
      "markdown": "# Blog index...",
      "durationMs": 1500
    },
    {
      "url": "https://example.com/blog/post-1",
      "status": "failed",
      "error": "Connection timeout",
      "durationMs": 5000
    }
  ],
  "createdAt": "2024-01-01T12:00:00Z",
  "completedAt": "2024-01-01T12:00:15Z",
  "durationMs": 15000
}
```

Response Fields
| Field | Type | Description |
|---|---|---|
| `status` | string | `pending`, `processing`, `completed`, or `failed` |
| `url` | string | The starting URL submitted for crawling |
| `totalPages` | number | Total pages discovered and attempted |
| `completedPages` | number | Pages successfully scraped |
| `results` | array | Per-page results. Only present when `completed`. |
| `error` | string | Error message. Only present when the entire job failed. |
| `durationMs` | number | Total processing time in milliseconds |
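As a quick illustration of the job-level fields, a one-line summary can be built from `status`, `completedPages`, `totalPages`, and `durationMs`. A minimal sketch; the `summarize` helper and sample dict are illustrative, not part of the API:

```python
# Build a one-line summary from the job-level fields documented above.
# The sample values mirror the example response; they are illustrative.
def summarize(job):
    failed = job["totalPages"] - job["completedPages"]
    return (f"{job['status']}: {job['completedPages']}/{job['totalPages']} "
            f"pages scraped, {failed} failed, {job['durationMs']} ms total")

job = {"status": "completed", "totalPages": 3, "completedPages": 2,
       "durationMs": 15000}
print(summarize(job))  # completed: 2/3 pages scraped, 1 failed, 15000 ms total
```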
Per-Page Result Fields
| Field | Type | Description |
|---|---|---|
| `url` | string | The page URL |
| `status` | string | `completed` or `failed` |
| `html` | string | Raw HTML content. Only when the page completed. |
| `markdown` | string | Markdown version of the content. Only when the page completed. |
| `error` | string | Error message. Only when the page failed. |
| `durationMs` | number | Per-page processing time in milliseconds |
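Because `html` and `markdown` appear only on completed pages and `error` only on failed ones, consumers should branch on the per-page `status` before reading those fields. A minimal sketch; the `partition_results` helper is illustrative, not part of the API:

```python
# Split per-page results by status so that optional fields
# (html/markdown vs. error) are only read where they exist.
def partition_results(results):
    succeeded = [p for p in results if p["status"] == "completed"]
    failed = [p for p in results if p["status"] == "failed"]
    return succeeded, failed

# Illustrative sample shaped like the example response above.
sample = [
    {"url": "https://example.com", "status": "completed",
     "markdown": "# Home page content...", "durationMs": 2000},
    {"url": "https://example.com/blog/post-1", "status": "failed",
     "error": "Connection timeout", "durationMs": 5000},
]
ok, bad = partition_results(sample)
print(len(ok), len(bad))  # 1 1
```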
Code Examples
```shell
curl -X GET https://api.anakin.io/v1/crawl/job_abc123xyz \
  -H "X-API-Key: your_api_key"
```

```python
import requests
import time

job_id = "job_abc123xyz"

while True:
    result = requests.get(
        f'https://api.anakin.io/v1/crawl/{job_id}',
        headers={'X-API-Key': 'your_api_key'}
    )
    data = result.json()

    if data['status'] == 'completed':
        print(f"Crawled {data['completedPages']}/{data['totalPages']} pages:")
        for page in data['results']:
            if page['status'] == 'completed':
                print(f"  {page['url']} — {len(page['markdown'])} chars")
            else:
                print(f"  {page['url']} — FAILED: {page['error']}")
        break
    elif data['status'] == 'failed':
        print(f"Error: {data['error']}")
        break

    time.sleep(2)
```

```javascript
const jobId = 'job_abc123xyz';

const poll = async () => {
  const res = await fetch(`https://api.anakin.io/v1/crawl/${jobId}`, {
    headers: { 'X-API-Key': 'your_api_key' }
  });
  const data = await res.json();

  if (data.status === 'completed') {
    console.log(`Crawled ${data.completedPages}/${data.totalPages} pages:`);
    data.results.forEach(page => {
      if (page.status === 'completed') {
        console.log(`  ${page.url} — ${page.markdown.length} chars`);
      } else {
        console.log(`  ${page.url} — FAILED: ${page.error}`);
      }
    });
  } else if (data.status === 'failed') {
    console.error(data.error);
  } else {
    setTimeout(poll, 2000);
  }
};

poll();
```

For polling patterns, see the Polling Jobs reference.
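For longer crawls, a bounded poll with exponential backoff avoids hammering the endpoint. A sketch along the lines of the Python example above; the injected `fetch_status` callable and the timeout and interval values are client-side choices, not API requirements:

```python
import time

def wait_for_crawl(job_id, fetch_status, timeout_s=120,
                   interval_s=1.0, max_interval_s=15.0):
    """Poll until the job reaches a terminal status or timeout_s elapses.

    fetch_status(job_id) should GET /v1/crawl/{id} and return the parsed
    JSON body; injecting it keeps this loop testable without network
    access. Doubling the interval up to max_interval_s is a client-side
    backoff choice, not something the API mandates.
    """
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        data = fetch_status(job_id)
        if data["status"] in ("completed", "failed"):
            return data
        time.sleep(interval_s)
        interval_s = min(interval_s * 2, max_interval_s)
    raise TimeoutError(f"crawl {job_id} did not finish within {timeout_s}s")
```

With `requests`, `fetch_status` would be a thin wrapper around the GET call shown in the examples above (`requests.get(..., headers={'X-API-Key': ...}).json()`).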