POST Submit Crawl Job
Submit a multi-page crawl job
POST https://api.anakin.io/v1/crawl

Submit a website for multi-page crawling. The job discovers URLs and scrapes each page. Use the returned jobId to poll for results.
Request Body
{
  "url": "https://example.com",
  "maxPages": 10,
  "includePatterns": ["/blog/*"],
  "excludePatterns": ["/admin/*"],
  "country": "us",
  "useBrowser": false
}

| Parameter | Type | Description |
|---|---|---|
| url (required) | string | The starting URL to crawl from. Must be a valid HTTP or HTTPS URL. |
| maxPages | number | Maximum number of pages to crawl. Default 10, max 100. |
| includePatterns | string[] | URL glob patterns to include. Only URLs matching at least one pattern are crawled. |
| excludePatterns | string[] | URL glob patterns to exclude. URLs matching any pattern are skipped. |
| country | string | Country code for proxy routing. Default "us". See Supported Countries. |
| useBrowser | boolean | Render pages with headless Chrome. Default false. Best for JavaScript-heavy sites. |
| sessionId | string | Browser session ID for authenticated crawling. See Browser Sessions. |
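
The constraints in the table above can be checked locally before a request is sent. The following is a minimal sketch, not part of the API: `build_crawl_payload` is a hypothetical helper, and the lower bound of 1 for `maxPages` is an assumption (the table documents only the default and the maximum). The service performs its own validation, which may differ.

```python
from urllib.parse import urlparse

MAX_PAGES_LIMIT = 100  # documented maximum for maxPages


def build_crawl_payload(url, max_pages=10, include_patterns=None,
                        exclude_patterns=None, country="us",
                        use_browser=False, session_id=None):
    """Build a request body for POST /v1/crawl (hypothetical helper)."""
    parsed = urlparse(url)
    if parsed.scheme not in ("http", "https") or not parsed.netloc:
        raise ValueError(f"url must be an absolute HTTP/HTTPS URL: {url!r}")

    # Assumption: at least one page must be requested; only the upper
    # bound of 100 is documented.
    is_int = isinstance(max_pages, int) and not isinstance(max_pages, bool)
    if not is_int or not 1 <= max_pages <= MAX_PAGES_LIMIT:
        raise ValueError(f"maxPages must be an integer from 1 to {MAX_PAGES_LIMIT}")

    payload = {
        "url": url,
        "maxPages": max_pages,
        "country": country,
        "useBrowser": use_browser,
    }
    # Optional fields are omitted entirely when not supplied.
    if include_patterns:
        payload["includePatterns"] = list(include_patterns)
    if exclude_patterns:
        payload["excludePatterns"] = list(exclude_patterns)
    if session_id:
        payload["sessionId"] = session_id
    return payload
```

The returned dictionary can be passed as the JSON body of the request, for example `build_crawl_payload("https://example.com", max_pages=20, include_patterns=["/blog/*"])`.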
Response
202 Accepted

{
  "jobId": "job_abc123xyz",
  "status": "pending"
}

The job is processed asynchronously. Use the jobId with GET /v1/crawl/{id} to check status and retrieve results.
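
Because the job runs asynchronously, a client typically polls GET /v1/crawl/{id} until the job finishes. The sketch below shows one way to do that with a fixed interval and an attempt limit. Several details are assumptions not confirmed by this page: the terminal status names `"completed"` and `"failed"` are placeholders (only `"pending"` appears above), the status endpoint is assumed to accept the same `X-API-Key` header, and `fetch_job` and `wait_for_job` are hypothetical helper names.

```python
import json
import time
import urllib.request

API_BASE = "https://api.anakin.io/v1"

# Assumption: placeholder names for finished states. Check the
# GET /v1/crawl/{id} reference for the actual status values.
TERMINAL_STATUSES = {"completed", "failed"}


def fetch_job(job_id, api_key):
    """GET /v1/crawl/{id} and return the decoded JSON body."""
    request = urllib.request.Request(
        f"{API_BASE}/crawl/{job_id}",
        headers={"X-API-Key": api_key},  # assumed to match the POST auth header
    )
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)


def wait_for_job(job_id, fetch, interval=2.0, max_attempts=30, sleep=time.sleep):
    """Call fetch(job_id) until the job reports a terminal status.

    `fetch` is any callable returning the job as a dict, which keeps the
    loop independent of the HTTP client in use.
    """
    for attempt in range(max_attempts):
        job = fetch(job_id)
        if job.get("status") in TERMINAL_STATUSES:
            return job
        # Do not sleep after the final attempt.
        if attempt < max_attempts - 1:
            sleep(interval)
    raise TimeoutError(f"job {job_id} did not finish after {max_attempts} polls")
```

A call might look like `wait_for_job("job_abc123xyz", lambda job_id: fetch_job(job_id, "your_api_key"))`. Passing the fetch function in keeps the polling loop easy to test without network access.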
Code Examples
curl -X POST https://api.anakin.io/v1/crawl \
  -H "X-API-Key: your_api_key" \
  -H "Content-Type: application/json" \
  -d '{
    "url": "https://example.com",
    "maxPages": 20,
    "includePatterns": ["/blog/*"],
    "country": "us"
  }'

import requests

response = requests.post(
    'https://api.anakin.io/v1/crawl',
    headers={'X-API-Key': 'your_api_key'},
    json={
        'url': 'https://example.com',
        'maxPages': 20,
        'includePatterns': ['/blog/*'],
        'excludePatterns': ['/admin/*'],
        'country': 'us'
    }
)
# Raise on 4xx/5xx responses instead of failing later on a missing key.
response.raise_for_status()
data = response.json()
print(f"Job submitted: {data['jobId']}")

const response = await fetch('https://api.anakin.io/v1/crawl', {
  method: 'POST',
  headers: {
    'X-API-Key': 'your_api_key',
    'Content-Type': 'application/json'
  },
  body: JSON.stringify({
    url: 'https://example.com',
    maxPages: 20,
    includePatterns: ['/blog/*'],
    excludePatterns: ['/admin/*'],
    country: 'us'
  })
});
const data = await response.json();
console.log(data.jobId);
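
To get a rough sense of which URLs the `includePatterns` and `excludePatterns` values used in these examples would select, the filter can be approximated locally. This is only an illustration built on Python's `fnmatch` module: `would_crawl` is a hypothetical function, and it assumes that patterns are matched against the URL path, that `*` may span `/`, and that exclusions take precedence over inclusions. This page does not specify those details, so the service's actual matching may differ.

```python
from fnmatch import fnmatchcase
from urllib.parse import urlparse


def would_crawl(url, include_patterns=(), exclude_patterns=()):
    """Approximate the include/exclude filter for a single URL."""
    # Assumption: patterns apply to the path component only.
    path = urlparse(url).path or "/"

    # Assumption: a URL matching any exclude pattern is skipped,
    # even if it also matches an include pattern.
    if any(fnmatchcase(path, pattern) for pattern in exclude_patterns):
        return False

    # With include patterns present, at least one must match.
    if include_patterns:
        return any(fnmatchcase(path, pattern) for pattern in include_patterns)

    # No include patterns: everything not excluded is eligible.
    return True
```

Under these assumptions, `https://example.com/blog/post-1` passes the filter for `["/blog/*"]` and `["/admin/*"]`, while `https://example.com/admin/users` and `https://example.com/about` do not.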