Complete API reference for AnakinScraper. Learn how to integrate powerful web scraping into your application.
Get started with AnakinScraper in minutes. Follow these simple steps:
1. Sign up for an account
   Create a free AnakinScraper account.
2. Get your API key
   Open your dashboard and generate a new API key in the API Keys section.
3. Make your first API call
   Authenticate each request by sending your key in the X-API-Key header (see the examples below).
Authentication
All API requests require authentication using the X-API-Key header. Include your API key in every request.
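As a minimal sketch of the authentication rule above, the helper below attaches the X-API-Key header to every request it builds. It uses only the Python standard library; `build_request` and `BASE_URL` are names chosen for this illustration and are not part of any official SDK.

```python
import json
import urllib.request

# Base URL inferred from the endpoints listed in this reference.
BASE_URL = "https://api.anakin.io/v1"


def build_request(api_key, path, payload=None):
    """Return a urllib Request that carries the X-API-Key header.

    Passing a payload produces a JSON POST; omitting it produces a GET.
    """
    if not api_key:
        raise ValueError("an API key is required for every request")
    headers = {"X-API-Key": api_key}
    data = None
    if payload is not None:
        headers["Content-Type"] = "application/json"
        data = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(BASE_URL + path, data=data, headers=headers)
```

A request built this way can be sent with `urllib.request.urlopen(build_request(key, "/countries"))`.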
POST https://api.anakin.io/v1/request
Request Headers:
X-API-Key: your_api_key
Content-Type: application/json
Request Body (Single URL):
{
  "url": "https://example.com",
  "country": "us",
  "useBrowser": false,
  "generateJson": false,
  "forceFresh": false,
  "job_type": "url_scraper"
}
Response (202 Accepted):
{
  "jobId": "job_abc123xyz",
  "status": "pending"
}
The job is processed asynchronously. Use the jobId to check status and retrieve results.
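Because jobs finish asynchronously, a client usually polls until the job reaches a terminal status. The sketch below bounds that loop with a timeout. It assumes the terminal statuses are "completed" and "failed", as listed under Response Fields; `wait_for_job` and the `fetch_job` callable are hypothetical names introduced here.

```python
import time

# Assumption: these are the only statuses after which a job stops changing.
TERMINAL_STATUSES = {"completed", "failed"}


def wait_for_job(fetch_job, job_id, interval=2.0, timeout=120.0,
                 sleep=time.sleep, clock=time.monotonic):
    """Call fetch_job(job_id) until the job is terminal or the timeout expires.

    fetch_job is any callable that returns the job as a dict, for example a
    wrapper around GET /v1/request/{id}.
    """
    deadline = clock() + timeout
    while True:
        job = fetch_job(job_id)
        if job.get("status") in TERMINAL_STATUSES:
            return job
        if clock() >= deadline:
            raise TimeoutError(
                f"job {job_id} still {job.get('status')!r} after {timeout}s"
            )
        sleep(interval)
```

Injecting `sleep` and `clock` keeps the helper easy to test without real waiting; in normal use the defaults are sufficient.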
POST https://api.anakin.io/v1/request
Request Body (Batch Mode):
{
  "job_type": "batch_url_scraper",
  "urls": [
    "https://example.com/page1",
    "https://example.com/page2",
    "https://example.com/page3"
  ],
  "country": "us",
  "useBrowser": false,
  "generateJson": false
}
Batch mode allows you to scrape up to 10 URLs in a single request. All URLs are processed in parallel, and you receive a parent job ID to track overall progress.
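Since a batch request accepts at most 10 URLs, longer lists have to be split across several requests. The following sketch shows one way to do that; `chunk_urls` and `batch_payloads` are illustrative helpers, not part of the API.

```python
MAX_BATCH_SIZE = 10  # limit stated for batch mode in this reference


def chunk_urls(urls, size=MAX_BATCH_SIZE):
    """Split a list of URLs into consecutive chunks of at most `size` items."""
    if not 1 <= size <= MAX_BATCH_SIZE:
        raise ValueError(f"size must be between 1 and {MAX_BATCH_SIZE}")
    return [urls[i:i + size] for i in range(0, len(urls), size)]


def batch_payloads(urls, country="us", use_browser=False, generate_json=False):
    """Build one batch_url_scraper request body per chunk of URLs."""
    return [
        {
            "job_type": "batch_url_scraper",
            "urls": chunk,
            "country": country,
            "useBrowser": use_browser,
            "generateJson": generate_json,
        }
        for chunk in chunk_urls(list(urls))
    ]
```

Each returned payload can be posted to /v1/request separately, which yields one parent job ID per chunk.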
GET https://api.anakin.io/v1/request/{id}
Request Headers:
X-API-Key: your_api_key
Response (200 OK when completed) - Single URL:
{
  "id": "job_abc123xyz",
  "status": "completed",
  "url": "https://example.com",
  "jobType": "url_scraper",
  "country": "us",
  "html": "<html>...</html>",
  "markdown": "# Page content...",
  "generatedJson": {"data": {...}},
  "cached": false,
  "error": null,
  "createdAt": "2024-01-01T12:00:00Z",
  "completedAt": "2024-01-01T12:00:05Z",
  "durationMs": 5000
}
Response (200 OK when completed) - Batch Job:
{
  "id": "batch_abc123",
  "status": "completed",
  "jobType": "batch_url_scraper",
  "country": "us",
  "urls": ["https://example.com/page1", "https://example.com/page2"],
  "results": [
    {
      "index": 0,
      "url": "https://example.com/page1",
      "status": "completed",
      "html": "<html>...</html>",
      "markdown": "# Content...",
      "generatedJson": {"data": {...}},
      "cached": false,
      "durationMs": 3000
    },
    {
      "index": 1,
      "url": "https://example.com/page2",
      "status": "failed",
      "error": "Connection timeout",
      "durationMs": 5000
    }
  ],
  "createdAt": "2024-01-01T12:00:00Z",
  "completedAt": "2024-01-01T12:00:10Z",
  "durationMs": 10000
}
Response Fields:
status: Job status - pending, processing, completed, or failed
html: Raw HTML content (only present when status is completed)
markdown: Markdown version of the content (optional, when available)
generatedJson: AI-extracted structured JSON data (only present when generateJson: true)
cached: Whether the result was served from cache (true means no credits consumed)
error: Error message (only present when status is failed)
durationMs: Time taken to complete the scrape in milliseconds
results: For batch jobs, array of results for each URL (only when all URLs are finished)
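A batch job can be "completed" overall while individual URLs have failed, so it helps to separate the per-URL results by their own status. The sketch below does this with a trimmed copy of the batch response shown in this reference; `split_batch_results` is a hypothetical helper.

```python
def split_batch_results(batch):
    """Partition a batch job's results into (succeeded, failed) lists."""
    # "results" is only present once every URL has finished.
    results = batch.get("results") or []
    succeeded = [r for r in results if r.get("status") == "completed"]
    failed = [r for r in results if r.get("status") == "failed"]
    return succeeded, failed


# Trimmed copy of the batch response documented in this reference.
sample_batch = {
    "id": "batch_abc123",
    "status": "completed",
    "results": [
        {"index": 0, "url": "https://example.com/page1", "status": "completed",
         "html": "<html>...</html>", "cached": False, "durationMs": 3000},
        {"index": 1, "url": "https://example.com/page2", "status": "failed",
         "error": "Connection timeout", "durationMs": 5000},
    ],
}

succeeded, failed = split_batch_results(sample_batch)
for item in failed:
    print(f"{item['url']} failed: {item['error']}")
```

The failed URLs could then be resubmitted as a new, smaller batch.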
GET https://api.anakin.io/v1/history?limit=50&offset=0
Request Headers:
X-API-Key: your_api_key
Query Parameters:
limit: Number of jobs to return (1-1000, default: 50)
offset: Pagination offset (default: 0)
Returns an array of jobs with batch aggregation. Child jobs are excluded from the top-level list and aggregated under their parent batch job.
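The limit and offset parameters can be combined to walk the full job history page by page. The generator below is a sketch under two assumptions: the endpoint returns a plain JSON array, and a page shorter than the requested limit marks the end of the history. `iter_history` and the `fetch_page` callable are hypothetical.

```python
def iter_history(fetch_page, page_size=50):
    """Yield every job in the history, one page at a time.

    fetch_page(limit=..., offset=...) is any callable that returns one page
    of jobs as a list, for example a wrapper around GET /v1/history.
    """
    if not 1 <= page_size <= 1000:
        raise ValueError("page_size must be between 1 and 1000")
    offset = 0
    while True:
        jobs = fetch_page(limit=page_size, offset=offset)
        if not jobs:
            return
        yield from jobs
        if len(jobs) < page_size:
            return  # short page: assume there is nothing further to fetch
        offset += len(jobs)
```

Because child jobs are aggregated under their parent batch, each yielded item corresponds to one top-level job.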
GET https://api.anakin.io/v1/countries
Returns a list of all 207 supported country codes for proxy routing.
Response (200 OK):
[
  { "name": "United States", "code": "us" },
  { "name": "United Kingdom", "code": "uk" },
  { "name": "Germany", "code": "de" },
  { "name": "Japan", "code": "jp" },
  ...
]

POST https://api.anakin.io/v1/search
Perform AI-powered web searches using the Perplexity API. Returns search results with citations, snippets, and relevant content.
Request Headers:
X-API-Key: your_api_key
Content-Type: application/json
Request Body:
{
  "prompt": "latest AI developments 2024",
  "limit": 5
}
Response (200 OK):
{
  "jobId": "search_abc123xyz",
  "status": "completed",
  "result": {
    "query": "latest AI developments 2024",
    "answer": "Summary of search results...",
    "results": [
      {
        "url": "https://example.com/article",
        "title": "AI Developments 2024",
        "snippet": "Recent advancements in AI...",
        "date": "2024-01-15",
        "score": 0.95
      }
    ],
    "count": 5
  }
}
Parameters:
prompt: (required) Search query string
limit: (optional) Maximum number of results (default: 5)
Note: Search results are returned immediately (synchronous). Each search costs 3 credits.

POST https://api.anakin.io/v1/agentic-search
Advanced 4-stage search pipeline with automated research, web scraping, and comprehensive analysis. Ideal for in-depth research tasks.
4-Stage Pipeline:
Request Headers:
X-API-Key: your_api_key
Content-Type: application/json
Request Body:
{
  "prompt": "Comprehensive analysis of quantum computing trends",
  "useBrowser": true
}
Response (202 Accepted):
{
  "jobId": "agentic_abc123xyz",
  "status": "pending",
  "message": "Agentic search job submitted successfully"
}
Poll for results using GET /v1/request/{jobId} (same as scraper jobs).
Parameters:
prompt: (required) Research query or question
useBrowser: (optional) Use browser for citation scraping (default: true)
Note: Agentic search costs 10 credits plus 1 credit per URL scraped (for citations). Jobs are processed asynchronously and may take several minutes.
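The credit costs quoted in the search and agentic search notes can be turned into a quick budget estimate. This is only arithmetic on the figures stated in this reference; the function names are illustrative, and the number of URLs an agentic search will scrape is not known in advance, so it has to be guessed.

```python
SEARCH_COST = 3           # credits per search, as stated in this reference
AGENTIC_BASE_COST = 10    # credits per agentic search job
AGENTIC_COST_PER_URL = 1  # extra credits per URL scraped for citations


def search_credits(num_searches):
    """Credits consumed by a number of regular searches."""
    if num_searches < 0:
        raise ValueError("num_searches must not be negative")
    return SEARCH_COST * num_searches


def agentic_search_credits(urls_scraped):
    """Credits consumed by one agentic search that scrapes `urls_scraped` URLs."""
    if urls_scraped < 0:
        raise ValueError("urls_scraped must not be negative")
    return AGENTIC_BASE_COST + AGENTIC_COST_PER_URL * urls_scraped
```

For example, an agentic search that ends up scraping 8 citation URLs would cost 10 + 8 = 18 credits under these rules.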
curl -X POST https://api.anakin.io/v1/request \
  -H "X-API-Key: your_api_key" \
  -H "Content-Type: application/json" \
  -d '{
    "url": "https://example.com",
    "country": "us",
    "useBrowser": false,
    "generateJson": false
  }'

const response = await fetch('https://api.anakin.io/v1/request', {
  method: 'POST',
  headers: {
    'X-API-Key': 'your_api_key',
    'Content-Type': 'application/json'
  },
  body: JSON.stringify({
    url: 'https://example.com',
    country: 'us',
    useBrowser: false,
    generateJson: true, // Enable AI JSON extraction
    forceFresh: false // Use cache if available
  })
});
const data = await response.json();
console.log(data.jobId);

// Poll until the job leaves the pending/processing states
const jobId = data.jobId;
let jobData;
do {
  await new Promise((resolve) => setTimeout(resolve, 2000)); // Wait 2 seconds between polls
  const result = await fetch(`https://api.anakin.io/v1/request/${jobId}`, {
    headers: { 'X-API-Key': 'your_api_key' }
  });
  jobData = await result.json();
} while (jobData.status === 'pending' || jobData.status === 'processing');

if (jobData.status === 'completed') {
  console.log(jobData.html); // HTML content
  console.log(jobData.generatedJson); // AI-extracted JSON data
} else {
  console.error(`Job failed: ${jobData.error}`);
}

import requests
import time

# Submit job
response = requests.post(
    'https://api.anakin.io/v1/request',
    headers={'X-API-Key': 'your_api_key'},
    json={
        'url': 'https://example.com',
        'country': 'us',
        'useBrowser': False,
        'generateJson': True,  # Enable AI JSON extraction
        'forceFresh': False  # Use cache if available
    }
)
data = response.json()
job_id = data['jobId']
print(f"Job submitted: {job_id}")

# Poll for results
while True:
    result = requests.get(
        f'https://api.anakin.io/v1/request/{job_id}',
        headers={'X-API-Key': 'your_api_key'}
    )
    job_data = result.json()
    if job_data['status'] == 'completed':
        print(job_data['html'])  # HTML content
        print(job_data.get('generatedJson'))  # AI-extracted JSON data
        break
    elif job_data['status'] == 'failed':
        print(f"Job failed: {job_data.get('error')}")
        break
    time.sleep(2)  # Wait 2 seconds before polling again

import requests
import time

# Submit batch job (up to 10 URLs)
response = requests.post(
    'https://api.anakin.io/v1/request',
    headers={'X-API-Key': 'your_api_key'},
    json={
        'job_type': 'batch_url_scraper',
        'urls': [
            'https://example.com/page1',
            'https://example.com/page2',
            'https://example.com/page3'
        ],
        'country': 'us',
        'useBrowser': False,
        'generateJson': True
    }
)
data = response.json()
batch_job_id = data['jobId']
print(f"Batch job submitted: {batch_job_id}")

# Poll for batch results
while True:
    result = requests.get(
        f'https://api.anakin.io/v1/request/{batch_job_id}',
        headers={'X-API-Key': 'your_api_key'}
    )
    batch_data = result.json()
    if batch_data['status'] == 'completed':
        # All URLs have been processed
        for url_result in batch_data['results']:
            print(f"URL {url_result['index']}: {url_result['url']}")
            if url_result['status'] == 'completed':
                print(f"  HTML length: {len(url_result['html'])}")
            else:
                print(f"  Failed: {url_result.get('error')}")
        break
    elif batch_data['status'] == 'failed':
        # Stop polling if the batch job itself fails
        print(f"Batch job failed: {batch_data.get('error')}")
        break
    time.sleep(3)  # Wait 3 seconds before polling again

import requests

# Perform a search (synchronous - returns immediately)
response = requests.post(
    'https://api.anakin.io/v1/search',
    headers={'X-API-Key': 'your_api_key'},
    json={
        'prompt': 'latest AI developments 2024',
        'limit': 5
    }
)
data = response.json()
print(f"Search completed: {data['status']}")
print(f"Query: {data['result']['query']}")
print(f"Answer: {data['result']['answer']}")

# Process search results
for result in data['result']['results']:
    print(f"\nTitle: {result['title']}")
    print(f"URL: {result['url']}")
    print(f"Snippet: {result['snippet']}")
    print(f"Score: {result.get('score', 'N/A')}")

import requests
import time

# Submit agentic search job (asynchronous)
response = requests.post(
    'https://api.anakin.io/v1/agentic-search',
    headers={'X-API-Key': 'your_api_key'},
    json={
        'prompt': 'Comprehensive analysis of quantum computing trends',
        'useBrowser': True
    }
)
data = response.json()
job_id = data['jobId']
print(f"Agentic search job submitted: {job_id}")

# Poll for results (may take several minutes)
while True:
    result = requests.get(
        f'https://api.anakin.io/v1/request/{job_id}',
        headers={'X-API-Key': 'your_api_key'}
    )
    job_data = result.json()
    print(f"Status: {job_data['status']}")
    if job_data['status'] == 'completed':
        # Access comprehensive research results
        print("\n=== Research Complete ===")
        print(job_data.get('html'))  # Full analysis
        break
    elif job_data['status'] == 'failed':
        print(f"Job failed: {job_data.get('error')}")
        break
    time.sleep(10)  # Wait 10 seconds before polling again

url (required for single URL)
The URL of the webpage to scrape. Must be a valid HTTP/HTTPS URL. Required when job_type is "url_scraper" (default).
urls (required for batch)
Array of URLs to scrape (1-10 URLs). Required when job_type is "batch_url_scraper".
job_type (optional)
Job type: "url_scraper" (default), "batch_url_scraper", or "browser_scraper". Determines how the request is processed.
country (optional)
Country code for proxy location (e.g., "us", "uk", "de", "jp"). Defaults to "us". 207 countries supported. Use GET /v1/countries to see all options.
useBrowser (optional)
If true, uses a full browser (headless Chrome with Playwright) instead of a plain HTTP client. Defaults to false. Best for JavaScript-heavy sites, SPAs, or sites with dynamic content.
generateJson (optional)
If true, uses AI to extract structured JSON data from the scraped content. Defaults to false. Returned in the generatedJson field.
forceFresh (optional)
If true, bypasses cache and scrapes fresh content. Defaults to false. When false, cached results are served if available (no credits consumed).
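The parameter rules above can be checked on the client before a request is sent. The sketch below builds a request body and enforces the url/urls constraints. `build_scrape_payload` is a hypothetical helper, and because this reference does not show forceFresh in a batch body, the sketch only sends it for single-URL jobs.

```python
from urllib.parse import urlparse

MAX_BATCH_URLS = 10  # batch limit stated in this reference


def _require_http_url(url):
    """Raise ValueError unless url is an absolute HTTP/HTTPS URL."""
    parts = urlparse(url)
    if parts.scheme not in ("http", "https") or not parts.netloc:
        raise ValueError(f"not a valid HTTP/HTTPS URL: {url!r}")


def build_scrape_payload(url=None, urls=None, country="us",
                         use_browser=False, generate_json=False,
                         force_fresh=False):
    """Return a request body for POST /v1/request (single URL or batch)."""
    if (url is None) == (urls is None):
        raise ValueError("pass exactly one of url or urls")
    payload = {
        "country": country,
        "useBrowser": use_browser,
        "generateJson": generate_json,
    }
    if url is not None:
        _require_http_url(url)
        payload.update({"job_type": "url_scraper", "url": url,
                        "forceFresh": force_fresh})
    else:
        urls = list(urls)
        if not 1 <= len(urls) <= MAX_BATCH_URLS:
            raise ValueError(f"batch jobs take 1 to {MAX_BATCH_URLS} URLs")
        for item in urls:
            _require_http_url(item)
        # Assumption: forceFresh is omitted here because the batch request
        # body in this reference does not include it.
        payload.update({"job_type": "batch_url_scraper", "urls": urls})
    return payload
```

Validating locally avoids spending a round trip on a request that would be rejected with 400 Bad Request.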
400 Bad Request
{
  "error": "Invalid URL format"
}
Invalid request parameters or malformed URL.

401 Unauthorized
{
  "error": "Unauthorized"
}
Missing or invalid API key.

402 Payment Required
{
  "error": "Insufficient credits. Please upgrade your plan or purchase more credits."
}
Account has run out of credits.

404 Not Found
{
  "error": "Job not found"
}
The requested job ID does not exist.

503 Service Unavailable
{
  "error": "Scraper service is unavailable. Please try again later."
}
The scraper service is temporarily unavailable.
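One way to handle these error responses in client code is to convert them into a single exception type that records whether a retry is reasonable. The sketch below is illustrative: `AnakinAPIError` and `check_response` are hypothetical names, and treating only 503 as retryable is an assumption based on its "try again later" message.

```python
# Status codes and meanings as listed in this reference.
ERROR_MEANINGS = {
    400: "Invalid request parameters or malformed URL",
    401: "Missing or invalid API key",
    402: "Account has run out of credits",
    404: "The requested job ID does not exist",
    503: "The scraper service is temporarily unavailable",
}

# Assumption: only the "try again later" error is worth retrying.
RETRYABLE_STATUSES = {503}


class AnakinAPIError(Exception):
    """Hypothetical exception type; not part of any official SDK."""

    def __init__(self, status_code, message):
        super().__init__(f"{status_code}: {message}")
        self.status_code = status_code
        self.message = message
        self.retryable = status_code in RETRYABLE_STATUSES


def check_response(status_code, body):
    """Return body for success responses, raise AnakinAPIError otherwise."""
    if status_code < 400:
        return body
    # Prefer the server's own "error" message when the body provides one.
    message = body.get("error") if isinstance(body, dict) else None
    raise AnakinAPIError(
        status_code, message or ERROR_MEANINGS.get(status_code, "Unknown error")
    )
```

A caller might catch `AnakinAPIError`, wait and retry when `retryable` is true, and surface the message to the user otherwise.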