API Documentation

Complete API reference for AnakinScraper. Learn how to integrate powerful web scraping into your application.

Quick Start

Get started with AnakinScraper in minutes. Follow these simple steps:

  1. Sign up for an account

    Create your free account on the sign-up page.

  2. Get your API key

    In your dashboard, open the API keys section and generate a new API key.

  3. Make your first API call

    Use the X-API-Key header with your key to authenticate requests (see the Code Examples section below).

Authentication

All API requests require authentication using the X-API-Key header. Include your API key in every request.
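
As a minimal sketch of attaching the key to every request, the helper below builds the headers once so they can be reused. The function name build_headers and the ANAKIN_API_KEY environment variable are assumptions made for this illustration; the API itself only requires the X-API-Key header.

```python
import os


def build_headers(api_key=None):
    """Return the headers used by the requests in this reference.

    ANAKIN_API_KEY is a hypothetical environment variable name chosen for
    this sketch; it is not defined by the API.
    """
    key = api_key or os.environ.get("ANAKIN_API_KEY")
    if not key:
        raise ValueError("No API key provided")
    return {"X-API-Key": key, "Content-Type": "application/json"}
```

Reading the key from the environment keeps it out of source control; passing it explicitly also works.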

API Endpoints

Submit Scrape Job

POST https://api.anakin.io/v1/request

Request Headers:

X-API-Key: your_api_key
Content-Type: application/json

Request Body (Single URL):

{
  "url": "https://example.com",
  "country": "us",
  "useBrowser": false,
  "generateJson": false,
  "forceFresh": false,
  "job_type": "url_scraper"
}

Response (202 Accepted):

{
  "jobId": "job_abc123xyz",
  "status": "pending"
}

The job is processed asynchronously. Use the jobId to check status and retrieve results.
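
Because the job runs asynchronously, a client usually polls until the status is completed or failed. The sketch below adds a timeout so polling cannot run forever. The names wait_for_job and fetch_job, and the default interval and timeout values, are assumptions for illustration; fetch_job stands for any callable that returns the parsed JSON of GET /v1/request/{id}.

```python
import time


def wait_for_job(fetch_job, job_id, interval=2.0, timeout=120.0, sleep=time.sleep):
    """Poll until the job is completed or failed, or raise TimeoutError.

    fetch_job is a hypothetical callable: job_id -> parsed job JSON (dict).
    """
    deadline = time.monotonic() + timeout
    while True:
        job = fetch_job(job_id)
        if job["status"] in ("completed", "failed"):
            return job
        if time.monotonic() >= deadline:
            raise TimeoutError(f"Job {job_id} still {job['status']} after {timeout}s")
        sleep(interval)
```

With the requests library, fetch_job could be a small function that calls requests.get on the status URL with the X-API-Key header and returns response.json().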

Batch URL Scraping (up to 10 URLs)

POST https://api.anakin.io/v1/request

Request Body (Batch Mode):

{
  "job_type": "batch_url_scraper",
  "urls": [
    "https://example.com/page1",
    "https://example.com/page2",
    "https://example.com/page3"
  ],
  "country": "us",
  "useBrowser": false,
  "generateJson": false
}

Batch mode allows you to scrape up to 10 URLs in a single request. All URLs are processed in parallel, and you receive a parent job ID to track overall progress.
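
For lists longer than 10 URLs, one approach is to split them into several batch requests. The following is a sketch under that assumption; build_batch_payloads is a hypothetical helper name, and each returned dictionary is meant to be sent as the JSON body of a separate POST to /v1/request.

```python
def build_batch_payloads(urls, country="us", use_browser=False, generate_json=False):
    """Split urls into groups of at most 10 and build one request body per group."""
    max_batch = 10  # documented per-request limit for batch mode
    payloads = []
    for start in range(0, len(urls), max_batch):
        payloads.append({
            "job_type": "batch_url_scraper",
            "urls": urls[start:start + max_batch],
            "country": country,
            "useBrowser": use_browser,
            "generateJson": generate_json,
        })
    return payloads
```

Each payload yields its own parent job ID, so the caller tracks one ID per group.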

Get Job Status & Results

GET https://api.anakin.io/v1/request/{id}

Request Headers:

X-API-Key: your_api_key

Response (200 OK when completed) - Single URL:

{
  "id": "job_abc123xyz",
  "status": "completed",
  "url": "https://example.com",
  "jobType": "url_scraper",
  "country": "us",
  "html": "<html>...</html>",
  "markdown": "# Page content...",
  "generatedJson": {"data": {...}},
  "cached": false,
  "error": null,
  "createdAt": "2024-01-01T12:00:00Z",
  "completedAt": "2024-01-01T12:00:05Z",
  "durationMs": 5000
}

Response (200 OK when completed) - Batch Job:

{
  "id": "batch_abc123",
  "status": "completed",
  "jobType": "batch_url_scraper",
  "country": "us",
  "urls": ["https://example.com/page1", "https://example.com/page2"],
  "results": [
    {
      "index": 0,
      "url": "https://example.com/page1",
      "status": "completed",
      "html": "<html>...</html>",
      "markdown": "# Content...",
      "generatedJson": {"data": {...}},
      "cached": false,
      "durationMs": 3000
    },
    {
      "index": 1,
      "url": "https://example.com/page2",
      "status": "failed",
      "error": "Connection timeout",
      "durationMs": 5000
    }
  ],
  "createdAt": "2024-01-01T12:00:00Z",
  "completedAt": "2024-01-01T12:00:10Z",
  "durationMs": 10000
}

Response Fields:

status: Job status - pending, processing, completed, or failed

html: Raw HTML content (only present when status is completed)

markdown: Markdown version of the content (optional, when available)

generatedJson: AI-extracted structured JSON data (only present when generateJson: true)

cached: Whether the result was served from cache (true means no credits consumed)

error: Error message (only present when status is failed)

durationMs: Time taken to complete the scrape in milliseconds

results: For batch jobs, an array with one result per URL (only present once all URLs have finished)
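
A batch job can be completed overall while individual URLs have failed, so it helps to separate the two. The sketch below does this for a response shaped like the batch example above; split_batch_results is a hypothetical helper name.

```python
def split_batch_results(batch_job):
    """Return (succeeded, failed) lists from a finished batch job response."""
    succeeded, failed = [], []
    for item in batch_job.get("results", []):
        if item.get("status") == "completed":
            succeeded.append(item)
        else:
            failed.append(item)
    return succeeded, failed
```

The URLs of the failed entries can then be collected with a list comprehension and resubmitted in a new request.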

Get Job History

GET https://api.anakin.io/v1/history?limit=50&offset=0

Request Headers:

X-API-Key: your_api_key

Query Parameters:

limit: Number of jobs to return (1-1000, default: 50)

offset: Pagination offset (default: 0)

Returns an array of jobs with batch aggregation. Child jobs are excluded from the top-level list and aggregated under their parent batch job.
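
To walk the full history, limit and offset can be advanced page by page. This is a sketch with stated assumptions: the response body is a JSON array, offset counts jobs rather than pages, and a page shorter than limit means there are no more jobs. iter_history and fetch_page are hypothetical names; fetch_page stands for any callable that performs the GET and returns the parsed list.

```python
def iter_history(fetch_page, page_size=50):
    """Yield every job from the history endpoint, one page at a time.

    fetch_page is a hypothetical callable: (limit, offset) -> list of jobs.
    """
    offset = 0
    while True:
        page = fetch_page(page_size, offset)
        yield from page
        if len(page) < page_size:
            break  # assumed end-of-data signal: a short or empty page
        offset += page_size
```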

Get Available Countries

GET https://api.anakin.io/v1/countries

Returns a list of all 207 supported country codes for proxy routing.

Response (200 OK):

[
  { "name": "United States", "code": "us" },
  { "name": "United Kingdom", "code": "uk" },
  { "name": "Germany", "code": "de" },
  { "name": "Japan", "code": "jp" },
  ...
]
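
A client can check a country code against this list before submitting a job. The sketch below assumes the list has already been fetched and parsed, and that codes are lowercase as in the sample response; is_supported_country is a hypothetical helper name.

```python
def is_supported_country(countries, code):
    """Check a country code against the parsed response of GET /v1/countries."""
    supported = {entry["code"] for entry in countries}
    return code.lower() in supported
```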

Search API (3 credits)

POST https://api.anakin.io/v1/search

Perform AI-powered web searches using the Perplexity API. Returns search results with citations, snippets, and relevant content.

Request Headers:

X-API-Key: your_api_key
Content-Type: application/json

Request Body:

{
  "prompt": "latest AI developments 2024",
  "limit": 5
}

Response (200 OK):

{
  "jobId": "search_abc123xyz",
  "status": "completed",
  "result": {
    "query": "latest AI developments 2024",
    "answer": "Summary of search results...",
    "results": [
      {
        "url": "https://example.com/article",
        "title": "AI Developments 2024",
        "snippet": "Recent advancements in AI...",
        "date": "2024-01-15",
        "score": 0.95
      }
    ],
    "count": 5
  }
}

Parameters:

prompt: (required) Search query string

limit: (optional) Maximum number of results (default: 5)

Note: Search results are returned immediately (synchronous). Each search costs 3 credits.

Agentic Search API (Async)

POST https://api.anakin.io/v1/agentic-search

Advanced 4-stage search pipeline with automated research, web scraping, and comprehensive analysis. Ideal for in-depth research tasks.

4-Stage Pipeline:

  1. Initial search query refinement
  2. Web search and citation discovery
  3. Automated scraping of top citations
  4. Final comprehensive analysis and synthesis

Request Headers:

X-API-Key: your_api_key
Content-Type: application/json

Request Body:

{
  "prompt": "Comprehensive analysis of quantum computing trends",
  "useBrowser": true
}

Response (202 Accepted):

{
  "jobId": "agentic_abc123xyz",
  "status": "pending",
  "message": "Agentic search job submitted successfully"
}

Poll for results using GET /v1/request/{jobId} (same as scraper jobs).

Parameters:

prompt: (required) Research query or question

useBrowser: (optional) Use browser for citation scraping (default: true)

Note: Agentic search costs 10 credits plus 1 credit per URL scraped (for citations). Jobs are processed asynchronously and may take several minutes.
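
As a worked example of this pricing, a job that scrapes 6 citation URLs would cost 10 + 6 × 1 = 16 credits. The sketch below encodes the same arithmetic; the function and constant names are hypothetical, and the number of URLs scraped is only known once the job has run.

```python
AGENTIC_BASE_COST = 10     # credits per agentic search job
AGENTIC_PER_URL_COST = 1   # credits per citation URL scraped


def agentic_search_cost(urls_scraped):
    """Return the credit cost of one agentic search job."""
    if urls_scraped < 0:
        raise ValueError("urls_scraped must be non-negative")
    return AGENTIC_BASE_COST + AGENTIC_PER_URL_COST * urls_scraped
```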

Code Examples

cURL

curl -X POST https://api.anakin.io/v1/request \
  -H "X-API-Key: your_api_key" \
  -H "Content-Type: application/json" \
  -d '{
    "url": "https://example.com",
    "country": "us",
    "useBrowser": false,
    "generateJson": false
  }'

JavaScript

const response = await fetch('https://api.anakin.io/v1/request', {
  method: 'POST',
  headers: {
    'X-API-Key': 'your_api_key',
    'Content-Type': 'application/json'
  },
  body: JSON.stringify({
    url: 'https://example.com',
    country: 'us',
    useBrowser: false,
    generateJson: true,  // Enable AI JSON extraction
    forceFresh: false    // Use cache if available
  })
});

const data = await response.json();
console.log(data.jobId);

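// Poll for results until the job finishes
const jobId = data.jobId;
let jobData;
do {
  await new Promise((resolve) => setTimeout(resolve, 2000)); // Wait 2 seconds between polls
  const result = await fetch(`https://api.anakin.io/v1/request/${jobId}`, {
    headers: { 'X-API-Key': 'your_api_key' }
  });
  jobData = await result.json();
} while (jobData.status === 'pending' || jobData.status === 'processing');

if (jobData.status === 'completed') {
  console.log(jobData.html); // HTML content
  console.log(jobData.generatedJson); // AI-extracted JSON data
} else {
  console.log(`Job failed: ${jobData.error}`);
}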

Python (Single URL)

import requests
import time

# Submit job
response = requests.post(
    'https://api.anakin.io/v1/request',
    headers={'X-API-Key': 'your_api_key'},
    json={
        'url': 'https://example.com',
        'country': 'us',
        'useBrowser': False,
        'generateJson': True,  # Enable AI JSON extraction
        'forceFresh': False    # Use cache if available
    }
)

data = response.json()
job_id = data['jobId']
print(f"Job submitted: {job_id}")

# Poll for results
while True:
    result = requests.get(
        f'https://api.anakin.io/v1/request/{job_id}',
        headers={'X-API-Key': 'your_api_key'}
    )
    job_data = result.json()

    if job_data['status'] == 'completed':
        print(job_data['html'])  # HTML content
        print(job_data.get('generatedJson'))  # AI-extracted JSON data
        break
    elif job_data['status'] == 'failed':
        print(f"Job failed: {job_data.get('error')}")
        break

    time.sleep(2)  # Wait 2 seconds before polling again

Python (Batch URLs)

import requests
import time

# Submit batch job (up to 10 URLs)
response = requests.post(
    'https://api.anakin.io/v1/request',
    headers={'X-API-Key': 'your_api_key'},
    json={
        'job_type': 'batch_url_scraper',
        'urls': [
            'https://example.com/page1',
            'https://example.com/page2',
            'https://example.com/page3'
        ],
        'country': 'us',
        'useBrowser': False,
        'generateJson': True
    }
)

data = response.json()
batch_job_id = data['jobId']
print(f"Batch job submitted: {batch_job_id}")

# Poll for batch results
while True:
    result = requests.get(
        f'https://api.anakin.io/v1/request/{batch_job_id}',
        headers={'X-API-Key': 'your_api_key'}
    )
    batch_data = result.json()

    if batch_data['status'] == 'completed':
        # All URLs have been processed
        for url_result in batch_data['results']:
            print(f"URL {url_result['index']}: {url_result['url']}")
            if url_result['status'] == 'completed':
                print(f"  HTML length: {len(url_result['html'])}")
            else:
                print(f"  Failed: {url_result.get('error')}")
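        break
    elif batch_data['status'] == 'failed':
        # Stop polling if the batch job itself fails, instead of looping forever
        print(f"Batch job failed: {batch_data.get('error')}")
        break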

    time.sleep(3)  # Wait 3 seconds before polling again

Search API (Python)

import requests

# Perform a search (synchronous - returns immediately)
response = requests.post(
    'https://api.anakin.io/v1/search',
    headers={'X-API-Key': 'your_api_key'},
    json={
        'prompt': 'latest AI developments 2024',
        'limit': 5
    }
)

data = response.json()
print(f"Search completed: {data['status']}")
print(f"Query: {data['result']['query']}")
print(f"Answer: {data['result']['answer']}")

# Process search results
for result in data['result']['results']:
    print(f"\nTitle: {result['title']}")
    print(f"URL: {result['url']}")
    print(f"Snippet: {result['snippet']}")
    print(f"Score: {result.get('score', 'N/A')}")

Agentic Search API (Python)

import requests
import time

# Submit agentic search job (asynchronous)
response = requests.post(
    'https://api.anakin.io/v1/agentic-search',
    headers={'X-API-Key': 'your_api_key'},
    json={
        'prompt': 'Comprehensive analysis of quantum computing trends',
        'useBrowser': True
    }
)

data = response.json()
job_id = data['jobId']
print(f"Agentic search job submitted: {job_id}")

# Poll for results (may take several minutes)
while True:
    result = requests.get(
        f'https://api.anakin.io/v1/request/{job_id}',
        headers={'X-API-Key': 'your_api_key'}
    )
    job_data = result.json()
    
    print(f"Status: {job_data['status']}")

    if job_data['status'] == 'completed':
        # Access comprehensive research results
        print("\n=== Research Complete ===")
        print(job_data.get('html'))  # Full analysis
        break
    elif job_data['status'] == 'failed':
        print(f"Job failed: {job_data.get('error')}")
        break

    time.sleep(10)  # Wait 10 seconds before polling again

Request Parameters

url (required for single URL)

The URL of the webpage to scrape. Must be a valid HTTP/HTTPS URL. Required when job_type is "url_scraper" (default).

urls (required for batch)

Array of URLs to scrape (1-10 URLs). Required when job_type is "batch_url_scraper".

job_type (optional)

Job type: "url_scraper" (default), "batch_url_scraper", or "browser_scraper". Determines how the request is processed.

country (optional)

Country code for proxy location (e.g., "us", "uk", "de", "jp"). Defaults to "us". 207 countries supported. Use GET /v1/countries to see all options.

useBrowser (optional)

If true, uses a full browser (headless Chrome with Playwright) instead of a plain HTTP client. Defaults to false. Best for JavaScript-heavy sites, SPAs, or sites with dynamic content.

generateJson (optional)

If true, uses AI to extract structured JSON data from the scraped content. Defaults to false. Returned in the generatedJson field.

forceFresh (optional)

If true, bypasses cache and scrapes fresh content. Defaults to false. When false, cached results are served if available (no credits consumed).
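
The rules above can be checked on the client before a request is sent. The following is a sketch only: validate_scrape_request is a hypothetical helper, the server remains the authority on what is valid, and the requirements for "browser_scraper" are not documented here, so that job type is accepted without further checks.

```python
def validate_scrape_request(body):
    """Return a list of problems found in a /v1/request body (empty if none)."""
    problems = []
    job_type = body.get("job_type", "url_scraper")  # documented default
    if job_type not in ("url_scraper", "batch_url_scraper", "browser_scraper"):
        problems.append(f"unknown job_type: {job_type!r}")
    if job_type == "url_scraper":
        url = body.get("url")
        if not isinstance(url, str) or not url.startswith(("http://", "https://")):
            problems.append("url must be an HTTP/HTTPS URL")
    if job_type == "batch_url_scraper":
        urls = body.get("urls") or []
        if not 1 <= len(urls) <= 10:
            problems.append("urls must contain 1-10 URLs")
    for flag in ("useBrowser", "generateJson", "forceFresh"):
        if flag in body and not isinstance(body[flag], bool):
            problems.append(f"{flag} must be a boolean")
    return problems
```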

Error Responses

400 Bad Request

{
  "error": "Invalid URL format"
}

Invalid request parameters or malformed URL.

401 Unauthorized

{
  "error": "Unauthorized"
}

Missing or invalid API key.

402 Payment Required

{
  "error": "Insufficient credits. Please upgrade your plan or purchase more credits."
}

Account has run out of credits.

404 Not Found

{
  "error": "Job not found"
}

The requested job ID does not exist.

503 Service Unavailable

{
  "error": "Scraper service is unavailable. Please try again later."
}

The scraper service is temporarily unavailable.
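
One way to handle these responses in client code is to turn any non-2xx status into an exception that carries the error message. This is a sketch: AnakinAPIError and check_response are hypothetical names, and treating only 503 as retryable is a design choice made here based on its "try again later" message, not documented API behavior.

```python
class AnakinAPIError(Exception):
    """Raised for non-2xx responses; hypothetical class for this sketch."""

    def __init__(self, status_code, message, retryable):
        super().__init__(f"{status_code}: {message}")
        self.status_code = status_code
        self.message = message
        self.retryable = retryable


def check_response(status_code, body):
    """Return body for 2xx responses, otherwise raise AnakinAPIError."""
    if 200 <= status_code < 300:
        return body
    message = "Unknown error"
    if isinstance(body, dict):
        message = body.get("error", message)
    raise AnakinAPIError(status_code, message, retryable=(status_code == 503))
```

With the requests library, this could be called as check_response(response.status_code, response.json()) after each request.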