cURL

Scrape your first page from the command line using cURL — no language runtime required, just the API directly.

Hit the AnakinScraper REST API with curl for quick tests, shell scripts, CI pipelines, or any environment without a language runtime.


Authentication

Set your API key as an environment variable. Get a key from the Dashboard.

export ANAKIN_API_KEY=ak-your-key-here

The base URL is https://api.anakin.io/v1. Every request authenticates via the X-API-Key header.


Install

curl is preinstalled on macOS, Linux, and most CI runners. The polling script below also uses jq for JSON parsing — install it once if you don't have it:

# macOS
brew install jq

# Debian / Ubuntu
sudo apt-get install jq
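The polling script below uses jq to pull single fields out of JSON responses. A quick sanity check of that pattern on a sample payload (the payload here is illustrative, not a live API response):

```shell
# jq -r extracts a field as raw text (no surrounding quotes).
sample='{"jobId": "job_abc123xyz", "status": "pending"}'
echo "$sample" | jq -r '.jobId'    # prints: job_abc123xyz
```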

Submit a single request

A fire-and-forget submit is useful when you'll poll separately or check results later in the dashboard:

curl -X POST https://api.anakin.io/v1/url-scraper \
  -H "X-API-Key: $ANAKIN_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"url": "https://example.com"}'

Response:

{
  "jobId": "job_abc123xyz",
  "status": "pending"
}

Poll the job:

curl https://api.anakin.io/v1/url-scraper/job_abc123xyz \
  -H "X-API-Key: $ANAKIN_API_KEY"

Submit + poll in one script

Save as scrape.sh and make it executable (chmod +x scrape.sh):

#!/bin/bash
set -euo pipefail
: "${ANAKIN_API_KEY:?ANAKIN_API_KEY is not set}"

BASE="https://api.anakin.io/v1"
URL="${1:-https://example.com}"

submitted=$(curl -sS -X POST "$BASE/url-scraper" \
  -H "X-API-Key: $ANAKIN_API_KEY" \
  -H "Content-Type: application/json" \
  -d "{\"url\": \"$URL\"}")
job_id=$(echo "$submitted" | jq -r '.jobId')
[ -n "$job_id" ] && [ "$job_id" != "null" ] \
  || { echo "submit failed: $submitted" >&2; exit 1; }

for _ in $(seq 1 60); do
  job=$(curl -sS "$BASE/url-scraper/$job_id" -H "X-API-Key: $ANAKIN_API_KEY") \
    || { sleep 3; continue; }     # retry transient errors
  status=$(echo "$job" | jq -r '.status')
  case "$status" in
    completed) echo "$job" | jq -r '.markdown'; exit 0 ;;
    failed)    echo "scrape failed: $(echo "$job" | jq -r '.error')" >&2; exit 1 ;;
  esac
  sleep 3
done
echo "timed out after 3 minutes" >&2
exit 1

Run it:

./scrape.sh https://example.com
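One caveat: the script interpolates $URL straight into the JSON body, which breaks if the URL contains quotes or backslashes. A safer sketch builds the body with jq, which handles JSON escaping:

```shell
# Build the request body with jq so special characters in the URL
# are escaped correctly instead of corrupting the JSON.
URL='https://example.com/search?q="widgets"'
body=$(jq -n --arg url "$URL" '{url: $url}')
echo "$body"
```

Pass it to curl with -d "$body" in place of the interpolated string.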

What this does

  1. Submits the URL to /url-scraper and gets back a jobId.
  2. Polls /url-scraper/{jobId} every 3 seconds (up to 60 attempts = 3 minutes).
  3. Retries transient curl errors silently — only surfaces real failures.
  4. Prints the final markdown when the job completes.

Most jobs finish in 3–15 seconds.
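To change the 3-minute ceiling, derive the attempt count from a timeout budget instead of hard-coding 60 (variable names here are illustrative):

```shell
# 60 attempts x 3 s between polls = 180 s. Adjust TIMEOUT_SECS
# and the attempt count follows.
TIMEOUT_SECS=180
POLL_INTERVAL=3
ATTEMPTS=$(( TIMEOUT_SECS / POLL_INTERVAL ))
echo "$ATTEMPTS"   # prints: 60
```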


Go further

Extract structured JSON with AI

Add generateJson: true to the submit body to have AI return structured data:

curl -sS -X POST "$BASE/url-scraper" \
  -H "X-API-Key: $ANAKIN_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"url": "https://news.ycombinator.com", "generateJson": true}'

The completed response includes a generatedJson field with structured data inferred from the page.
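You can pull that field out the same way the script extracts markdown. A sketch on a sample payload (the response is abbreviated and its contents are illustrative):

```shell
# Illustrative completed response; real payloads carry more fields.
job='{"status": "completed", "generatedJson": {"topStory": "Show HN"}}'
echo "$job" | jq '.generatedJson'
```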

Scrape JavaScript-heavy sites

For SPAs and dynamically loaded pages, add useBrowser: true:

curl -sS -X POST "$BASE/url-scraper" \
  -H "X-API-Key: $ANAKIN_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"url": "https://example.com/spa", "useBrowser": true}'

Only use browser mode when needed — standard scraping is faster and cheaper.

Search the web with AI

The Search API is synchronous — no polling needed:

curl -sS -X POST https://api.anakin.io/v1/search \
  -H "X-API-Key: $ANAKIN_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"prompt": "best web scraping libraries 2025"}'

Use it from CI

Drop the polling script into a GitHub Actions step or a cron job. The ANAKIN_API_KEY should come from a secret store (GitHub Secrets, Vault, Doppler, etc.) — never hard-code it:

# .github/workflows/scrape.yml
- name: Scrape pricing page
  env:
    ANAKIN_API_KEY: ${{ secrets.ANAKIN_API_KEY }}
  run: ./scrape.sh https://example.com/pricing > pricing.md
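Outside GitHub Actions, the same script runs from cron. An illustrative crontab entry (paths are placeholders; the key is sourced from a restricted env file rather than written into the crontab):

```
# m h dom mon dow  command
0 6 * * * . /etc/anakin/env && /opt/scripts/scrape.sh https://example.com/pricing > /var/data/pricing.md
```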

Next steps