Node.js

Scrape your first page from Node.js using the built-in fetch API — no npm install required.

Submit a scrape, poll for the result, and handle transient errors — using fetch, which is built into Node.js 18+ and every modern browser. No axios, no node-fetch, no dependencies.


Authentication

Set your API key as an environment variable. Get a key from the Dashboard.

export ANAKIN_API_KEY=ak-your-key-here

The base URL is https://api.anakin.io/v1. Every request authenticates via the X-API-Key header.
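
For a quick sanity check from the shell, you can submit a scrape directly with curl; it uses the same endpoint, header, and body field as the code below:

curl -X POST "https://api.anakin.io/v1/url-scraper" \
  -H "X-API-Key: $ANAKIN_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"url": "https://example.com"}'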


Install

fetch is built into Node.js 18+, Bun, and Deno. Confirm your Node version:

node --version  # v18.0.0 or later

No npm packages are needed for the basic flow. Just use an .mjs file (or set "type": "module" in package.json) so top-level await works.
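
If you prefer a plain .js entry point, a minimal package.json that opts the project into ES modules (and therefore top-level await) looks like this:

{
  "type": "module"
}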


Scrape a page

Save as quickstart.mjs:

const BASE = "https://api.anakin.io/v1"
const API_KEY = process.env.ANAKIN_API_KEY
if (!API_KEY) throw new Error("ANAKIN_API_KEY is not set")

// Minimal JSON helper: sends method/path/body with the API key header and returns the parsed
// body, or null on any thrown error (network failure, the 30-second timeout, a non-JSON
// response) so callers can treat null as retryable.
async function request(method, path, body) {
  try {
    const resp = await fetch(BASE + path, {
      method,
      headers: { "X-API-Key": API_KEY, "Content-Type": "application/json" },
      body: body ? JSON.stringify(body) : undefined,
      signal: AbortSignal.timeout(30_000),
    })
    return await resp.json()
  } catch {
    return null // caller retries on null
  }
}

async function scrape(url) {
  const submitted = await request("POST", "/url-scraper", { url })
  if (!submitted?.jobId) throw new Error("submit failed: check your API key and network")
  const jobId = submitted.jobId

  for (let i = 0; i < 60; i++) {
    const job = await request("GET", `/url-scraper/${jobId}`)
    if (!job) {
      await new Promise(r => setTimeout(r, 3000)) // retry transient errors
      continue
    }
    if (job.status === "completed") return job
    if (job.status === "failed") {
      throw new Error(`scrape failed: ${job.error}`)
    }
    await new Promise(r => setTimeout(r, 3000))
  }
  throw new Error("timed out after 3 minutes")
}

const job = await scrape("https://example.com")
console.log(job.markdown)

Run it:

node quickstart.mjs

What this does

  1. Submits https://example.com to /url-scraper and gets back a jobId.
  2. Polls /url-scraper/{jobId} every 3 seconds (up to 60 attempts = 3 minutes).
  3. Retries transient fetch errors silently — only surfaces real failures.
  4. Prints the final markdown when the job completes.

Most jobs finish in 3–15 seconds.


Go further

Extract structured JSON with AI

Add generateJson: true to the submit body to have the AI return structured data:

const submitted = await request("POST", "/url-scraper", {
  url: "https://news.ycombinator.com",
  generateJson: true,
})

The completed response includes a generatedJson field with structured data inferred from the page.
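
When that job completes, the structured data sits on the same polled job object. Its exact shape depends on the page, so this just pretty-prints whatever comes back:

console.log(JSON.stringify(job.generatedJson, null, 2))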

Scrape JavaScript-heavy sites

For SPAs and dynamically-loaded pages, add useBrowser: true:

const submitted = await request("POST", "/url-scraper", {
  url: "https://example.com/spa",
  useBrowser: true,
})

Only use browser mode when needed — standard scraping is faster and cheaper.
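
Either flag can be threaded through the quickstart helper instead of editing its submit call each time. A small sketch that reuses the request helper from above (scrapeWith is a name introduced here, not part of the API):

// Sketch: forward extra request fields (generateJson, useBrowser, ...) to the submit call
async function scrapeWith(url, options = {}) {
  const submitted = await request("POST", "/url-scraper", { url, ...options })
  if (!submitted?.jobId) throw new Error("submit failed")

  for (let i = 0; i < 60; i++) {
    const job = await request("GET", `/url-scraper/${submitted.jobId}`)
    if (job?.status === "completed") return job
    if (job?.status === "failed") throw new Error(`scrape failed: ${job.error}`)
    await new Promise(r => setTimeout(r, 3000)) // null (transient error) or still pending: wait and retry
  }
  throw new Error("timed out after 3 minutes")
}

const hn = await scrapeWith("https://news.ycombinator.com", { generateJson: true })
console.log(hn.generatedJson)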


Use it from Next.js / Express / NestJS

Drop the request and scrape functions into a service module and call them from a queued background job (BullMQ, Inngest, Trigger.dev). The polling loop awaits up to 3 minutes per URL, so background execution is the natural fit. For Next.js specifically, run it from a route handler with export const maxDuration = 300 if your platform supports it, or push the work to a queue:

// app/api/scrape/route.js
import { Queue } from "bullmq"
const scrapeQueue = new Queue("scrape")

export async function POST(req) {
  const { url } = await req.json()
  await scrapeQueue.add("scrape", { url })
  return Response.json({ queued: true })
}
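
The route handler only enqueues the URL; a separate worker process performs the scrape. A minimal BullMQ worker sketch, assuming scrape() is exported from a service module (the ./scrape.mjs path and the Redis connection settings are placeholders):

// worker.mjs: picks up queued URLs and runs the quickstart scrape() helper
import { Worker } from "bullmq"
import { scrape } from "./scrape.mjs"

new Worker(
  "scrape",
  async (job) => {
    const result = await scrape(job.data.url)
    return result.markdown // stored as the job's return value
  },
  { connection: { host: "127.0.0.1", port: 6379 } } // adjust to your Redis instance
)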

TypeScript

For typed responses, define a minimal type and cast — no SDK needed:

type Job = {
  id: string
  status: "pending" | "completed" | "failed"
  markdown?: string
  generatedJson?: unknown
  error?: string
}

const job = await scrape("https://example.com") as Job
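
If you'd rather not cast at every call site, wrap the helper once (scrapeTyped is a name introduced here):

async function scrapeTyped(url: string): Promise<Job> {
  return (await scrape(url)) as Job
}

const typed = await scrapeTyped("https://example.com")
if (typed.status === "completed") console.log(typed.markdown)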

Next steps