PHP

Scrape your first page from PHP using the built-in cURL extension — no Composer dependencies required.

You'll submit a scrape, poll for the result, and handle transient errors along the way.


Authentication

Set your API key as an environment variable. Get a key from the Dashboard.

export ANAKIN_API_KEY=ak-your-key-here

The base URL is https://api.anakin.io/v1. Every request authenticates via the X-API-Key header.
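
To sanity-check your key before writing anything reusable, the smallest authenticated call is a single POST to /url-scraper (a throwaway snippet; the quickstart below wraps this properly):

<?php
// Minimal smoke test: one authenticated submit, raw JSON response echoed.
$ch = curl_init("https://api.anakin.io/v1/url-scraper");
curl_setopt_array($ch, [
    CURLOPT_RETURNTRANSFER => true,
    CURLOPT_POST           => true,
    CURLOPT_HTTPHEADER     => [
        "X-API-Key: " . getenv("ANAKIN_API_KEY"),
        "Content-Type: application/json",
    ],
    CURLOPT_POSTFIELDS     => json_encode(["url" => "https://example.com"]),
]);
echo curl_exec($ch), PHP_EOL; // expect a JSON body containing a jobId
curl_close($ch);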


Install

No Composer packages are needed: the curl and json extensions ship with PHP and are enabled by default in most builds. The example below uses a throw expression, so it requires PHP 8.0 or later.

php --version  # confirm 8.0+
php -m | grep -E "curl|json"  # confirm extensions present
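
If you are stuck on PHP 7.4, the throw expression is the only 8.0-specific construct in the example; a plain statement is equivalent:

// PHP 7.4-compatible replacement for the first line of quickstart.php.
$apiKey = getenv("ANAKIN_API_KEY");
if ($apiKey === false || $apiKey === "") {
    throw new Exception("ANAKIN_API_KEY is not set");
}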

Scrape a page

Save as quickstart.php:

<?php
$apiKey = getenv("ANAKIN_API_KEY") ?: throw new Exception("ANAKIN_API_KEY is not set");
$base   = "https://api.anakin.io/v1";

function request(string $method, string $path, ?array $body = null): ?array {
    global $apiKey, $base;
    $ch = curl_init($base . $path);
    curl_setopt_array($ch, [
        CURLOPT_RETURNTRANSFER => true,
        CURLOPT_CUSTOMREQUEST  => $method,
        CURLOPT_HTTPHEADER     => ["X-API-Key: $apiKey", "Content-Type: application/json"],
        CURLOPT_TIMEOUT        => 30,
    ]);
    if ($body !== null) {
        curl_setopt($ch, CURLOPT_POSTFIELDS, json_encode($body));
    }
    $response = curl_exec($ch);
    $err = curl_error($ch);
    curl_close($ch);
    // A transport-level failure returns null; callers treat it as transient.
    return $err ? null : json_decode($response, true);
}

function scrape(string $url): array {
    $submitted = request("POST", "/url-scraper", ["url" => $url]);
    if ($submitted === null || !isset($submitted["jobId"])) {
        throw new Exception("failed to submit scrape job");
    }
    $jobId = $submitted["jobId"];

    for ($i = 0; $i < 60; $i++) {
        $job = request("GET", "/url-scraper/" . $jobId);
        if ($job === null) {
            sleep(3); // retry transient errors
            continue;
        }
        if ($job["status"] === "completed") return $job;
        if ($job["status"] === "failed") {
            throw new Exception("scrape failed: " . ($job["error"] ?? ""));
        }
        sleep(3);
    }
    throw new Exception("timed out after 3 minutes");
}

$job = scrape("https://example.com");
echo $job["markdown"];

Run it:

php quickstart.php

What this does

  1. Submits https://example.com to /url-scraper and gets back a jobId.
  2. Polls /url-scraper/{jobId} every 3 seconds (up to 60 attempts = 3 minutes).
  3. Retries transient cURL errors silently — only surfaces real failures.
  4. Echoes the final markdown when the job completes.

Most jobs finish in 3–15 seconds.


Go further

Extract structured JSON with AI

Add generateJson: true to the submit body to have AI return structured data:

$submitted = request("POST", "/url-scraper", [
    "url"          => "https://news.ycombinator.com",
    "generateJson" => true,
]);

The completed response includes a generatedJson field with structured data inferred from the page.
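
Its shape varies by page, so pretty-printing is a sensible first look (assuming $job is a job you polled to completion as in the quickstart):

// generatedJson is whatever structure the AI inferred from the page.
echo json_encode($job["generatedJson"], JSON_PRETTY_PRINT), PHP_EOL;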

Scrape JavaScript-heavy sites

For SPAs and dynamically-loaded pages, add useBrowser: true:

$submitted = request("POST", "/url-scraper", [
    "url"        => "https://example.com/spa",
    "useBrowser" => true,
]);

Only use browser mode when needed — standard scraping is faster and cheaper.
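
If you want one helper that covers both options, a small generalization of scrape() keeps the polling loop in one place (a sketch; scrapeWith is our name, not an API concept):

// Like scrape(), but forwards extra submit fields such as
// generateJson or useBrowser to the POST body.
function scrapeWith(string $url, array $options = []): array {
    $submitted = request("POST", "/url-scraper", array_merge(["url" => $url], $options));
    if ($submitted === null || !isset($submitted["jobId"])) {
        throw new Exception("failed to submit scrape job");
    }
    for ($i = 0; $i < 60; $i++) {
        $job = request("GET", "/url-scraper/" . $submitted["jobId"]);
        if ($job !== null && $job["status"] === "completed") return $job;
        if ($job !== null && $job["status"] === "failed") {
            throw new Exception("scrape failed: " . ($job["error"] ?? ""));
        }
        sleep(3); // a null response is a transient error; just poll again
    }
    throw new Exception("timed out after 3 minutes");
}

$job = scrapeWith("https://example.com/spa", ["useBrowser" => true]);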


Use it from Laravel

Drop the request and scrape functions into a service class and call it from a queued job; the polling loop blocks for up to 3 minutes per URL, so background jobs are the natural fit:

// app/Jobs/ScrapeUrlJob.php
namespace App\Jobs;

use App\Models\Page;
use App\Services\AnakinScraper; // service class sketched below
use Illuminate\Bus\Queueable;
use Illuminate\Contracts\Queue\ShouldQueue;

class ScrapeUrlJob implements ShouldQueue {
    use Queueable;

    public function __construct(public string $url) {}

    public function handle(AnakinScraper $scraper): void {
        $job = $scraper->scrape($this->url);
        Page::create(["url" => $this->url, "markdown" => $job["markdown"]]);
    }
}
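
A minimal sketch of that AnakinScraper service (the App\Services namespace and constructor wiring are illustrative, not part of the API):

// app/Services/AnakinScraper.php
namespace App\Services;

class AnakinScraper {
    private const BASE = "https://api.anakin.io/v1";

    public function __construct(private string $apiKey) {}

    // Same submit-then-poll flow as quickstart.php, wrapped for injection.
    public function scrape(string $url): array {
        $submitted = $this->request("POST", "/url-scraper", ["url" => $url]);
        if ($submitted === null || !isset($submitted["jobId"])) {
            throw new \Exception("failed to submit scrape job");
        }
        for ($i = 0; $i < 60; $i++) {
            $job = $this->request("GET", "/url-scraper/" . $submitted["jobId"]);
            if ($job !== null && $job["status"] === "completed") return $job;
            if ($job !== null && $job["status"] === "failed") {
                throw new \Exception("scrape failed: " . ($job["error"] ?? ""));
            }
            sleep(3); // a null response is a transient error; just poll again
        }
        throw new \Exception("timed out after 3 minutes");
    }

    private function request(string $method, string $path, ?array $body = null): ?array {
        $ch = curl_init(self::BASE . $path);
        curl_setopt_array($ch, [
            CURLOPT_RETURNTRANSFER => true,
            CURLOPT_CUSTOMREQUEST  => $method,
            CURLOPT_HTTPHEADER     => ["X-API-Key: " . $this->apiKey, "Content-Type: application/json"],
            CURLOPT_TIMEOUT        => 30,
        ]);
        if ($body !== null) {
            curl_setopt($ch, CURLOPT_POSTFIELDS, json_encode($body));
        }
        $response = curl_exec($ch);
        $err = curl_error($ch);
        curl_close($ch);
        return $err ? null : json_decode($response, true);
    }
}

Bind it once in a service provider so Laravel can inject it into handle(), for example $this->app->singleton(AnakinScraper::class, fn () => new AnakinScraper(config('services.anakin.key'))); with the key read from config/services.php.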

Next steps