Rust

Scrape your first page from Rust using the lightweight ureq crate: synchronous, minimal dependencies, and no async runtime required.

Submit a scrape, poll for the result, and handle transient errors — using ureq (a minimal sync HTTP client) and serde_json.


Authentication

Set your API key as an environment variable. Get a key from the Dashboard.

export ANAKIN_API_KEY=ak-your-key-here

The base URL is https://api.anakin.io/v1. Every request authenticates via the X-API-Key header.
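
Every request in this guide attaches the key the same way. A minimal sketch using ureq (the <job-id> path segment is a placeholder):

let api_key = std::env::var("ANAKIN_API_KEY")?;
let job: serde_json::Value = ureq::get("https://api.anakin.io/v1/url-scraper/<job-id>")
    .set("X-API-Key", &api_key)
    .call()?
    .into_json()?;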


Install

Rust's stdlib doesn't ship an HTTP client. ureq is the lightest sensible choice: synchronous, no async runtime, a single small crate. If you prefer async or already use reqwest, the same logic translates directly (see the reqwest section below).

Add to Cargo.toml:

[dependencies]
ureq = { version = "2", features = ["json"] }
serde_json = "1"

Scrape a page

Save as src/main.rs:

use std::env;
use std::thread::sleep;
use std::time::Duration;
use serde_json::{json, Value};

fn main() -> Result<(), Box<dyn std::error::Error>> {
    let api_key = env::var("ANAKIN_API_KEY")
        .map_err(|_| "ANAKIN_API_KEY is not set")?;
    let base = "https://api.anakin.io/v1";
    let agent = ureq::AgentBuilder::new()
        .timeout(Duration::from_secs(30))
        .build();

    let submitted: Value = agent.post(&format!("{base}/url-scraper"))
        .set("X-API-Key", &api_key)
        .send_json(json!({ "url": "https://example.com" }))?
        .into_json()?;
    let job_id = submitted["jobId"].as_str().ok_or("missing jobId")?;

    for _ in 0..60 {
        let resp = agent.get(&format!("{base}/url-scraper/{job_id}"))
            .set("X-API-Key", &api_key)
            .call();
        let job: Value = match resp {
            Ok(r) => r.into_json()?,
            // Transient failures (network errors, 5xx): back off and retry.
            Err(ureq::Error::Transport(_)) => { sleep(Duration::from_secs(3)); continue; }
            Err(ureq::Error::Status(code, _)) if code >= 500 => { sleep(Duration::from_secs(3)); continue; }
            // 4xx (bad key, unknown job) is a real failure: surface it immediately.
            Err(e) => return Err(e.into()),
        };
        match job["status"].as_str() {
            Some("completed") => {
                println!("{}", job["markdown"].as_str().unwrap_or(""));
                return Ok(());
            }
            Some("failed") => {
                return Err(format!("scrape failed: {}", job["error"]).into());
            }
            _ => {}
        }
        sleep(Duration::from_secs(3));
    }
    Err("timed out after 3 minutes".into())
}

Run it:

cargo run

What this does

  1. Submits https://example.com to /url-scraper and gets back a jobId.
  2. Polls /url-scraper/{jobId} every 3 seconds (up to 60 attempts = 3 minutes).
  3. Retries transient failures (network errors and 5xx responses) silently; real errors such as a bad API key or unknown job ID surface immediately.
  4. Prints the final markdown when the job completes.

Most jobs finish in 3–15 seconds.


Go further

Extract structured JSON with AI

Add generateJson: true to the submit body to have AI return structured data:

let submitted: Value = agent.post(&format!("{base}/url-scraper"))
    .set("X-API-Key", &api_key)
    .send_json(json!({
        "url": "https://news.ycombinator.com",
        "generateJson": true
    }))?
    .into_json()?;

The completed response includes a generatedJson field with structured data inferred from the page.
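
When the job completes, read generatedJson instead of markdown. A sketch of the changed match arm in the polling loop above:

Some("completed") => {
    // Pretty-print whatever structure the AI inferred from the page.
    println!("{}", serde_json::to_string_pretty(&job["generatedJson"])?);
    return Ok(());
}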

Scrape JavaScript-heavy sites

For SPAs and dynamically-loaded pages, add useBrowser: true:

.send_json(json!({
    "url": "https://example.com/spa",
    "useBrowser": true
}))?

Only use browser mode when needed — standard scraping is faster and cheaper.


Async with reqwest

If you're already on Tokio, swap ureq for reqwest and .await the requests — the loop structure is identical. The polling sleep becomes tokio::time::sleep(Duration::from_secs(3)).await.
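
A sketch of the full translation, assuming reqwest 0.12 with the json feature and tokio 1 with the macros, rt-multi-thread, and time features enabled (transient-error retries omitted for brevity):

use std::time::Duration;
use serde_json::{json, Value};

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    let api_key = std::env::var("ANAKIN_API_KEY")?;
    let base = "https://api.anakin.io/v1";
    let client = reqwest::Client::builder()
        .timeout(Duration::from_secs(30))
        .build()?;

    // Submit the scrape job.
    let submitted: Value = client.post(format!("{base}/url-scraper"))
        .header("X-API-Key", &api_key)
        .json(&json!({ "url": "https://example.com" }))
        .send().await?
        .error_for_status()?
        .json().await?;
    let job_id = submitted["jobId"].as_str().ok_or("missing jobId")?.to_string();

    // Polling structure is identical to the sync version above.
    for _ in 0..60 {
        let job: Value = client.get(format!("{base}/url-scraper/{job_id}"))
            .header("X-API-Key", &api_key)
            .send().await?
            .error_for_status()?
            .json().await?;
        match job["status"].as_str() {
            Some("completed") => {
                println!("{}", job["markdown"].as_str().unwrap_or(""));
                return Ok(());
            }
            Some("failed") => return Err(format!("scrape failed: {}", job["error"]).into()),
            _ => {}
        }
        tokio::time::sleep(Duration::from_secs(3)).await;
    }
    Err("timed out after 3 minutes".into())
}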


Next steps