Ruby

Scrape your first page from Ruby using only the standard library — no gems or SDK required.

Submit a scrape, poll for the result, and handle transient errors — all with net/http and json from Ruby's standard library.


Authentication

Set your API key as an environment variable. Get a key from the Dashboard.

export ANAKIN_API_KEY=ak-your-key-here

The base URL is https://api.anakin.io/v1. Every request authenticates via the X-API-Key header.


Install

No gems needed — net/http and json ship with Ruby. This works in any Ruby 2.7+ project, including Rails.
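If the script might run under an older interpreter, a one-line guard at the top fails fast with a clear message. A sketch — `Gem::Version` comes from RubyGems, which Ruby loads by default:

```ruby
# Abort early on Rubies older than the 2.7 this quickstart targets.
min = Gem::Version.new("2.7")
abort "Ruby 2.7+ required, found #{RUBY_VERSION}" if Gem::Version.new(RUBY_VERSION) < min

puts "Ruby #{RUBY_VERSION} is new enough"
```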


Scrape a page

Save as quickstart.rb:

require "net/http"
require "json"
require "uri"

BASE_URL = "https://api.anakin.io/v1"
API_KEY  = ENV.fetch("ANAKIN_API_KEY") { raise "ANAKIN_API_KEY is not set" }

def request(method, path, body = nil)
  uri  = URI(BASE_URL + path)
  http = Net::HTTP.new(uri.host, uri.port)
  http.use_ssl = true
  http.read_timeout = 30

  klass = method == "POST" ? Net::HTTP::Post : Net::HTTP::Get
  req = klass.new(uri.request_uri)
  req["X-API-Key"]    = API_KEY
  req["Content-Type"] = "application/json"
  req.body = body.to_json if body

  res = http.request(req)
  raise "HTTP #{res.code}: #{res.body}" unless res.is_a?(Net::HTTPSuccess)
  JSON.parse(res.body)
end

def scrape(url)
  submitted = request("POST", "/url-scraper", { url: url })
  job_id    = submitted["jobId"]

  60.times do
    begin
      job = request("GET", "/url-scraper/#{job_id}")
    rescue StandardError
      sleep 3 # retry transient errors
      next
    end

    case job["status"]
    when "completed" then return job
    when "failed"    then raise "scrape failed: #{job['error']}"
    end
    sleep 3
  end
  raise "timed out after 3 minutes"
end

job = scrape("https://example.com")
puts job["markdown"]

Run it:

ruby quickstart.rb

What this does

  1. Submits https://example.com to /url-scraper and gets back a jobId.
  2. Polls /url-scraper/{jobId} every 3 seconds (up to 60 attempts = 3 minutes).
  3. Retries transient network errors silently — only surfaces real failures.
  4. Prints the final markdown when the job completes.

Most jobs finish in 3–15 seconds.
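The loop above polls on a fixed 3-second interval, which is fine for most jobs. If you expect slow outliers, a capped exponential schedule polls quickly at first and eases off later. A sketch — the schedule itself is a choice on your side, not part of the API:

```ruby
# Capped exponential poll schedule: 1s, 2s, 4s, 8s, then 15s thereafter.
def poll_delays(attempts, base: 1, cap: 15)
  (0...attempts).map { |i| [base * (2**i), cap].min }
end

delays = poll_delays(6)
p delays      # => [1, 2, 4, 8, 15, 15]
p delays.sum  # => 45 (total seconds before giving up)
```

To use it, iterate over the delays instead of `60.times` and replace the fixed `sleep 3` with `sleep delay`.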


Go further

Extract structured JSON with AI

Pass generateJson: true to have AI return structured data:

submitted = request("POST", "/url-scraper", {
  url: "https://news.ycombinator.com",
  generateJson: true
})

The completed response includes a generatedJson field with structured data inferred from the page.
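The shape of generatedJson depends entirely on the page, so the field names below are hypothetical. Hash#dig is a safe way to reach into it without raising on keys that turn out to be absent:

```ruby
require "json"

# Hypothetical completed-job payload; the real fields depend on the scraped page.
job = JSON.parse(<<~PAYLOAD)
  {
    "status": "completed",
    "generatedJson": { "title": "Hacker News", "stories": [] }
  }
PAYLOAD

puts job.dig("generatedJson", "title")           # => Hacker News
puts job.dig("generatedJson", "missing", "key")  # nil: prints a blank line, no error
```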

Scrape JavaScript-heavy sites

For SPAs and dynamically-loaded pages, add useBrowser: true:

submitted = request("POST", "/url-scraper", {
  url: "https://example.com/spa",
  useBrowser: true
})

Only use browser mode when needed — standard scraping is faster and cheaper.


Use it from Rails

Drop the request and scrape methods into a service object (e.g. app/services/anakin_scraper.rb) and call from a job:

# app/jobs/scrape_url_job.rb
class ScrapeUrlJob < ApplicationJob
  queue_as :default

  def perform(url)
    job = AnakinScraper.scrape(url)
    Page.create!(url: url, markdown: job["markdown"])
  end
end

Background jobs are the natural fit — the polling loop blocks for up to 3 minutes per URL.
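For completeness, here is one way to shape the service object referenced above — a sketch that wraps the quickstart's request and scrape methods in a module, assuming the same endpoints; adjust naming to your app's conventions:

```ruby
# app/services/anakin_scraper.rb
require "net/http"
require "json"
require "uri"

module AnakinScraper
  BASE_URL = "https://api.anakin.io/v1"

  module_function

  # Fetch the key lazily so the app can boot without it in environments
  # (e.g. asset precompilation) that never scrape.
  def api_key
    ENV.fetch("ANAKIN_API_KEY") { raise "ANAKIN_API_KEY is not set" }
  end

  def request(method, path, body = nil)
    uri  = URI(BASE_URL + path)
    http = Net::HTTP.new(uri.host, uri.port)
    http.use_ssl = true
    http.read_timeout = 30

    klass = method == "POST" ? Net::HTTP::Post : Net::HTTP::Get
    req = klass.new(uri.request_uri)
    req["X-API-Key"]    = api_key
    req["Content-Type"] = "application/json"
    req.body = body.to_json if body

    res = http.request(req)
    raise "HTTP #{res.code}: #{res.body}" unless res.is_a?(Net::HTTPSuccess)
    JSON.parse(res.body)
  end

  def scrape(url)
    submitted = request("POST", "/url-scraper", { url: url })
    job_id    = submitted["jobId"]

    60.times do
      begin
        job = request("GET", "/url-scraper/#{job_id}")
      rescue StandardError
        sleep 3 # retry transient errors
        next
      end

      case job["status"]
      when "completed" then return job
      when "failed"    then raise "scrape failed: #{job['error']}"
      end
      sleep 3
    end
    raise "timed out after 3 minutes"
  end
end
```

With this in place, ScrapeUrlJob can call AnakinScraper.scrape(url) as shown above.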


Next steps