Java
Scrape your first page from Java using the built-in HttpClient — works with any Java 11+ project.
Submit a scrape, poll for the result, and handle transient errors — using java.net.http.HttpClient from the Java 11+ standard library plus Jackson for JSON parsing.
Authentication
Set your API key as an environment variable. Get a key from the Dashboard.
export ANAKIN_API_KEY=ak-your-key-here

The base URL is https://api.anakin.io/v1. Every request authenticates via the X-API-Key header.
Install
HttpClient is in the Java 11+ standard library. Jackson is the de facto JSON library and is already on the classpath in nearly every Spring Boot, Quarkus, or Micronaut project.
Maven:
<dependency>
  <groupId>com.fasterxml.jackson.core</groupId>
  <artifactId>jackson-databind</artifactId>
  <version>2.17.0</version>
</dependency>

Gradle:

implementation 'com.fasterxml.jackson.core:jackson-databind:2.17.0'

Scrape a page
Save as Quickstart.java:
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.time.Duration;
import java.util.Map;
import com.fasterxml.jackson.databind.JsonNode;
import com.fasterxml.jackson.databind.ObjectMapper;
public class Quickstart {
    static final String BASE = "https://api.anakin.io/v1";
    static final String API_KEY = System.getenv("ANAKIN_API_KEY");
    static final HttpClient HTTP = HttpClient.newBuilder()
            .connectTimeout(Duration.ofSeconds(30)).build();
    static final ObjectMapper JSON = new ObjectMapper();

    // Sends one authenticated request and parses the JSON response body.
    static JsonNode request(String method, String path, Object body) throws Exception {
        var publisher = body == null
                ? HttpRequest.BodyPublishers.noBody()
                : HttpRequest.BodyPublishers.ofString(JSON.writeValueAsString(body));
        var req = HttpRequest.newBuilder(URI.create(BASE + path))
                .header("X-API-Key", API_KEY)
                .header("Content-Type", "application/json")
                .method(method, publisher).build();
        var resp = HTTP.send(req, HttpResponse.BodyHandlers.ofString());
        return JSON.readTree(resp.body());
    }

    // Submits the URL, then polls every 3 seconds until the job settles.
    static JsonNode scrape(String url) throws Exception {
        var submitted = request("POST", "/url-scraper", Map.of("url", url));
        var jobId = submitted.get("jobId").asText();
        for (int i = 0; i < 60; i++) {
            JsonNode job;
            try { job = request("GET", "/url-scraper/" + jobId, null); }
            catch (Exception e) { Thread.sleep(3000); continue; } // retry transient errors
            switch (job.get("status").asText()) {
                case "completed": return job;
                case "failed":
                    throw new RuntimeException("scrape failed: " + job.path("error").asText(""));
            }
            Thread.sleep(3000);
        }
        throw new RuntimeException("timed out after 3 minutes");
    }

    public static void main(String[] args) throws Exception {
        if (API_KEY == null) throw new RuntimeException("ANAKIN_API_KEY is not set");
        var job = scrape("https://example.com");
        System.out.println(job.get("markdown").asText());
    }
}

Run it (with Maven/Gradle handling Jackson on the classpath):
mvn compile exec:java -Dexec.mainClass=Quickstart
# or with Gradle: ./gradlew run

What this does
- Submits https://example.com to /url-scraper and gets back a jobId.
- Polls /url-scraper/{jobId} every 3 seconds (up to 60 attempts = 3 minutes).
- Retries transient I/O errors silently; only surfaces real failures.
- Prints the final markdown when the job completes.
Most jobs finish in 3–15 seconds.
Go further
Extract structured JSON with AI
Add generateJson: true to the submit body to have AI return structured data:
var submitted = request("POST", "/url-scraper", Map.of(
"url", "https://news.ycombinator.com",
"generateJson", true
));

The completed response includes a generatedJson field with structured data inferred from the page.
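Once the job completes, the generatedJson field is just another Jackson subtree. A minimal sketch of pulling it out, using an illustrative payload (the title field and overall shape here are assumptions; the real structure depends on the page):

```java
import com.fasterxml.jackson.databind.JsonNode;
import com.fasterxml.jackson.databind.ObjectMapper;

public class GeneratedJsonDemo {
    static final ObjectMapper JSON = new ObjectMapper();

    // Pulls the generatedJson subtree out of a completed-job body.
    // Returns a MissingNode (never null) if the field is absent.
    static JsonNode generatedJson(String completedJobBody) throws Exception {
        return JSON.readTree(completedJobBody).path("generatedJson");
    }

    public static void main(String[] args) throws Exception {
        // Illustrative body only; the real one comes from GET /url-scraper/{jobId}.
        String payload = """
                {"status":"completed","generatedJson":{"title":"Hacker News"}}""";
        JsonNode data = generatedJson(payload);
        System.out.println(data.path("title").asText()); // prints "Hacker News"
    }
}
```

Using path() rather than get() avoids NullPointerExceptions when a field you expected is missing.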
Scrape JavaScript-heavy sites
For SPAs and dynamically-loaded pages, add useBrowser: true:
var submitted = request("POST", "/url-scraper", Map.of(
"url", "https://example.com/spa",
"useBrowser", true
));

Only use browser mode when needed; standard scraping is faster and cheaper.
Use it from Spring Boot
Wrap the request and scrape methods in a @Service and call them from an @Async method or a Spring Batch job; the polling loop blocks for up to 3 minutes per URL, so background execution is the natural fit. For non-blocking use, switch to HTTP.sendAsync() and chain the polling with CompletableFuture.
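The non-blocking variant can be sketched as a generic scheduled-retry helper: no thread sleeps between attempts, each one is rescheduled on a ScheduledExecutorService and the result surfaces through a CompletableFuture. The AsyncPoll class and pollUntil helper below are illustrative, not part of the API; in real use the check supplier would issue the job-status GET via HTTP.sendAsync() and return null while the job is still running:

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.function.Supplier;

public class AsyncPoll {
    static final ScheduledExecutorService SCHEDULER =
            Executors.newSingleThreadScheduledExecutor();

    // Polls `check` until it returns non-null, rescheduling after delayMillis
    // between attempts and failing the future after maxAttempts. Never blocks.
    static <T> CompletableFuture<T> pollUntil(Supplier<T> check,
                                              long delayMillis, int maxAttempts) {
        CompletableFuture<T> result = new CompletableFuture<>();
        attempt(check, delayMillis, maxAttempts, result);
        return result;
    }

    private static <T> void attempt(Supplier<T> check, long delayMillis,
                                    int attemptsLeft, CompletableFuture<T> result) {
        if (attemptsLeft <= 0) {
            result.completeExceptionally(new RuntimeException("timed out"));
            return;
        }
        T value = check.get();
        if (value != null) {
            result.complete(value);
        } else {
            SCHEDULER.schedule(
                    () -> attempt(check, delayMillis, attemptsLeft - 1, result),
                    delayMillis, TimeUnit.MILLISECONDS);
        }
    }

    public static void main(String[] args) throws Exception {
        // Toy check that "completes" on the third attempt, standing in for
        // a job-status GET that returns null until status is "completed".
        int[] calls = {0};
        String done = pollUntil(() -> ++calls[0] >= 3 ? "completed" : null, 10, 60).get();
        System.out.println(done); // prints "completed"
        SCHEDULER.shutdown();
    }
}
```

The same 60-attempt / 3-second budget as the blocking loop maps to pollUntil(check, 3000, 60), but the polling thread is freed between attempts.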