Dify

AnakinScraper plugin for Dify

Web scraping and AI-powered search plugin for Dify. Extract data from any website, perform intelligent web searches, and conduct deep research — all inside your Dify workflows and agents.

Marketplace: Dify Plugin Store
Source: GitHub
Type: Tool Plugin
Version: 0.0.1
Tools: 5

Key features

  • Anti-detection — Proxy routing across 207 countries prevents blocking
  • Intelligent Caching — Up to 30x faster on repeated requests
  • AI Extraction — Convert any webpage into structured JSON
  • Browser Automation — Full headless Chrome support for SPAs and JS-heavy sites
  • Session Management — Authenticated scraping with encrypted session storage (AES-256-GCM)
  • Batch Processing — Submit multiple URLs in a single request

Setup

1. Get your API key

  1. Sign up at anakin.io/signup
  2. Go to your Dashboard
  3. Copy your API key (starts with ask_)

2. Install in Dify

  1. Install the Anakin plugin in your Dify workspace from the Plugin Store
  2. Go to Plugins > Anakin > Configure
  3. Enter a name for the authorization (e.g., "Production")
  4. Paste your API key
  5. Click Save
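
Before pasting the key into Dify, it can help to catch obvious copy-and-paste mistakes. The sketch below checks only the documented `ask_` prefix. The function name is hypothetical, and the key's length and character set are not documented here, so neither is validated.

```python
def looks_like_anakin_key(key: str) -> bool:
    """Return True if the key has the documented `ask_` prefix.

    Assumption: only the prefix is checked; this does not prove the key is valid.
    """
    key = key.strip()  # drop whitespace picked up while copying
    prefix = "ask_"
    return key.startswith(prefix) and len(key) > len(prefix)


if __name__ == "__main__":
    print(looks_like_anakin_key("ask_example123"))
    print(looks_like_anakin_key("sk_example123"))
```

A key that passes this check can still be rejected by the service; a 401 response remains the authoritative signal.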

Tools

1. URL Scraper

Scrapes a single URL, returning HTML, markdown, and optionally structured JSON.

| Parameter | Type | Required | Default | Description |
| --- | --- | --- | --- | --- |
| url | string | Yes | | Target URL to scrape (HTTP/HTTPS) |
| country | string | No | us | Proxy location from 207 countries |
| use_browser | boolean | No | false | Enable headless Chrome for JavaScript-heavy sites |
| generate_json | boolean | No | false | Use AI to extract structured data |
| session_id | string | No | | Browser session ID for authenticated pages |

Response includes: Raw HTML, cleaned HTML, markdown conversion, structured JSON (if generate_json enabled), cache status, timing metrics.
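
The parameter table above can be mirrored in a small helper that applies the documented defaults and rejects non-HTTP(S) URLs before a workflow runs. This is a minimal sketch: the function name is hypothetical, and how Dify passes these values to the tool node is not shown here.

```python
from urllib.parse import urlparse


def build_url_scraper_params(url, country="us", use_browser=False,
                             generate_json=False, session_id=None):
    """Assemble URL Scraper inputs using the documented names and defaults."""
    parsed = urlparse(url)
    if parsed.scheme not in ("http", "https") or not parsed.netloc:
        raise ValueError(f"url must be an absolute HTTP/HTTPS URL, got {url!r}")

    params = {
        "url": url,
        "country": country,
        "use_browser": use_browser,
        "generate_json": generate_json,
    }
    # session_id is optional; include it only for authenticated pages.
    if session_id:
        params["session_id"] = session_id
    return params


if __name__ == "__main__":
    print(build_url_scraper_params("https://example.com/products", generate_json=True))
```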


2. Batch URL Scraper

Scrapes up to 10 URLs in parallel.

| Parameter | Type | Required | Default | Description |
| --- | --- | --- | --- | --- |
| urls | string | Yes | | Comma-separated list of URLs (1–10) |
| country | string | No | us | Proxy location from 207 countries |
| use_browser | boolean | No | false | Enable headless Chrome for JavaScript-heavy sites |
| generate_json | boolean | No | false | Use AI to extract structured data from each page |
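
Because `urls` is a single comma-separated string rather than a list, an upstream step usually has to join the URLs first. The sketch below enforces the documented 1–10 limit. The helper name is hypothetical, and rejecting URLs that contain a literal comma is an assumption, since the page does not say how such URLs are handled.

```python
def join_batch_urls(urls):
    """Join a list of URLs into the comma-separated string the `urls` field expects."""
    cleaned = [u.strip() for u in urls if u.strip()]
    if not 1 <= len(cleaned) <= 10:
        raise ValueError(f"expected 1-10 URLs, got {len(cleaned)}")
    for u in cleaned:
        # Assumption: a literal comma would be read as a separator, so reject it.
        if "," in u:
            raise ValueError(f"URL contains a comma: {u!r}")
    return ",".join(cleaned)


if __name__ == "__main__":
    print(join_batch_urls(["https://example.com/a", "https://example.com/b"]))
```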

3. AI Search

Synchronous AI-powered web search that returns results with citations and relevance scoring. Results are returned immediately, with no polling.

| Parameter | Type | Required | Default | Description |
| --- | --- | --- | --- | --- |
| prompt | string | Yes | | Search query or question |
| limit | number | No | 5 | Maximum results to return |

Response includes: Array of results with URLs, titles, snippets, publication dates, last updated timestamps.


4. Deep Research

Multi-stage automated research pipeline that combines search, scraping, and AI synthesis. A run takes 1–5 minutes.

| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| prompt | string | Yes | Research question or topic |

Response includes: AI-generated comprehensive answers, summaries, structured findings, citations with source URLs, scraped source data, processing metrics.


5. Custom Web Scraper

Execute pre-configured scraper templates for domain-specific structured data extraction.

| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| url | string | Yes | Target URL to scrape |
| scraper_code | string | Yes | Configuration identifier |
| scraper_params | string | No | JSON string of scraper-specific parameters |

Response: Structured JSON matching the scraper's defined schema.
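
Note that `scraper_params` is a JSON string, not a nested object, so a dictionary has to be serialized before it is passed in. The sketch below shows one way to do that. The function name, the `example_scraper` code, and the `max_items` key are placeholders for illustration, not real scraper configurations.

```python
import json


def build_custom_scraper_params(url, scraper_code, scraper_params=None):
    """Assemble Custom Web Scraper inputs, serializing scraper_params to a JSON string."""
    params = {"url": url, "scraper_code": scraper_code}
    if scraper_params is not None:
        # The tool expects a string, so encode the dict rather than nesting it.
        params["scraper_params"] = json.dumps(scraper_params)
    return params


if __name__ == "__main__":
    # "example_scraper" and "max_items" are hypothetical values.
    print(build_custom_scraper_params(
        "https://example.com/item/1", "example_scraper", {"max_items": 5}
    ))
```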


Examples

In a Workflow

  1. Add a Tool node to your workflow
  2. Select Anakin and choose your tool
  3. Configure parameters (e.g., enter URL, enable generate_json)
  4. Connect to the next node for processing

In an Agent

  1. Create an Agent app
  2. Add Anakin tools to the agent's toolset
  3. The agent will automatically use scraping/search based on user queries

Scraping with AI extraction

Tool: URL Scraper
URL: https://example.com/products
Generate JSON: true

Returns structured product data automatically extracted by AI.

Authenticated scraping

Tool: URL Scraper
URL: https://example.com/dashboard
Session ID: your-session-id-from-dashboard
Use Browser: true

Scrapes pages that require login using your saved browser session. Learn more about Browser Sessions.


Processing times

| Tool | Type | Typical Duration |
| --- | --- | --- |
| URL Scraper | Async | 3–15 seconds |
| Batch Scraper | Async | 5–30 seconds |
| AI Search | Sync | Immediate |
| Deep Research | Async | 1–5 minutes |
| Custom Scraper | Async | 3–15 seconds |

Troubleshooting

| Code | Meaning | Action |
| --- | --- | --- |
| 400 | Invalid parameters | Check your input |
| 401 | Invalid API key | Verify your API key in plugin settings |
| 402 | Plan upgrade required | Upgrade at Pricing |
| 404 | Job not found | Job may have expired |
| 429 | Rate limit exceeded | Wait and retry |
| 5xx | Server error | Retry with backoff |
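
The "wait and retry" and "retry with backoff" actions above can be sketched as a small wrapper that retries only on 429 and 5xx and returns other codes immediately. This is an illustration under assumptions: `call` is a hypothetical callable returning a `(status, body)` tuple, and the delays (1 s doubling up to 30 s) are arbitrary choices, not documented limits.

```python
import time


def should_retry(status):
    """429 and 5xx are the codes the table marks as retryable."""
    return status == 429 or 500 <= status <= 599


def backoff_delay(attempt, base=1.0, cap=30.0):
    """Exponential backoff: base, 2*base, 4*base, ... capped at `cap` seconds."""
    return min(cap, base * 2 ** attempt)


def call_with_retry(call, max_attempts=4, sleep=time.sleep):
    """Invoke `call` until it returns a non-retryable status or attempts run out."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(max_attempts):
        status, body = call()
        # Return on success, on a non-retryable error, or on the final attempt.
        if not should_retry(status) or attempt == max_attempts - 1:
            return status, body
        sleep(backoff_delay(attempt))
```

Codes such as 400, 401, and 402 are returned on the first attempt, since retrying does not fix a bad parameter, an invalid key, or a plan limit.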

Country codes

Proxy routing supports 207 countries. Common codes:

| Code | Country |
| --- | --- |
| us | United States (default) |
| gb | United Kingdom |
| de | Germany |
| fr | France |
| jp | Japan |
| au | Australia |