Your agent's plain fetches get blocked or summarized away; this returns the real page as structured data.
Turn any URL into clean, LLM-ready Markdown or extracted fields, map and crawl whole sites, unlock pages that block normal fetches through an anti-bot proxy, and get parsed search-engine results. For ready-made scrapers targeting specific platforms, see Site Scrapers.
Fetch a single URL and return its content as clean Markdown (or HTML, raw text, screenshot, or links). Handles JavaScript rendering, anti-bot, and proxy rotation automatically; the agent just supplies a URL. Ideal for reading articles, docs pages, PDFs, or any page an agent needs to reason about. Supports per-call options like `formats`, `onlyMainContent`, `waitFor` (for dynamic pages), `includeTags` / `excludeTags`, and `actions` (click, scroll, type) before extraction.
curl -X POST "https://skill.askfaro.com/skills/generic-scrapers/run" \
  -H "Authorization: Bearer faro_<your_key>" \
  -H "Content-Type: application/json" \
  -d '{
    "intent": {
      "prompt": "Scrape https://example.com/article and return the markdown"
    }
  }'

askfaro describe generic-scrapers/scrapeAndExtractFromUrl
Install the CLI with `pip install askfaro-cli`, then run `askfaro auth login`.
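The same call can be made from Python. This is a minimal sketch using only the standard library: it builds the request without sending it, so the payload can be inspected first. The endpoint, headers, and `intent.prompt` body come from the curl example; the helper name `build_run_request` is illustrative, and the key is a placeholder.

```python
import json
import urllib.request

RUN_URL = "https://skill.askfaro.com/skills/generic-scrapers/run"


def build_run_request(prompt: str, api_key: str) -> urllib.request.Request:
    """Build (but do not send) the POST request shown in the curl example."""
    body = json.dumps({"intent": {"prompt": prompt}}).encode("utf-8")
    return urllib.request.Request(
        RUN_URL,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )


req = build_run_request(
    "Scrape https://example.com/article and return the markdown",
    "faro_<your_key>",  # placeholder, as in the curl example
)
print(req.get_method(), req.full_url)
# To send it: urllib.request.urlopen(req). The response shape is not
# documented in this section, so it is not parsed here.
```

Keeping request construction separate from sending makes it easy to log or unit-test the payload before spending balance on a call.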
Given a starting URL, return up to N URLs from the same site without scraping their content. This is a fast, cheap way to discover the shape of a site before deciding what to scrape. Use it as a precursor to `scrapeAndExtractFromUrl` for crawling-style workflows, or to find all docs, blog, or product URLs on a domain. Supports `search` (filter URLs by keyword), `limit`, and `includeSubdomains`.
The base URL to start crawling from
Maximum number of links to return
Search query to use for mapping. During the Alpha phase, the 'smart' part of the search functionality is limited to 1000 search results. However, if map finds more results, there is no limit applied.
Timeout in milliseconds. There is no timeout by default.
Only return links found in the website sitemap
Ignore the website sitemap when crawling.
Include subdomains of the website
curl -X POST "https://skill.askfaro.com/skills/generic-scrapers/run" \
  -H "Authorization: Bearer faro_<your_key>" \
  -H "Content-Type: application/json" \
  -d '{
    "intent": {
      "prompt": "List up to 50 URLs under https://docs.example.com"
    }
  }'

askfaro describe generic-scrapers/mapUrls
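The map-then-scrape workflow described for `mapUrls` might look like the sketch below. It assumes the discovered URLs have already been pulled out of a `mapUrls` result into a plain list of strings (the response format is not specified in this section), and the helper `scrape_prompts` is a hypothetical name. It filters to one section of the site, drops repeats, and produces one scrape prompt per page.

```python
from urllib.parse import urlparse


def scrape_prompts(urls, path_prefix="/", max_pages=10):
    """Turn discovered URLs into one scrape prompt per page.

    `urls` is assumed to be a list of URL strings taken from a mapUrls
    result; how that result is structured is not covered here.
    """
    seen = set()
    prompts = []
    for url in urls:
        if len(prompts) >= max_pages:
            break
        # Keep only pages under the section of interest, skipping repeats.
        if url in seen or not urlparse(url).path.startswith(path_prefix):
            continue
        seen.add(url)
        prompts.append(f"Scrape {url} and return the markdown")
    return prompts


discovered = [
    "https://docs.example.com/guides/intro",
    "https://docs.example.com/blog/launch",
    "https://docs.example.com/guides/intro",  # duplicate
    "https://docs.example.com/guides/auth",
]
for prompt in scrape_prompts(discovered, path_prefix="/guides/"):
    print(prompt)
```

Each prompt can then be sent as the `intent.prompt` of a separate run request, so only the pages that matter are scraped.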
Run a search engine query through an anti-bot proxy and return parsed JSON results. Pass the full search URL including `brd_json=1` (e.g. https://www.google.com/search?q=...&brd_json=1). Supports Google, Bing, Yandex, DuckDuckGo, Baidu; set the host accordingly. Optional `country` ISO code controls proxy exit location.
Full search engine URL with query params. Append `brd_json=1` for parsed JSON results. Example: `https://www.google.com/search?q=site%3Areddit.com+best+espresso+machine&brd_json=1&hl=en&gl=us`. Supported engines include Google, Bing, Yandex, DuckDuckGo, and Baidu; set the host and any engine-specific params accordingly.
ISO 3166-1 alpha-2 country code for the proxy exit location (e.g. `us`, `gb`, `de`). Affects geo-targeted results.
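Hand-encoding the search URL is error-prone, so here is a small sketch that builds it with the standard library and always appends `brd_json=1`. The helper name `build_serp_url` is illustrative. Only the Google URL shape appears in this section; other engines use their own hosts and query parameter names, so the `q` parameter should be treated as Google-specific here.

```python
from urllib.parse import urlencode


def build_serp_url(query, base="https://www.google.com/search", **params):
    """Return a search URL with brd_json=1 so results come back parsed."""
    # `q` matches the Google example; other engines may name it differently.
    fields = {"q": query, "brd_json": "1", **params}
    return f"{base}?{urlencode(fields)}"


url = build_serp_url("site:reddit.com best espresso machine", hl="en", gl="us")
print(url)
# https://www.google.com/search?q=site%3Areddit.com+best+espresso+machine&brd_json=1&hl=en&gl=us
```

The printed URL matches the example given for the URL parameter, including the percent-encoded `site:` operator.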
curl -X POST "https://skill.askfaro.com/skills/generic-scrapers/run" \
  -H "Authorization: Bearer faro_<your_key>" \
  -H "Content-Type: application/json" \
  -d '{
    "intent": {
      "prompt": "Search Google and return the results as JSON"
    }
  }'

askfaro describe generic-scrapers/serp_search
Fetch a URL through Web Unlocker; bypasses anti-bot measures, solves CAPTCHAs, renders JavaScript. Use when a regular HTTP fetch returns 403 or blocked content. Returns the page body (HTML/JSON/etc.) as raw text. Optional `country` ISO code controls the proxy exit location.
Target URL to fetch through Bright Data's Web Unlocker proxy. Bypasses anti-bot protections, solves CAPTCHAs, and renders JavaScript automatically. Returns the page body as raw text (HTML/JSON/etc.) via HTTP GET.
ISO 3166-1 alpha-2 country code for the proxy exit location (e.g. `us`, `gb`, `de`). Affects which IPs the unlocker rotates through.
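Since the unlocker is meant for pages a regular fetch cannot reach, a common pattern is to try the plain fetch first and escalate only when it is blocked. The sketch below illustrates that pattern with injected fetch functions, which are hypothetical stand-ins so the example runs offline. The 403 status comes from the description above; treating 429 and 503 as blocked is an assumption.

```python
from typing import Callable, Tuple

Fetch = Callable[[str], Tuple[int, str]]  # returns (status_code, body)

# 403 is named in the text; 429 and 503 are an assumption about what
# bot walls commonly return.
BLOCKED_STATUSES = {403, 429, 503}


def fetch_with_fallback(url: str, plain: Fetch, unlock: Fetch) -> Tuple[str, str]:
    """Try the plain fetch first; escalate to the unlocker only if blocked."""
    status, body = plain(url)
    if status not in BLOCKED_STATUSES:
        return "plain", body
    _, body = unlock(url)
    return "unlocked", body


# Stand-in fetchers so the sketch runs without network access.
def fake_plain(url):
    return 403, "Access denied"


def fake_unlock(url):
    return 200, "<html>real page</html>"


print(fetch_with_fallback("https://example.com/pricing", fake_plain, fake_unlock))
```

In real use, `plain` would wrap an ordinary HTTP client and `unlock` would wrap a run request that asks for `unlock_url`.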
curl -X POST "https://skill.askfaro.com/skills/generic-scrapers/run" \
  -H "Authorization: Bearer faro_<your_key>" \
  -H "Content-Type: application/json" \
  -d '{
    "intent": {
      "prompt": "Fetch a page that keeps returning a 403 or blocked response"
    }
  }'

askfaro describe generic-scrapers/unlock_url
Three ways to get data off the web, all behind one token and one balance. Pick by what the page is:
scrapeAndExtractFromUrl turns any URL into clean Markdown, HTML, text, or structured fields (JS rendered, anti-bot handled). mapUrls discovers a site's URLs; crawlUrls, extractData, and searchAndScrape cover bulk and structured jobs.
unlock_url fetches pages that return 403 or bot challenges (CAPTCHA solving, JS rendering, residential IP rotation). serp_search returns parsed search-engine results as JSON.
50+ purpose-built scrapers for Instagram, TikTok, Facebook, LinkedIn, YouTube, Reddit, Google Maps/Search, Amazon, eBay, Booking, Airbnb, Zillow, Indeed, Glassdoor, Trustpilot, and more. Each returns clean structured items.
Pricing is per-tool and shown before you call. Failed requests are not charged. Each call is capped at $5 to prevent a surprise charge: a request predicted to cost more is declined with its estimate, so ask for fewer results per call and paginate with the continuation token to get the rest.
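To make the cap concrete, the sketch below works out how many results fit in one call and then pages through the rest with a continuation token. It assumes a flat per-result price, which is a simplification since pricing is per-tool. The price of $0.02 is made up, and `call(limit, token)` is a hypothetical stand-in for whatever tool call returns a page of items plus the next token.

```python
from decimal import Decimal

CAP_USD = Decimal("5")  # per-call cap stated above


def max_results_per_call(price_per_result: str) -> int:
    """Largest result count whose predicted cost stays within the cap."""
    return int(CAP_USD / Decimal(price_per_result))


def fetch_all(call, total_wanted: int, page_size: int) -> list:
    """Page through results with a continuation token.

    `call(limit, token)` is a hypothetical stand-in returning
    (items, next_token), with next_token None on the last page.
    """
    items, token = [], None
    while len(items) < total_wanted:
        limit = min(page_size, total_wanted - len(items))
        page, token = call(limit, token)
        items.extend(page)
        if token is None or not page:
            break
    return items


# Fake backend with 1000 items so the sketch runs offline.
def fake_call(limit, token):
    start = token or 0
    end = start + limit
    return list(range(start, min(end, 1000))), (end if end < 1000 else None)


page_size = max_results_per_call("0.02")  # hypothetical price per result
results = fetch_all(fake_call, total_wanted=600, page_size=page_size)
print(page_size, len(results))  # 250 600
```

Decimal arithmetic avoids floating-point rounding when dividing the cap by small per-result prices.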