One API for AI search responses.
Web-UI scraping. LLM API.
We drive ChatGPT, Perplexity, Gemini, Google AI Overview, Claude, Grok, and DeepSeek in real Chrome browsers. Captures match what users see, citations included. One Bearer token, one JSON shape.
{
  "success": true,
  "result": {
    "provider": "aioverview",
    "text": "Top EVs for 2026 include …",
    "sources": [11 items],
    "latencyMs": 6302
  }
}

- ChatGPT
- Perplexity
- Gemini
- AI Overview
- Claude
- Grok
- DeepSeek
What you get.
Uniform JSON shape
ChatGPT, Perplexity, Gemini, AI Overview, Claude, Grok, and DeepSeek all return the same response: text, sources[], html, latencyMs. Swap a provider, your parser stays the same.
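A minimal sketch of what a single shared parser might look like in Python, assuming the envelope shown in the sample responses on this page (`success`, then `result` with `provider`, `text`, `sources`, `latencyMs`). The `Capture` class and `parse_capture` function are hypothetical names, and `html` is treated as optional because the samples here omit it.

```python
import json
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Capture:
    """Hypothetical container for the shared response fields."""
    provider: str
    text: str
    latency_ms: int
    sources: list = field(default_factory=list)
    html: Optional[str] = None


def parse_capture(body: str) -> Capture:
    """Parse one capture response, whichever provider produced it."""
    payload = json.loads(body)
    # The failure envelope is not shown on this page, so only the flag is checked.
    if not payload.get("success"):
        raise ValueError("capture was not successful")
    result = payload["result"]
    return Capture(
        provider=result["provider"],
        text=result["text"],
        latency_ms=result["latencyMs"],
        sources=result.get("sources", []),
        html=result.get("html"),
    )


# Sample body modelled on the ChatGPT response shown on this page.
SAMPLE = json.dumps({
    "success": True,
    "result": {
        "provider": "chatgpt",
        "text": "Three podcasts I'd recommend ...",
        "sources": [{"position": 1, "title": "The Changelog", "url": "https://changelog.com/"}],
        "latencyMs": 47213,
    },
})

capture = parse_capture(SAMPLE)
```

Because the parser only reads the shared fields, the same function should work unchanged if `provider` is `aioverview` or any other supported value.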
Real responses, not LLM API output
We scrape the actual chat UIs — your captures include the same web-search citations end users see, not bare model completions.
Bearer auth
One header. Same key works across every endpoint. No OAuth dance, no per-provider auth schemes to wire up.
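As a rough illustration, the single header can be attached with Python's standard library alone. The base URL and the `prompt` and `country` fields come from the curl example on this page. The `build_capture_request` helper is a hypothetical name, and the assumption that every provider lives at `/v1/capture/<provider>` is extrapolated from the one ChatGPT endpoint shown.

```python
import json
import urllib.request

# Base path taken from the curl example on this page.
BASE_URL = "https://api.scrapinator.dev/v1/capture"


def build_capture_request(provider: str, api_key: str, prompt: str,
                          country: str = "US") -> urllib.request.Request:
    """Build (but do not send) a capture request for one provider."""
    body = json.dumps({"prompt": prompt, "country": country}).encode("utf-8")
    return urllib.request.Request(
        # Assumption: other providers follow the same path pattern as chatgpt.
        url=f"{BASE_URL}/{provider}",
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {api_key}",  # the one auth header
            "Content-Type": "application/json",
        },
    )


# "sk_example" is a placeholder, not a real key.
request = build_capture_request("chatgpt", "sk_example", "Best podcasts for software engineers in 2026?")
```

Passing the resulting object to `urllib.request.urlopen` would send it. Only the provider segment of the URL changes between calls, and the header stays the same.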
Built for batches
Send N prompts in one request, get N independent results back. Parallel fan-out — wall-clock ≈ slowest single capture, not the sum.
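To make the wall-clock claim concrete, here is a small back-of-the-envelope calculation using the two `latencyMs` values from the sample responses on this page. It is only an estimate: it ignores request overhead and any per-key concurrency limit, and `estimate_wall_clock_ms` is a hypothetical helper.

```python
def estimate_wall_clock_ms(latencies_ms):
    """Compare parallel fan-out (slowest capture) with sequential calls (sum)."""
    return {
        "parallel": max(latencies_ms),
        "sequential": sum(latencies_ms),
    }


# latencyMs values from the AI Overview and ChatGPT samples on this page.
estimate = estimate_wall_clock_ms([6302, 47213])
print(estimate)  # {'parallel': 47213, 'sequential': 53515}
```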
One request shape, every provider.
Pick a provider — the endpoint, request body, and response update.
$ curl -X POST https://api.scrapinator.dev/v1/capture/chatgpt \
    -H "Authorization: Bearer sk_…" \
    -H "content-type: application/json" \
    -d '{"prompt":"Best podcasts for software engineers in 2026?","country":"US"}'

{
  "success": true,
  "result": {
    "provider": "chatgpt",
    "text": "Three podcasts I'd recommend for software engineers in 2026 …",
    "sources": [
      { "position": 1, "title": "The Changelog", "url": "https://changelog.com/…" },
      { "position": 2, "title": "Software Engineering Daily", "url": "https://softwareengineeringdaily.com/…" }
    ],
    "latencyMs": 47213
  }
}

Comparison
Why not just build it yourself?
| Feature | Scrapinator | DIY scraping | Generic SERP APIs |
|---|---|---|---|
| Captures real ChatGPT / Perplexity / Gemini / Claude / Grok / DeepSeek answers | |||
| Source attributions included | partial | ||
| One API across all seven AI providers | partial | ||
| No CAPTCHA solving on caller side | |||
| Maintained against provider UI changes | partial | ||
| Parallel batch fan-out | partial | ||
| Predictable per-call pricing |
FAQ
Questions, answered.
We scrape the actual chat UIs — the same ones humans use — so you see what real users see: cited sources, web-search results, AI Overview blocks. The official APIs return base-model output without retrieval or citations.
No. Every call captures a fresh response in real time. AI-search outputs change daily; caching would defeat the point of monitoring them.
Yes — AI Overview goes through our parsed SERP path. p50 ≈ 6 s, with full text + sources + bullet structure. When Google chooses not to render AI Overview for a query, we return a clean "not present" error so you can fall back gracefully.
Every ISO-3166 alpha-2 code is accepted. Country routes the underlying proxy IP geographically and biases provider-side responses to that locale. Perplexity is currently pinned to US due to their anonymous-EU gating — documented in the API reference.
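A client-side pre-check along these lines could catch malformed country values before a request is sent. This is a sketch under assumptions: it tests only the two-letter shape, not whether the code is actually assigned in ISO 3166, and it resolves Perplexity to `US` locally because the page does not say whether the API rejects or overrides other values. `resolve_country` and `PINNED_COUNTRY` are hypothetical names.

```python
import re

# Per this page, Perplexity is currently pinned to US.
PINNED_COUNTRY = {"perplexity": "US"}


def resolve_country(provider: str, country: str) -> str:
    """Return the country code to send for a provider, upper-cased."""
    # Shape check only: exactly two ASCII letters.
    if not re.fullmatch(r"[A-Za-z]{2}", country):
        raise ValueError(f"not an alpha-2 style code: {country!r}")
    code = country.upper()
    # Assumption: pinned providers always use their pinned country.
    return PINNED_COUNTRY.get(provider.lower(), code)


chatgpt_country = resolve_country("chatgpt", "de")
perplexity_country = resolve_country("perplexity", "de")
```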
Per-key concurrency limits are enforced server-side. Batch endpoints run prompts in parallel and surface per-prompt failures inside results[] without failing the batch. Selectively retry failed prompts at your own cadence.
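Selective retry could be sketched roughly as follows. The batch response shape is not shown on this page, so this assumes each entry in `results[]` carries the same `success` flag as a single capture and that entries come back in prompt order. Both are assumptions to check against the API reference, and `split_batch_results` is a hypothetical helper.

```python
def split_batch_results(prompts, results):
    """Pair prompts with their results and collect the prompts that failed."""
    succeeded, to_retry = [], []
    # Assumption: results[] is in the same order as the submitted prompts.
    for prompt, item in zip(prompts, results):
        if item.get("success"):
            succeeded.append((prompt, item))
        else:
            to_retry.append(prompt)
    return succeeded, to_retry


# Illustrative data only; not real API output.
prompts = ["best EVs 2026", "best podcasts 2026", "best laptops 2026"]
results = [
    {"success": True, "result": {"provider": "chatgpt", "text": "..."}},
    {"success": False},
    {"success": True, "result": {"provider": "chatgpt", "text": "..."}},
]

succeeded, to_retry = split_batch_results(prompts, results)
```

The `to_retry` list can then be resubmitted as a smaller batch whenever it suits you.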
CAPTCHA solving is wired into our pipeline. CapSolver handles Cloudflare Turnstile (ChatGPT, Perplexity) and Google reCAPTCHA v2 transparently. You will never see a CAPTCHA in your response.
Yes — the html field returns the raw rendered HTML for each capture, and capturedAt is an ISO-8601 server-side timestamp. Use both together as evidence-grade records.
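One way to turn the two fields into a tamper-evident record is to store a hash of the HTML next to the timestamp. This is a sketch, assuming `html` and `capturedAt` both sit inside `result`, which the samples on this page do not confirm. `evidence_record` and its output keys are hypothetical.

```python
import hashlib
from datetime import datetime


def evidence_record(result: dict) -> dict:
    """Summarise a capture as a timestamp plus a SHA-256 digest of its HTML."""
    html_bytes = result["html"].encode("utf-8")
    # Accept a trailing "Z", which fromisoformat rejects before Python 3.11.
    captured_at = datetime.fromisoformat(result["capturedAt"].replace("Z", "+00:00"))
    return {
        "provider": result.get("provider"),
        "capturedAt": captured_at.isoformat(),
        "htmlSha256": hashlib.sha256(html_bytes).hexdigest(),
        "htmlBytes": len(html_bytes),
    }


# Illustrative values only; not real API output.
record = evidence_record({
    "provider": "chatgpt",
    "html": "<html><body>example</body></html>",
    "capturedAt": "2026-01-15T12:00:00Z",
})
```

Keeping the raw HTML alongside the digest lets you later show that the stored copy has not changed since capture.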
Sign up, copy your API key, and curl one of the examples. Five minutes including reading the docs.
Capture your first AI response in 5 min.
One Bearer token. One JSON shape across every provider.