AI Scraper

Describe what you want. Get it back.

Plain-English prompts → structured data from any page. No schema required, no DOM walking, no fragile parsing. The slowest, most flexible tool in the kit, for the long tail of weird pages.

25 credits / call · 2.4s p50 latency
page shapes covered
POST /ai-scraper
{
  "url": "https://news.example.com/2026/may/01/agents-eat-saas",
  "prompt": "extract title, author, publish date, and a 2-sentence summary"
}

→ 200 OK · 25 credits · 2.4s
{
  "title": "Agents eat SaaS",
  "author": "L. Reyes",
  "publish_date": "2026-05-01",
  "summary": "A look at how autonomous agents are dissolving the line between buyer and user. Implications for product pricing."
}
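
If you would rather not use an SDK, the call above is a plain HTTP POST. Here is a minimal sketch using only the Python standard library. The endpoint and Bearer header are taken from the cURL sample on this page; the explicit Content-Type header and the build_request helper name are assumptions for illustration.

```python
import json
import urllib.request

# Endpoint as shown in the cURL sample on this page.
API_URL = "https://api.ujeebu.com/ai-scraper"


def build_request(api_key: str, url: str, prompt: str) -> urllib.request.Request:
    """Build (but do not send) the POST request for one extraction call."""
    body = json.dumps({"url": url, "prompt": prompt}).encode("utf-8")
    return urllib.request.Request(
        API_URL,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {api_key}",
            # Assumption: the JSON body is declared explicitly.
            "Content-Type": "application/json",
        },
    )


req = build_request(
    "YOUR_KEY",
    "https://news.example.com/2026/may/01/agents-eat-saas",
    "extract title, author, publish date, and a 2-sentence summary",
)
# To send it: urllib.request.urlopen(req), then json.load() the response.
```

Building the request separately from sending it keeps the payload easy to inspect and log before you spend credits.
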
Live playground

See it work, before you sign up.

Drop in a URL, run a real call against the live API, and watch the JSON come back in a couple of seconds. No API key required.

Prompt = schema

Skip the JSON schema dance. Describe the fields you want in English; we'll return them named and typed. Add examples for stricter shape.
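
One way to "add examples for stricter shape" is to embed a sample result directly in the prompt. The sketch below is a hypothetical helper (build_prompt is not part of any SDK) and the wording it generates is only one option; the prompt field itself is free-form English.

```python
import json
from typing import Optional


def build_prompt(fields: list, example: Optional[dict] = None) -> str:
    """Describe the wanted fields in English, optionally pinning shape with an example."""
    prompt = "extract " + ", ".join(fields)
    if example is not None:
        # An inline example shows the key names and value formats you expect back.
        prompt += ". Format the result like this example: " + json.dumps(example)
    return prompt


prompt = build_prompt(
    ["title", "author", "publish date"],
    example={"title": "Agents eat SaaS", "author": "L. Reyes", "publish_date": "2026-05-01"},
)
```
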

Best for the long tail

When our built-in schemas don't fit (odd directories, government portals, niche listings), AI Scraper handles the page without per-site engineering.

Citations on demand

Pass cite: true and every field comes with the source DOM path it was lifted from. Auditable. Debuggable. No "trust the model".
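
This page does not show what a cited response looks like, so the sketch below assumes a shape: each field arrives as an object with a "value" and a "source" DOM path. Both key names and the sample data are made up for illustration; check a real response before relying on them.

```python
def split_citations(data: dict) -> tuple:
    """Separate extracted values from their source paths.

    Assumes every field looks like {"value": ..., "source": "<DOM path>"}.
    """
    values, sources = {}, {}
    for name, field in data.items():
        values[name] = field["value"]
        sources[name] = field["source"]
    return values, sources


# Made-up sample in the assumed shape.
cited = {
    "case_number": {"value": "3801", "source": "main > h1"},
    "outcome": {"value": "Dismissed", "source": "main > table > tr:nth-child(4) > td"},
}
values, sources = split_citations(cited)
```

Keeping the sources in a parallel dict lets you store clean values in your database and still log where each one came from.
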

Streaming responses

For long pages, set stream: true and we emit fields as they're extracted. Show partial results in your UI before the full call returns.
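
If you consume the stream over raw HTTP instead of the SDK, you need to know the wire format, which this page does not specify. The sketch below assumes newline-delimited JSON (one extracted item per line); that is an assumption, and iter_fields is a hypothetical helper. The stream is simulated with a list of byte strings so the example runs offline.

```python
import json
from typing import Iterable, Iterator


def iter_fields(lines: Iterable[bytes]) -> Iterator[dict]:
    """Yield one parsed item per non-empty line of a newline-delimited JSON stream."""
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # skip blank keep-alive lines
        yield json.loads(line)


# Simulated stream; a real one would come from the HTTP response body.
fake_stream = [
    b'{"name": "Linen Arc Lamp", "qty": 2, "price": 189}\n',
    b"\n",
    b'{"name": "Oak Side Table", "qty": 1, "price": 240}\n',
]

items = []
for item in iter_fields(fake_stream):
    items.append(item)  # render each partial result as it arrives
```
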

Multi-step extraction

Tell it to follow links, paginate, or scroll-load. AI Scraper plans the trajectory, executes steps, and returns the merged result.

Your data, your model

Bring your own OpenAI/Anthropic/Mistral key with byo_llm: true and we just handle fetching + parsing. We never train on your data.

auto_proxy + anti-bot are implicit

You don't set a proxy_type. AI Scraper escalates through the same auto_proxy ladder as the Scrape API (datacenter → residential → premium → managed-Selenium), and we bill only the tier that returns 200. CAPTCHA solving is on by default too. The proxy_type flag exists if you want to pin a tier; you almost never need to.
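
To make the escalation behavior concrete, here is a small simulation of the ladder described above. It is a sketch of the idea, not the service's implementation: fetch_with_escalation and fake_fetch are hypothetical names, and the fake fetcher simply pretends the cheaper tiers get blocked.

```python
from typing import Callable, Optional

# Tier order as listed on this page, cheapest first.
LADDER = ["datacenter", "residential", "premium", "managed-selenium"]


def fetch_with_escalation(
    fetch: Callable[[str], int], pinned: Optional[str] = None
) -> Optional[str]:
    """Return the first tier whose fetch returns HTTP 200, or None if all fail."""
    tiers = [pinned] if pinned else LADDER
    for tier in tiers:
        if fetch(tier) == 200:
            return tier  # only this tier would be billed
    return None


def fake_fetch(tier: str) -> int:
    """Pretend the target blocks datacenter and residential IPs."""
    return 200 if tier in ("premium", "managed-selenium") else 403


billed = fetch_with_escalation(fake_fetch)
pinned_result = fetch_with_escalation(fake_fetch, pinned="datacenter")
```

In this simulation the unpinned call settles on the premium tier, while pinning to datacenter fails outright, which is why pinning is rarely what you want.
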

Drop-in code

Copy. Paste. Ship.

// Node.js
import { Ujeebu } from "ujeebu";
const uj = new Ujeebu(process.env.UJEEBU_KEY);

const { data } = await uj.ai({
  url: "https://docs.weird-niche-portal.gov/proc/3801",
  prompt: "extract case number, filing date, parties, and outcome",
  cite:   true
});

# Python
import os

from ujeebu import Ujeebu
uj = Ujeebu(api_key=os.environ["UJEEBU_KEY"])

result = uj.ai(
    url="https://docs.weird-niche-portal.gov/proc/3801",
    prompt="extract case number, filing date, parties, outcome",
    cite=True,
)

# cURL
curl -X POST https://api.ujeebu.com/ai-scraper \
  -H "Authorization: Bearer $UJEEBU_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "url":    "https://docs.weird-niche-portal.gov/proc/3801",
    "prompt": "extract case number, filing date, parties, outcome",
    "cite":   true
  }'

// Node.js, streaming
const stream = await uj.ai({
  url: "https://...",
  prompt: "extract all 50 line items as { name, qty, price }",
  stream: true,
});

for await (const field of stream) {
  console.log(field);  // { name: "Linen Arc Lamp", qty: 2, price: 189 }
}
What people build with it

Real things real teams shipped this quarter.

When schemas fail

Government procurement portals, university course catalogues, niche industry directories: sites no schema covers. AI Scraper gets you running in 5 minutes.

Research extraction

Pass a paper PDF or news article and a domain-specific prompt ("extract every named clinical trial and its NCT ID"). Returns structured rows.
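
Model output is worth a sanity check before it lands in your database. ClinicalTrials.gov identifiers are "NCT" followed by eight digits, so the trial example above can be validated with a regex. The row shape ("trial" and "nct_id" keys) and the sample rows are assumptions for illustration.

```python
import re

# ClinicalTrials.gov identifiers: "NCT" followed by exactly eight digits.
NCT_ID = re.compile(r"NCT\d{8}")


def valid_rows(rows: list) -> list:
    """Keep only rows whose nct_id is a well-formed NCT identifier."""
    return [r for r in rows if NCT_ID.fullmatch(str(r.get("nct_id") or ""))]


# Made-up rows in an assumed shape.
rows = [
    {"trial": "Example Trial A", "nct_id": "NCT01234567"},
    {"trial": "Example Trial B", "nct_id": "NCT1234"},  # too short
    {"trial": "Example Trial C", "nct_id": None},       # missing
]
clean = valid_rows(rows)
```
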

Form-state pages

Pages where you have to fill a search form to see anything useful. AI Scraper plans the interaction and extracts the post-submit data.

Mixed-content lists

Heterogeneous result pages (real estate + auctions + classifieds in one feed). The model sorts by item type and emits per-type structured rows.
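
Once the rows come back, splitting them by item type on your side is a few lines. This sketch assumes each row carries a "type" key; that key name and the sample rows are hypothetical, so adjust them to whatever your prompt asks for.

```python
from collections import defaultdict


def group_by_type(rows: list) -> dict:
    """Bucket rows by their "type" key; rows without one go under "unknown"."""
    groups = defaultdict(list)
    for row in rows:
        groups[row.get("type", "unknown")].append(row)
    return dict(groups)


# Made-up mixed feed in the assumed shape.
feed = [
    {"type": "real_estate", "title": "2-bed flat", "price": 250000},
    {"type": "auction", "title": "Vintage desk", "current_bid": 120},
    {"type": "real_estate", "title": "Studio", "price": 140000},
    {"title": "Untyped listing"},
]
grouped = group_by_type(feed)
```
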

Ship AI Scraper tonight.

5,000 credits free. No card. Real residential proxies on the free tier.
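
For a rough sense of how far the free tier goes, divide the free credits by the per-call price quoted at the top of this page. This assumes every call costs a flat 25 credits, which may not hold for every option.

```python
FREE_CREDITS = 5_000    # free-tier allowance quoted above
CREDITS_PER_CALL = 25   # per-call price quoted at the top of this page

# Assumption: every call is billed at the flat per-call price.
free_calls = FREE_CREDITS // CREDITS_PER_CALL
print(free_calls)  # 200
```
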