Use case · Lead extraction

Pull contact data from any site. Structured in one call.

Names, emails, phones, titles - extracted from directories, LinkedIn, conference sites, and company team pages with a single API call. Pre-built templates and CSS extract rules handle the layout differences; you handle the outreach.

Start free · Try in playground

5,000 free credits · no card · failed requests not billed

The challenge

The lead extraction challenge.

Contact pages vary widely from site to site, and anti-bot defenses are common. Managed proxies, pre-built templates, and reusable extract_rules address both.

Scattered contact data

Names, emails, and phones spread across page elements, nested in tables, hidden behind JS, embedded in PDFs.

Anti-scraping measures

Directories deploy CAPTCHAs, rate limits, bot detection, and IP blocks - naive scrapers stop working within hours.

Inconsistent layouts

Every contact page is different. Directories, team pages, and profiles all lay fields out differently, so you want reusable templates and rules, not throwaway scripts.

Scaling extraction

Thousands of prospect pages mean concurrency management, retry logic, and quality monitoring. The infra ends up bigger than the scraper.

Use cases

How teams extract leads at scale.

Directory contacts

Pull listings from every directory.

Business names, addresses, phones, emails, categories from Yellow Pages, Yelp, Google Maps, and niche industry directories. Pre-built templates and extract_rules handle varied formats across directories.

Business outcomes

extract_rules: business name, address, phone, email, category
Reusable rules enforce consistent output across directories
Stealth mode + proxy rotation bypass anti-bot
Pagination handling for full listing results

LinkedIn profiles

Structured profile data without parsers.

Names, titles, companies, locations, work history. Pre-built LinkedIn templates emit clean JSON - no custom code per page type.

Business outcomes

Pre-built templates for profile / company / search pages
Browser fingerprinting + stealth avoid detection
Structured JSON: name, title, company, location
Rate limiting + session management for compliant scale

Team pages

Extract company contacts from team pages.

Submit team, about, and people pages (from a sitemap or your own URL list) and the Scrape API extracts names, titles, emails, and photos from each.

Business outcomes

Feed /about, /team, /people URLs - one per Scrape API call
extract_rules: name, title, email, photo
Discover URLs via sitemap or SERP API queries
JS rendering covers SPAs and lazy-loaded team blocks
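
A minimal sketch of the URL-discovery step, assuming the target site publishes a standard sitemap.xml. The function name team_page_urls and the list of path segments are illustrative choices, not part of the Scrape API; the sketch only selects candidate URLs, which would then be sent one per call.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlparse

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}

# Path segments that usually mark team-style pages (an assumption; tune per site).
TEAM_SEGMENTS = {"about", "team", "people"}


def team_page_urls(sitemap_xml: str) -> list[str]:
    """Return sitemap URLs whose path has an about/team/people segment."""
    root = ET.fromstring(sitemap_xml)
    urls = []
    for loc in root.findall("sm:url/sm:loc", SITEMAP_NS):
        url = (loc.text or "").strip()
        # Compare whole path segments so /blog/people-ops is not matched.
        segments = {s.lower() for s in urlparse(url).path.split("/") if s}
        if segments & TEAM_SEGMENTS:
            urls.append(url)
    return urls


sample = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://example.com/</loc></url>
  <url><loc>https://example.com/about</loc></url>
  <url><loc>https://example.com/company/team/</loc></url>
  <url><loc>https://example.com/blog/people-ops</loc></url>
</urlset>"""

print(team_page_urls(sample))
```

Matching on whole path segments rather than substrings keeps blog posts and similar pages out of the list; sites that use other names (for example /leadership) would need extra entries.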

Event attendees

Speakers, sponsors, and attendees.

Pull speaker lists, sponsor contacts, attendee directories from conference sites and event platforms. Reusable extract_rules handle every event-site layout variation.

Business outcomes

Extract speaker names + titles + companies + bios
Extract from sponsor + exhibitor pages for company contacts
Schema validation for consistent output across events
CAPTCHA solving for gated attendee directories

Sources

Extract from any site.

Pre-built templates and reusable extract_rules cover any layout - minimal per-site config.

LinkedIn

Profile data - names, titles, company, work history. Pre-built templates with stealth browsing.

Google Maps

Business listings: names, addresses, phones, websites, ratings, categories.

Yelp

Business contact info, categories, ratings, location data from listings and search results.

Industry directories

Niche directories - Clutch, G2, Capterra, trade-association member lists.

Company websites

Team, about, and contact pages across any company site.

Conference sites

Speakers, sponsors, attendee directories from event platforms.

Professional associations

Member directories from bar associations, medical boards, trade groups.

Review platforms

Clutch, G2, Trustpilot company profiles + contact data.

Start extracting · No credit card required.

How it works

Three steps to structured lead data.

1. Configure extraction request

Send the target URL to the Scrape API with extract_rules naming the contact fields. Reuse the same rules for consistent output. For multi-page directories, submit each page URL - discover them from the sitemap or via the SERP API first.

2. Extract contacts

The page renders in a stealth browser, CAPTCHAs are solved automatically, and extract_rules pull the contact fields: names, emails, phones, titles, and companies, returned as structured JSON.

3. Receive structured data

Clean, schema-validated JSON. Extraction metadata. Validation warnings flag missing fields. Direct CRM import via API or CSV export.
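
The format of the API's own validation warnings is not shown on this page, so the following is only a client-side sketch of the same idea: checking each extracted record for absent or empty fields before CRM import. The field list and the function name missing_fields are assumptions for illustration.

```python
# Fields this pipeline expects per contact (an illustrative choice).
REQUIRED_FIELDS = ("name", "email", "phone", "title", "company")


def missing_fields(record: dict, required=REQUIRED_FIELDS) -> list[str]:
    """List required fields that are absent, None, or blank in one record."""
    return [f for f in required if not str(record.get(f) or "").strip()]


record = {"name": "Ada Lovelace", "title": "CTO", "email": ""}
print(missing_fields(record))
```

Records with a non-empty result can be routed to a review queue instead of being imported as-is.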

Try it

Try lead extraction in the playground.

Drop any directory or contact page URL and see structured output.

curl 'https://api.ujeebu.com/scrape' \
  -H 'ApiKey: YOUR_API_KEY' \
  -G \
  --data-urlencode 'url=https://www.yelp.com/biz/blue-bottle-coffee-san-francisco' \
  --data-urlencode 'extract_rules={"name":"h1","phone":"[href^=\"tel:\"]","email":"[href^=\"mailto:\"]","address":"address","category":".category"}'

No API key required for testing in the playground. Powered by /scrape
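
For teams calling the API from code rather than the shell, here is a sketch of how the same request could be assembled in Python. It only builds the GET URL with the two query parameters used in the curl call; the helper name build_scrape_url is hypothetical, and the attribute selectors are written with quoted values. Nothing is sent over the network here.

```python
import json
from urllib.parse import parse_qs, urlencode, urlsplit

API_ENDPOINT = "https://api.ujeebu.com/scrape"  # endpoint from the curl example

# Same fields as the curl example: output field name -> CSS selector.
CONTACT_RULES = {
    "name": "h1",
    "phone": '[href^="tel:"]',
    "email": '[href^="mailto:"]',
    "address": "address",
    "category": ".category",
}


def build_scrape_url(target_url: str, rules: dict) -> str:
    """Build a GET URL carrying the url and extract_rules query parameters."""
    query = urlencode({
        "url": target_url,
        # The rules travel as one JSON string, percent-encoded by urlencode.
        "extract_rules": json.dumps(rules, separators=(",", ":")),
    })
    return f"{API_ENDPOINT}?{query}"


target = "https://www.yelp.com/biz/blue-bottle-coffee-san-francisco"
request_url = build_scrape_url(target, CONTACT_RULES)

# Decode the query again to confirm the rules survive encoding intact.
decoded = parse_qs(urlsplit(request_url).query)
print(json.loads(decoded["extract_rules"][0])["phone"])
```

The resulting URL can be fetched with any HTTP client, passing the ApiKey header as in the curl call.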

Features

Built for production lead extraction.

Reusable extract rules

Map CSS selectors to contact fields once, then reuse them across thousands of similar pages. Pre-built templates cover the big directories out of the box.

Email pattern recognition

Mailto links, obfuscated text ("name [at] company [dot] com"), JS-rendered emails, contact forms. Visible + hidden email patterns covered.
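
How the service recognises obfuscated emails internally is not described here. As an illustration of the pattern itself, this sketch normalises "[at]" / "[dot]" style text and then matches ordinary addresses; the regular expressions and the function name deobfuscate_emails are assumptions, and the approach can misfire on prose that happens to contain "(at)".

```python
import re

# "[at]", "(at)", "{at}" and the "dot" equivalents, with optional spacing.
_AT = re.compile(r"\s*[\[\(\{]\s*at\s*[\]\)\}]\s*", re.IGNORECASE)
_DOT = re.compile(r"\s*[\[\(\{]\s*dot\s*[\]\)\}]\s*", re.IGNORECASE)

# Deliberately simple address pattern; not a full RFC 5322 matcher.
_EMAIL = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9-]+(?:\.[A-Za-z0-9-]+)+")


def deobfuscate_emails(text: str) -> list[str]:
    """Rewrite bracketed at/dot tokens, then return the addresses found."""
    cleaned = _DOT.sub(".", _AT.sub("@", text))
    return _EMAIL.findall(cleaned)


text = "Reach jane [at] example [dot] com or sales(at)example(dot)co(dot)uk."
print(deobfuscate_emails(text))
```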

Phone extraction

International formats: local notation, country codes, extensions, tel: links. Selectors and tel: links capture phones across headers, footers, contact sections.
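
Once a rule such as [href^="tel:"] has returned a link, the href still needs normalising before it goes into a CRM. A small sketch, assuming the links follow the usual tel: URI shape with an optional ;ext= parameter; parse_tel_href is a hypothetical helper, and it does not validate country codes or number lengths.

```python
import re
from typing import Optional
from urllib.parse import unquote


def parse_tel_href(href: str) -> Optional[dict]:
    """Split a tel: link into a digits-only number and an optional extension."""
    if not href.lower().startswith("tel:"):
        return None
    body = unquote(href[4:])                  # undo %20 and similar escapes
    number_part, *params = body.split(";")    # parameters follow the number
    digits = re.sub(r"\D", "", number_part)   # drop spaces, dashes, brackets
    if not digits:
        return None
    plus = "+" if number_part.strip().startswith("+") else ""
    extension = None
    for param in params:
        key, _, value = param.partition("=")
        if key.strip().lower() == "ext":
            extension = value.strip() or None
    return {"number": plus + digits, "extension": extension}


print(parse_tel_href("tel:+1-415-555-0132;ext=204"))
print(parse_tel_href("tel:(020)%207946%200958"))
```

Keeping the leading "+" distinguishes international numbers from local notation, which matters when deduplicating contacts later.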

Multi-page extraction

Submit each page URL - or pull URLs from a sitemap or SERP API discovery - then run the Scrape API across the list as an async batch job.

LinkedIn templates

Pre-built templates for profile, company, and search pages. Standardised fields. Stealth mode avoids detection.

Structured JSON

extract_rules give a consistent output structure across all sources. Missing fields are simply omitted. Direct CRM import via API or CSV.

Powered by

Scrape API · SERP API · Try in playground

FAQ

Frequently asked.

How do I avoid rewriting selectors for every site?

Define extract_rules once - CSS selectors mapped to fields like name, email, phone - and reuse them across every page with a similar layout. Pre-built templates cover the big directories out of the box, and managed JS rendering, proxies, and CAPTCHA solving keep pages reachable as sites change their defenses, so rules only need updating when a site changes its markup.

Can I extract from sites that block scraping?

Yes. The Scrape API includes stealth mode with browser fingerprinting, automatic CAPTCHA solving, and rotating proxies. Pages render in a real browser, so requests look like those of a regular user. For aggressive bot detection, enable premium residential proxies. CAPTCHA solving is on by default.

How do I extract a multi-page directory?

Build a list of listing URLs - from the site’s sitemap, by paginating known URL patterns, or by running SERP API queries to discover them - then submit the list as an async batch job to the Scrape API with your extract_rules. Monitor via status endpoint; download all results on completion as JSON/CSV/ZIP.
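
A sketch of the list-building step for the "paginating known URL patterns" route. The directory URL, the page range, and the batch size are made up for illustration, and the batch submission call itself is left out because its request format is not described on this page.

```python
from itertools import islice


def paginated_urls(pattern: str, first: int, last: int) -> list[str]:
    """Expand a URL pattern containing {page} into one URL per page number."""
    return [pattern.format(page=n) for n in range(first, last + 1)]


def batches(urls, size: int):
    """Yield consecutive chunks of at most `size` URLs."""
    it = iter(urls)
    while chunk := list(islice(it, size)):
        yield chunk


# Hypothetical directory with numbered result pages.
urls = paginated_urls("https://directory.example.com/plumbers?page={page}", 1, 5)

for chunk in batches(urls, 2):
    print(len(chunk), chunk[0])
```

Chunking keeps each submission to a predictable size, which makes it easier to retry a failed batch without resubmitting the whole directory.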

What is the output format?

Structured JSON. extract_rules define exactly which fields come back (name, email, phone, title, company). Nested objects and arrays supported. Missing fields are omitted. Direct CRM/database import or programmatic processing. Async batch results can also download as ZIP.
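
Because missing fields are omitted, records from different pages can have different keys. For a spreadsheet or CRM import, a fixed column set avoids ragged rows. A minimal sketch using Python's csv module; the column list and sample records are illustrative.

```python
import csv
import io

# Fixed column order for the export (an illustrative choice).
FIELDS = ["name", "email", "phone", "title", "company"]


def records_to_csv(records, fields=FIELDS) -> str:
    """Write records to CSV text, blank-filling any field a record omits."""
    buf = io.StringIO()
    writer = csv.DictWriter(
        buf,
        fieldnames=fields,
        restval="",             # value used for omitted fields
        extrasaction="ignore",  # skip keys outside the column set
        lineterminator="\n",
    )
    writer.writeheader()
    writer.writerows(records)
    return buf.getvalue()


records = [
    {"name": "Ada Lovelace", "email": "ada@example.com"},
    {"name": "Grace Hopper", "title": "Rear Admiral", "photo": "hopper.jpg"},
]
print(records_to_csv(records))
```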

How reliable is the extracted data?

Selector-based extraction is deterministic - the same rules return the same fields every time, with no model variability. Reliability comes down to how well your extract_rules match the page; for the big directories the pre-built templates are already tuned. Test rules in the playground, then reuse them at scale.

How many credits does it cost?

Scrape API: 1 credit base, rising with JS rendering and premium proxy tiers. +5 credits when a CAPTCHA is solved. Async batch jobs charge per URL extracted. auto_proxy starts cheap and escalates only when needed - optimises credit spend automatically.
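
A rough budgeting sketch built only from the two figures stated above: 1 base credit per request and +5 credits per solved CAPTCHA. The surcharges for JS rendering and premium proxies are not given on this page, so the per-request cost is a parameter to fill in from the pricing table, and the CAPTCHA rate is your own estimate. The function name estimate_credits is hypothetical.

```python
BASE_CREDITS = 1        # base cost per request, as stated on this page
CAPTCHA_SURCHARGE = 5   # extra credits per solved CAPTCHA, as stated on this page


def estimate_credits(pages: int,
                     per_request_credits: int = BASE_CREDITS,
                     captcha_rate: float = 0.0) -> int:
    """Estimate credits for a job of `pages` successful requests.

    per_request_credits: cost of one request at the chosen rendering and
        proxy settings (take the real value from the pricing table).
    captcha_rate: assumed fraction of requests that trigger a CAPTCHA.
    """
    if pages < 0 or not 0.0 <= captcha_rate <= 1.0:
        raise ValueError("pages must be >= 0 and captcha_rate within [0, 1]")
    solved = round(pages * captcha_rate)
    return pages * per_request_credits + solved * CAPTCHA_SURCHARGE


print(estimate_credits(1000))
print(estimate_credits(1000, captcha_rate=0.1))
```

Failed requests are left out of the estimate since they are not billed.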

5,000 free credits to start.

No credit card. Failed requests cost zero.

Start free

Start extracting leads today.

Use rules-based extraction to pull structured contact data from any site in minutes.

Start free trial · Talk to an extraction expert

No credit card required.