Why your scraper looks like a bot. And how to stop.

Modern bot detection doesn’t look for a single tell — it scores every request on dozens of signals. The goal isn’t to "bypass" anything; it’s to send the same signals a real browser sends, so the score stays low and the page just loads.

Last updated Mar 2026

1. The mental model: it’s a score, not a check

Most people picture bot detection as a single yes/no check — “is this a bot?”. That’s a 2018 mental model. The 2026 model is a continuous risk score, computed per-request, from many independent signals: the TLS handshake, the IP’s reputation, header order, the JS environment, mouse behaviour, account history, time of day. Each signal nudges the score up or down.

What you experience as “blocked” is what happens when the score crosses a threshold. The work isn’t to defeat individual checks — it’s to keep the score low enough that no threshold ever fires. A scraper that looks identical to a normal browser at every layer simply gets the page, the same way a normal user would.

The reframe
We don’t “bypass” fingerprinting. We match the fingerprint a real browser sends. Anti-bot systems don’t fire on real browsers — that’s the entire point of their design.

2. The signals that actually get sampled

Every layer of the request stack leaks information. The table below lists the layers we’ve seen sampled by Cloudflare, Akamai, DataDome, PerimeterX, and Imperva. For each layer it shows what a real Chrome on a real Mac actually sends, what a typical scraper sends instead, and the fix.

| Layer | Real Chrome sends | Naive scraper sends | Fix |
| --- | --- | --- | --- |
| TLS handshake | Chrome 122 sends 16 cipher-suite entries (15 real suites plus one GREASE value) in a fixed order, GREASE values in extensions, and the ALPS extension; from Chrome 124 a post-quantum key share is on by default | Python requests sends the OpenSSL default cipher list with no GREASE and no ALPS; its JA4 hash is well known and commonly blocklisted | Match a real browser at the TLS layer (we use a real Chromium build, not a TLS shim) |
| HTTP/2 frame order | Chrome sends SETTINGS → WINDOW_UPDATE → HEADERS with its own SETTINGS values, window increment, and stream priority | Most HTTP libraries send frames in a different, library-specific order, which is fingerprintable | Use a browser stack that issues frames in browser order |
| Header order & casing | `Host`, `Connection`, `User-Agent`, `Accept`, `Accept-Language` in fixed order with consistent casing (HTTP/1.1 names shown; over HTTP/2 every name is lowercase and only the order is a signal) | Python emits Title-Case names, Node emits lowercase, and headers go out in whatever order the code inserted them | Use a browser engine that emits headers in real Chrome order |
| navigator.webdriver | Returns `false` in a normal Chrome | Returns `true` in unpatched Puppeteer/Playwright | Patch the property before page scripts execute, so it reads `false` as it does in a normal Chrome |
| Canvas rendering | GPU-specific anti-aliasing produces pixel patterns that vary per device but stay consistent within a session | SwiftShader (headless software renderer) produces a known set of pixel patterns, which is easy to flag | Run on real GPU profiles, or add deterministic-per-session pixel noise |
| WebGL parameters | GPU vendor + renderer string match the OS (e.g. "ANGLE (Apple, Apple M3, OpenGL 4.1)") | Headless reports vendor "Google Inc." with a SwiftShader renderer, which is a giveaway | Spoof renderer to a plausible real-GPU string consistent with the rest of the profile |
| Behavioural signals | Mouse moves with jitter and velocity, scroll has acceleration, clicks have hover-then-click timing | Instant page → click. No mouse movement. Sub-100ms reactions. | For high-protection sites, simulate plausible interaction timing before extraction |

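
To make the TLS row concrete, it helps to look at what a non-browser client offers. The sketch below lists the cipher suites enabled in Python’s standard-library default TLS context. Treat it as an approximation of what requests/urllib3 puts on the wire: the exact list depends on the installed Python, urllib3, and OpenSSL versions, and the helper name `default_cipher_names` is ours, not part of any library.

```python
import ssl


def default_cipher_names() -> list[str]:
    """Return the cipher suite names enabled in Python's default TLS context.

    The list comes from the local OpenSSL build, so it differs between
    machines and never contains GREASE values the way Chrome's does.
    """
    context = ssl.create_default_context()
    return [cipher["name"] for cipher in context.get_ciphers()]


if __name__ == "__main__":
    names = default_cipher_names()
    print(f"{len(names)} cipher suites enabled in this Python/OpenSSL build")
    for name in names:
        print(" ", name)
```

Comparing this output with a browser’s ClientHello (for example in a packet capture) shows why the two hash to different fingerprints before a single header is sent.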

3. How risk scores get composed

The exact numbers vary per vendor, but the structure is universal. Every off-signal pushes the score up; every plausible signal keeps it flat. Once the score crosses a threshold (commonly ~50), you get a challenge. Past ~75, you get a hard block. The table below shows what each factor typically contributes; your own measurements will vary, but the shape is reliable.

| Risk factor | Score impact | Mitigation |
| --- | --- | --- |
| Datacenter IP | +35 | Switch to residential proxy |
| Unpatched headless Chrome | +30 | Use stealth-patched browser |
| Mismatched UA + TLS | +25 | Use a real browser, not a TLS shim |
| Empty mouse/scroll history | +15 | Add minimal interaction before extraction |
| No realistic Accept-Language | +10 | Match country to header locale |
| Missing chrome.runtime | +10 | Stealth mode patches it |
| GPU = SwiftShader | +10 | Run with real GPU or spoof |

The compounding effect
Two off-signals don’t just add; they compound. A datacenter IP alone might score 35 and an unpatched headless browser alone might score 30, which sums to 65. Together they often score 80+, because the vendor has a specific rule for “datacenter IP AND headless tells”. Fixing one signal therefore often does more than its individual contribution suggests.
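
The scoring structure described above can be sketched as a weighted sum plus combination rules. This is a toy model, not any vendor’s algorithm: the weights are the illustrative values from the table, the thresholds are the approximate 50 and 75 quoted earlier, and the `COMBO_BONUS` rule and its value of 15 are assumptions chosen to reproduce the “80+” example.

```python
# Illustrative weights copied from the risk-factor table; real vendors differ.
BASE_WEIGHTS = {
    "datacenter_ip": 35,
    "unpatched_headless": 30,
    "ua_tls_mismatch": 25,
    "no_interaction": 15,
    "bad_accept_language": 10,
    "missing_chrome_runtime": 10,
    "swiftshader_gpu": 10,
}

# Hypothetical interaction rule: extra points when two signals co-occur.
COMBO_BONUS = {
    frozenset({"datacenter_ip", "unpatched_headless"}): 15,
}

CHALLENGE_AT = 50  # approximate thresholds quoted in the text
BLOCK_AT = 75


def risk_score(signals: set[str]) -> int:
    """Sum per-signal weights, then add bonuses for flagged combinations."""
    score = sum(BASE_WEIGHTS[name] for name in signals)
    for combo, bonus in COMBO_BONUS.items():
        if combo <= signals:  # every signal in the combo is present
            score += bonus
    return min(score, 100)


def verdict(score: int) -> str:
    """Map a score to the outcome the visitor experiences."""
    if score >= BLOCK_AT:
        return "block"
    if score >= CHALLENGE_AT:
        return "challenge"
    return "allow"


if __name__ == "__main__":
    for signals in (
        {"datacenter_ip"},
        {"unpatched_headless"},
        {"datacenter_ip", "unpatched_headless"},
    ):
        score = risk_score(signals)
        print(sorted(signals), score, verdict(score))
```

With these assumed numbers, either signal alone stays under the challenge threshold, while the pair scores 80 and is blocked. Removing either one brings the score back below 50.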

4. How sites collect the signals

The collection happens in three layers, in this order — each one is its own bouncer at the door.

4.1 Network layer (no JS required)

TLS fingerprint, HTTP/2 frame order, header order. The site can score you before sending a single byte of HTML. This is why some scrapers get a 403 on the very first request — the bouncer never even let them inside.

JA4 hash → reputation lookup · Header-order hash · IP / ASN class

4.2 Browser layer (after JS runs)

A challenge script runs in the page: it probes navigator.webdriver, draws to canvas, queries WebGL, checks plugins, and measures clock skew, then sends a signed report back to the vendor.

navigator.webdriver · canvas → SHA1 · WebGL.RENDERER · performance.now() jitter

4.3 Behaviour layer (over the session)

For account-protected pages, the vendor watches mouse/scroll/click distributions across the visit. Bots tend to have unnaturally regular timing and zero entropy in pointer paths.

pointermove distribution · scroll velocity profile · click→submit latency
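
The header-order hash from the network layer is easy to illustrate. The sketch below is a simplification, not a vendor algorithm: the function name `header_order_fingerprint`, the use of SHA-256, and the truncation to 12 hex characters are all assumptions, and the example header lists are illustrative HTTP/1.1-style orders.

```python
import hashlib


def header_order_fingerprint(header_names: list[str]) -> str:
    """Hash the header names in the order and casing they arrived.

    Header values are ignored on purpose: order and casing alone are
    enough to tell client libraries apart.
    """
    joined = ",".join(header_names)
    return hashlib.sha256(joined.encode("utf-8")).hexdigest()[:12]


# Illustrative orders only; real clients send more headers than this.
browser_like = ["Host", "Connection", "User-Agent", "Accept", "Accept-Language"]
library_like = ["Host", "User-Agent", "Accept-Encoding", "Accept", "Connection"]
lowercased = [name.lower() for name in browser_like]

if __name__ == "__main__":
    for label, names in [
        ("browser-like", browser_like),
        ("library-like", library_like),
        ("lowercased", lowercased),
    ]:
        print(f"{label:13} {header_order_fingerprint(names)}")
```

All three lists produce different fingerprints, including the lowercased one that contains exactly the same headers in the same order. That is why setting the right header values from an HTTP library is not enough on its own.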

5. How to send signals that don’t flag

Three steps do most of the work; the remaining parameters are for edge cases. We rotate fingerprints per session and keep them internally consistent: Chrome on Win11/Intel stays Chrome on Win11/Intel across the entire JS environment, rather than pairing a Chrome-on-Windows UA with a Mac canvas hash.

5.1 Render in a real browser

Use a real Chromium build, not a TLS-spoofing HTTP client. js=true is the foundation; everything else is a patch on top of an actual browser process.

js=true

5.2 Apply stealth patches

Patches the half-dozen headless-Chrome leaks (navigator.webdriver, chrome.runtime, plugins, permissions, language consistency). Adds canvas/WebGL noise that’s deterministic per session.

stealth=true

5.3 Use a clean IP at the right geo

Residential proxies pull the IP-reputation factor down. Match proxy_country to the site’s primary audience — Accept-Language and timezone follow automatically.

premium_proxy=true · proxy_country=US
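
“Deterministic per session” noise means one session always perturbs pixels the same way, while different sessions diverge. A minimal sketch of that idea follows. It assumes a hypothetical `session_id` string and works on a plain list of 0–255 channel values rather than a real canvas; it is not the product’s implementation.

```python
import hashlib
import random


def noisy_pixels(pixels: list[int], session_id: str, amplitude: int = 1) -> list[int]:
    """Add small noise to channel values, seeded from the session ID.

    The same (pixels, session_id) pair always yields the same output, so a
    page that reads the canvas twice in one session sees a stable hash.
    """
    digest = hashlib.sha256(session_id.encode("utf-8")).digest()
    rng = random.Random(int.from_bytes(digest[:8], "big"))
    return [
        max(0, min(255, value + rng.randint(-amplitude, amplitude)))
        for value in pixels
    ]


def canvas_hash(pixels: list[int]) -> str:
    """Stand-in for the hash a challenge script computes over canvas bytes."""
    return hashlib.sha1(bytes(pixels)).hexdigest()


if __name__ == "__main__":
    base = [10, 200, 33, 255, 0, 128, 64, 90] * 16
    first = canvas_hash(noisy_pixels(base, "session-a"))
    again = canvas_hash(noisy_pixels(base, "session-a"))
    other = canvas_hash(noisy_pixels(base, "session-b"))
    print("stable within a session:", first == again)
    print("same across sessions:", first == other)
```

Seeding the generator from the session ID, rather than calling an unseeded random source, is what keeps the hash stable within a session. Unseeded noise would change on every read, which is itself a tell.
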
The consistency rule
The single biggest mistake we see: mixing signals that don’t belong together. A Chrome UA on Windows with a macOS-shaped canvas hash. Or a US IP with Accept-Language: de-DE. Vendors hash the combination. Internal consistency matters more than any individual value.
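
The consistency rule can be expressed as a small pre-flight check. The sketch below is hypothetical throughout: the `Profile` class, the lookup tables, and the substring matching are illustrative stand-ins, and a real checker would need far broader coverage.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Profile:
    """A hypothetical bundle of values one session presents."""
    user_agent: str
    webgl_renderer: str
    accept_language: str
    ip_country: str  # two-letter country code of the exit IP


# Illustrative lookup tables only.
OS_HINTS = {"Windows NT": "windows", "Mac OS X": "macos"}
RENDERER_HINTS = {"Direct3D": "windows", "Apple": "macos"}
LANGUAGE_BY_COUNTRY = {"US": "en-US", "DE": "de-DE", "FR": "fr-FR"}


def _match(text: str, hints: dict[str, str]) -> Optional[str]:
    """Return the label of the first hint found in text, if any."""
    for needle, label in hints.items():
        if needle in text:
            return label
    return None


def inconsistencies(profile: Profile) -> list[str]:
    """Describe signals in the profile that contradict each other."""
    problems = []
    ua_os = _match(profile.user_agent, OS_HINTS)
    gpu_os = _match(profile.webgl_renderer, RENDERER_HINTS)
    if ua_os and gpu_os and ua_os != gpu_os:
        problems.append(f"UA says {ua_os} but WebGL renderer looks like {gpu_os}")
    expected = LANGUAGE_BY_COUNTRY.get(profile.ip_country)
    if expected and not profile.accept_language.startswith(expected):
        problems.append(
            f"IP is in {profile.ip_country} but Accept-Language is {profile.accept_language}"
        )
    return problems


if __name__ == "__main__":
    mixed = Profile(
        user_agent="Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36",
        webgl_renderer="ANGLE (Apple, Apple M3, OpenGL 4.1)",
        accept_language="de-DE,de;q=0.9",
        ip_country="US",
    )
    for problem in inconsistencies(mixed):
        print("-", problem)
```

The example profile trips both checks: a Windows UA paired with an Apple GPU, and a US exit IP paired with a German Accept-Language.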

6. Anti-detection API parameters

The flags below are the core of the surface area; proxy_country, which pairs with premium_proxy, is covered in the previous section. Combine them based on the target’s protection level.

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| js | boolean | false | Render with a real Chromium. Required for any of the patches below to take effect, because they work at the browser-API layer. |
| stealth | boolean | false | Apply the standard fingerprint patches: navigator.webdriver, plugins, chrome.runtime, language consistency, canvas/WebGL noise. |
| device | string | desktop | desktop or mobile. Emits the matching viewport, UA, touch events, and devicePixelRatio. Mismatching this is one of the loudest tells. |
| useragent | string | auto | Custom UA string. Leave unset; auto-rotation samples from a realistic Chrome distribution that stays internally consistent across the JS environment. |
| accept_language | string | auto | Sets Accept-Language and navigator.language together. When unset, derived from proxy_country. |
| premium_proxy | boolean | false | Residential IPs. Network-layer trust score is independent of fingerprint, but they multiply: a clean fingerprint from a residential IP is the strongest signal you can send. |
| wait_for | string | | CSS selector. Wait until the post-challenge page is hydrated before returning HTML. |

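
Because the patches only take effect when js=true, it can be convenient to validate parameter combinations on the client before spending a request. The helper below is a sketch under assumptions: `build_scrape_url` is a hypothetical name, the endpoint is the placeholder used in the examples on this page, and the API itself may or may not reject the same combinations.

```python
from typing import Optional
from urllib.parse import urlencode

API_ENDPOINT = "https://api.example.com/scrape"  # placeholder endpoint


def build_scrape_url(
    url: str,
    *,
    js: bool = False,
    stealth: bool = False,
    device: str = "desktop",
    premium_proxy: bool = False,
    proxy_country: Optional[str] = None,
    wait_for: Optional[str] = None,
) -> str:
    """Build a request URL, rejecting combinations the table rules out."""
    if stealth and not js:
        # Per the table, the patches only take effect when js=true.
        raise ValueError("stealth=true requires js=true")
    if device not in ("desktop", "mobile"):
        raise ValueError("device must be 'desktop' or 'mobile'")

    params = {
        "url": url,
        "js": str(js).lower(),
        "stealth": str(stealth).lower(),
        "device": device,
        "premium_proxy": str(premium_proxy).lower(),
    }
    # Optional parameters are only sent when set, so server defaults apply.
    if proxy_country:
        params["proxy_country"] = proxy_country
    if wait_for:
        params["wait_for"] = wait_for
    return f"{API_ENDPOINT}?{urlencode(params)}"


if __name__ == "__main__":
    print(build_scrape_url("https://example.com", js=True, stealth=True))
```

useragent and accept_language are deliberately left out of the helper, following the table’s advice to leave them on auto.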

7. Code examples

The default — works on most sites

For 80% of public-page targets, this is enough. No CAPTCHAs trigger because the score never gets close to threshold.

cURL · stealth + JS
# -G sends the --data-urlencode pairs as a query string on a GET request
curl -G 'https://api.example.com/scrape' \
  -H 'ApiKey: YOUR_API_KEY' \
  --data-urlencode 'url=https://example.com' \
  --data-urlencode 'js=true' \
  --data-urlencode 'stealth=true'

The protected-site default

Add residential IPs and geo-match for sites with stricter scoring (Amazon, Google, LinkedIn, classifieds, travel).

Python · full stack
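import requests

# Full stack: real browser, stealth patches, residential IP in the target's country.
response = requests.get(
    'https://api.example.com/scrape',
    params={
        'url': 'https://heavily-protected-site.com',
        'js': 'true',
        'stealth': 'true',
        'premium_proxy': 'true',
        'proxy_country': 'US',
    },
    headers={'ApiKey': 'YOUR_API_KEY'},
    timeout=60,  # rendered pages can take several seconds; never wait forever
)
response.raise_for_status()  # surface HTTP errors instead of printing an error body
print(response.text)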

Mobile profile

Some sites serve different HTML to mobile and only protect the desktop tier — flipping device can be enough to drop the risk score below threshold.

Node · device=mobile
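// Run as an ES module (for example a .mjs file) on Node 18+, which provides
// global fetch and top-level await.
const params = new URLSearchParams({
  url: 'https://target.com',
  js: 'true',
  stealth: 'true',
  device: 'mobile',
});

const res = await fetch('https://api.example.com/scrape?' + params, {
  headers: { ApiKey: 'YOUR_API_KEY' },
});
if (!res.ok) {
  // Fail loudly rather than treating an error body as page HTML.
  throw new Error(`Scrape request failed with HTTP ${res.status}`);
}
console.log(await res.text());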
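
One way to combine the examples above is to start with the default parameters and escalate only when the result looks blocked. The sketch below is an assumption about workflow, not documented API behaviour: the tier order, the `scrape_with_escalation` name, and the caller-supplied `looks_blocked` check are all hypothetical, and `fetch` is injected so the logic can run without a network call.

```python
from typing import Callable, Optional

# Tiers mirror the first two examples: default first, then the full stack.
TIERS = [
    {"js": "true", "stealth": "true"},
    {"js": "true", "stealth": "true", "premium_proxy": "true", "proxy_country": "US"},
]


def scrape_with_escalation(
    url: str,
    fetch: Callable[[dict], str],
    looks_blocked: Callable[[str], bool],
) -> Optional[str]:
    """Try each tier in order; return the first HTML that is not a block page."""
    for extra in TIERS:
        html = fetch({"url": url, **extra})
        if not looks_blocked(html):
            return html
    return None  # every tier was blocked


if __name__ == "__main__":
    def fake_fetch(params: dict) -> str:
        # Stand-in for requests.get(...).text: only premium-proxy calls succeed.
        if params.get("premium_proxy") == "true":
            return "<html>ok</html>"
        return "<html>challenge</html>"

    result = scrape_with_escalation(
        "https://example.com", fake_fetch, lambda html: "challenge" in html
    )
    print(result)
```

In real use, `fetch` would wrap the `requests.get` call from the Python example, and `looks_blocked` would encode whatever block or challenge markers you observe on the target.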

8. FAQ

Does this work without js=true?

Partially. TLS- and header-layer signals can be matched without a real browser, and we do that automatically. But any site running a JS challenge (Turnstile, Akamai BMP, DataDome) needs a browser to execute the challenge. js=true is the supported path.

How identical are fingerprints across requests?

They’re not identical — that would be its own tell, since real browsers don’t produce the same canvas hash byte-for-byte across machines. We sample from a real distribution (Chrome on Win11/Intel, Chrome on macOS/M3, etc.) and rotate per session, while keeping each session’s fingerprint internally consistent across every JS API.
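
Per-session rotation from a distribution can be pictured as a weighted draw seeded by the session. In the sketch below the two profile names come from the answer above, but the weights are made up for illustration, and `profile_for_session` and the `session_id` argument are hypothetical.

```python
import hashlib
import random

# Profile names from the text; the weights are invented for illustration.
PROFILES = [
    ("Chrome on Win11/Intel", 0.7),
    ("Chrome on macOS/M3", 0.3),
]


def profile_for_session(session_id: str) -> str:
    """Pick a profile by weight, deterministically for a given session ID."""
    digest = hashlib.sha256(session_id.encode("utf-8")).digest()
    rng = random.Random(int.from_bytes(digest[:8], "big"))
    names = [name for name, _ in PROFILES]
    weights = [weight for _, weight in PROFILES]
    return rng.choices(names, weights=weights, k=1)[0]


if __name__ == "__main__":
    for session in ("session-1", "session-2", "session-3"):
        print(session, "->", profile_for_session(session))
```

Every lookup for the same session returns the same profile, which is what keeps a session internally consistent, while the weights control how often each profile appears across sessions.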

Does stealth mode add latency?

Roughly 100–300 ms over a plain js=true request, mostly from running the patches before the page scripts. Most of that is amortised across the page load, so you won’t notice it on a real-page scrape that already takes 1–3 seconds.

What about Cloudflare’s bot management v2?

Cloudflare’s 2025 update added more JS-environment probes (wider canvas variance check, audio context fingerprint, deeper plugin enumeration). All of them are handled by stealth=true; we update the patches when Cloudflare ships changes.

Can I bring my own fingerprint?

Yes for individual fields — set useragent, accept_language, custom headers. We don’t expose canvas/WebGL injection because the consistency invariants are easy to break by hand. If you need that, contact us.

Is this legal?

Sending a normal-looking browser fingerprint isn’t itself unlawful; it’s how every Chrome user’s browser already works. The lawfulness depends on what you collect and what you do with the data: sticking to public pages, respecting robots.txt disallow rules, and not violating site-specific terms is the safe lane. See the acceptable-use policy.
