The mental model: it’s a score, not a check
Most people picture bot detection as a single yes/no check — “is this a bot?”. That’s a 2018 mental model. The 2026 model is a continuous risk score, computed per-request, from many independent signals: the TLS handshake, the IP’s reputation, header order, the JS environment, mouse behaviour, account history, time of day. Each signal nudges the score up or down.
What you experience as “blocked” is what happens when the score crosses a threshold. The work isn’t to defeat individual checks — it’s to keep the score low enough that no threshold ever fires. A scraper that looks identical to a normal browser at every layer simply gets the page, the same way a normal user would.
The signals that actually get sampled
Every layer of the request stack leaks information. The table below is the complete list of layers we’ve seen sampled by Cloudflare, Akamai, DataDome, PerimeterX, and Imperva. The middle column is what a real Chrome on a real Mac actually sends; the right column is what a typical scraper sends instead.
How risk scores get composed
The exact numbers vary per vendor, but the structure is the same everywhere. Every off-signal pushes the score up; every plausible signal keeps it flat. Once the score crosses a threshold (commonly ~50), you get a challenge; past ~75, you get a hard block. The table below shows what each factor typically contributes. Your own measurements will vary, but the shape is reliable.
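The composition described above can be sketched in a few lines of Python. This is only an illustration: the two thresholds come from the text, but the signal names and per-signal penalties are hypothetical placeholders, not measured vendor values.

```python
# Thresholds quoted in the text; real vendors tune these per site.
CHALLENGE_THRESHOLD = 50
BLOCK_THRESHOLD = 75

# Hypothetical per-signal penalties, for illustration only.
HYPOTHETICAL_PENALTIES = {
    "tls_fingerprint_mismatch": 30,
    "datacenter_ip": 25,
    "header_order_unusual": 15,
    "webdriver_flag_set": 40,
    "no_pointer_entropy": 20,
}


def risk_score(off_signals: set[str]) -> int:
    """Sum the penalty of each off-signal; plausible signals add nothing."""
    score = sum(HYPOTHETICAL_PENALTIES.get(name, 0) for name in off_signals)
    return min(score, 100)  # cap at 100 so the scale stays bounded


def verdict(score: int) -> str:
    """Map a score to what the client experiences."""
    if score >= BLOCK_THRESHOLD:
        return "block"
    if score >= CHALLENGE_THRESHOLD:
        return "challenge"
    return "allow"
```

With these placeholder weights, a datacenter IP alone stays under the challenge threshold, while a datacenter IP combined with a mismatched TLS fingerprint crosses it. No single signal has to be decisive for a block to fire.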
How sites collect the signals
The collection happens in three layers, in this order — each one is its own bouncer at the door.
TLS fingerprint, HTTP/2 frame order, header order. The site can score you before sending a single byte of HTML. This is why some scrapers get a 403 on the very first request — the bouncer never even let them inside.
JA4 hash → reputation lookup
Header-order hash
IP / ASN class
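To make the header-order signal concrete, here is a minimal sketch of how an ordering could be reduced to a short hash. Vendors do not publish their exact schemes, so the use of SHA-256, the truncation to 16 hex characters, and both example orderings are assumptions chosen for illustration.

```python
import hashlib


def header_order_hash(header_names: list[str]) -> str:
    """Hash the order of header names only; header values are ignored."""
    canonical = ",".join(name.lower() for name in header_names)
    return hashlib.sha256(canonical.encode("ascii")).hexdigest()[:16]


# Two hypothetical orderings of the same kind of request.
ordering_a = ["Host", "Connection", "User-Agent", "Accept", "Accept-Encoding"]
ordering_b = ["Host", "User-Agent", "Accept-Encoding", "Accept", "Connection"]

# Same headers, different order: the hashes differ.
print(header_order_hash(ordering_a) == header_order_hash(ordering_b))
```

The point of the sketch is that a client can send exactly the right headers and still stand out, because the order itself is part of the fingerprint.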
A challenge script runs in the page: it probes navigator.webdriver, draws to canvas, queries WebGL, checks plugins, and measures clock skew, then sends a signed report back to the vendor.
navigator.webdriver
canvas → SHA1
WebGL.RENDERER
performance.now() jitter
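The canvas probe listed above boils down to "render something, read the pixels back, hash them". The sketch below mimics only the hashing step in Python, using stand-in byte buffers instead of a real canvas; the buffer contents are made up for illustration.

```python
import hashlib


def canvas_hash(pixels: bytes) -> str:
    """SHA-1 over the raw pixel bytes read back from a rendered canvas."""
    return hashlib.sha1(pixels).hexdigest()


# Stand-in 4x4 RGBA buffers. They differ in one channel of one pixel,
# the kind of difference a different GPU or font rasteriser can produce.
machine_a = bytes([200, 200, 200, 255] * 16)
machine_b = bytes([201]) + machine_a[1:]

print(canvas_hash(machine_a))
print(canvas_hash(machine_b))  # a completely different digest
```

Because a one-bit rendering difference changes the whole digest, the hash works as a compact machine identifier. That is also why a hash that never varies across supposedly different machines is suspicious.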
For account-protected pages, the vendor watches mouse/scroll/click distributions across the visit. Bots tend to have unnaturally regular timing and zero entropy in pointer paths.
pointermove distribution
scroll velocity profile
click→submit latency
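One simple way to quantify "unnaturally regular timing" is the coefficient of variation of the gaps between events. The sketch below is an assumption about how such a check could look, not a description of any vendor's model; the min_cv cut-off in particular is a hypothetical value.

```python
import statistics


def interval_cv(timestamps_ms: list[float]) -> float:
    """Coefficient of variation of the gaps between consecutive events."""
    gaps = [b - a for a, b in zip(timestamps_ms, timestamps_ms[1:])]
    if len(gaps) < 2:
        raise ValueError("need at least three timestamps")
    mean_gap = statistics.mean(gaps)
    if mean_gap <= 0:
        raise ValueError("timestamps must be increasing")
    return statistics.pstdev(gaps) / mean_gap


def looks_scripted(timestamps_ms: list[float], min_cv: float = 0.1) -> bool:
    """Flag event streams whose timing is almost perfectly regular.

    min_cv is a hypothetical cut-off chosen for this example.
    """
    return interval_cv(timestamps_ms) < min_cv


print(looks_scripted([0, 100, 200, 300, 400]))  # fixed 100 ms gaps
print(looks_scripted([0, 80, 310, 350, 720]))   # irregular gaps
```

Production systems look at much richer features (pointer paths, scroll velocity), but the idea is the same: human input is noisy, and too little noise is itself a signal.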
How to send signals that don’t flag
Three flags do most of the work. The rest of the dial is for edge cases. We rotate fingerprints per session and keep them internally consistent — Chrome on Win11/Intel stays Chrome on Win11/Intel for the entire JS environment, not Chrome’s UA with a Mac canvas hash.
Use a real Chromium build, not a TLS-spoofing HTTP client. js=true is the foundation; everything else is a patch on top of an actual browser process.
js=true
stealth=true patches the half-dozen headless-Chrome leaks (navigator.webdriver, chrome.runtime, plugins, permissions, language consistency) and adds canvas/WebGL noise that’s deterministic per session.
stealth=true
Residential proxies (premium_proxy=true) pull the IP-reputation factor down. Match proxy_country to the site’s primary audience; Accept-Language and timezone follow automatically.
premium_proxy=true
proxy_country=US
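The "rotate per session, stay consistent within a session" idea described above can be sketched as follows. This is not the service's implementation: the profile pool, the field names, and the noise format are hypothetical, and the only point is that everything is derived from the session ID.

```python
import hashlib
import random

# Hypothetical profile pool; each entry is internally consistent on its own.
PROFILES = [
    {"os": "Windows 11", "gpu_vendor": "Intel", "platform": "Win32"},
    {"os": "macOS", "gpu_vendor": "Apple", "platform": "MacIntel"},
]


def session_fingerprint(session_id: str) -> dict:
    """Derive one whole profile plus fixed noise from the session ID."""
    digest = hashlib.sha256(session_id.encode("utf-8")).digest()
    # Seed a private RNG from the digest so results repeat across processes.
    rng = random.Random(int.from_bytes(digest[:8], "big"))
    profile = dict(rng.choice(PROFILES))  # pick a complete profile, never mix fields
    # Small per-channel offsets that stay fixed for the whole session.
    profile["canvas_noise"] = [rng.randint(-1, 1) for _ in range(8)]
    return profile
```

Calling session_fingerprint twice with the same ID returns the same profile and the same noise, so every probe within a session sees one coherent machine, while different session IDs can land on different profiles.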
Accept-Language: de-DE. Vendors hash the combination. Internal consistency matters more than any individual value.
Anti-detection API parameters
The flags below are the entire surface area. Combine based on the target’s protection level.
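As a rough illustration of combining flags by protection level, the helper below builds the query parameters for three cases. The parameter names (js, stealth, premium_proxy, proxy_country, device) are the ones used on this page; the function itself and its level names are hypothetical conveniences, not part of the API.

```python
def build_params(url: str, level: str = "default", country: str = "US") -> dict[str, str]:
    """Return query parameters for a given (hypothetical) protection level."""
    params = {"url": url, "js": "true", "stealth": "true"}
    if level == "protected":
        # Stricter targets: add a residential IP and geo-match it.
        params["premium_proxy"] = "true"
        params["proxy_country"] = country
    elif level == "mobile":
        # Targets that only protect the desktop tier.
        params["device"] = "mobile"
    elif level != "default":
        raise ValueError(f"unknown level: {level!r}")
    return params


print(build_params("https://example.com", level="protected"))
```

The returned dict can be passed directly as the params argument of an HTTP client call, so escalating from one level to the next is a one-word change.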
Code examples
The default — works on most sites
For 80% of public-page targets, this is enough. No CAPTCHAs trigger because the score never gets close to threshold.
curl -X GET 'https://api.example.com/scrape' \
  -H 'ApiKey: YOUR_API_KEY' \
  -G \
  --data-urlencode 'url=https://example.com' \
  --data-urlencode 'js=true' \
  --data-urlencode 'stealth=true'
The protected-site default
Add residential IPs and geo-match for sites with stricter scoring (Amazon, Google, LinkedIn, classifieds, travel).
import requests

response = requests.get(
    'https://api.example.com/scrape',
    params={
        'url': 'https://heavily-protected-site.com',
        'js': 'true',
        'stealth': 'true',
        'premium_proxy': 'true',
        'proxy_country': 'US',
    },
    headers={'ApiKey': 'YOUR_API_KEY'},
)
print(response.text)
Mobile profile
Some sites serve different HTML to mobile and only protect the desktop tier — flipping device can be enough to drop the risk score below threshold.
const params = new URLSearchParams({
  url: 'https://target.com',
  js: 'true',
  stealth: 'true',
  device: 'mobile',
});
const res = await fetch('https://api.example.com/scrape?' + params, {
  headers: { ApiKey: 'YOUR_API_KEY' },
});
console.log(await res.text());
FAQ
Partially. TLS- and header-layer signals can be matched without a real browser, and we do that automatically. But any site running a JS challenge (Turnstile, Akamai BMP, DataDome) needs a browser to execute the challenge. js=true is the supported path.
They’re not identical — that would be its own tell, since real browsers don’t produce the same canvas hash byte-for-byte across machines. We sample from a real distribution (Chrome on Win11/Intel, Chrome on macOS/M3, etc.) and rotate per session, while keeping each session’s fingerprint internally consistent across every JS API.
~100–300ms over a plain js=true request, mostly from running the patches before the page scripts. Most of that is amortised across the page load; you won’t notice it on a real-page scrape that already takes 1–3 seconds.
Cloudflare’s 2025 update added more JS-environment probes (wider canvas variance check, audio context fingerprint, deeper plugin enumeration). All of them are handled by stealth=true; we update the patches when Cloudflare ships changes.
Yes for individual fields — set useragent, accept_language, custom headers. We don’t expose canvas/WebGL injection because the consistency invariants are easy to break by hand. If you need that, contact us.
Sending a normal-looking browser fingerprint isn’t itself unlawful; it’s how every Chrome user’s browser already works. The lawfulness depends on what you collect and what you do with the data: sticking to public pages, respecting the paths a site disallows in robots.txt, and not violating site-specific terms is the safe lane. See the acceptable-use policy.