How to patch headless Chrome to avoid detection

Updated 2026-06-24 · 6 min read

If a site loads fine in your own browser but serves your headless scraper a blank page, a CAPTCHA, or an endless challenge screen, you have run into bot detection. Default headless Chrome gives itself away with a handful of tells that standard desktop Chrome does not send, and on plenty of sites that alone is enough to get you quietly turned away.

Closing those tells gets you past some browser-fingerprint checks; it does not grant authorization or override a site's policy. The fix is to launch Chrome through a patched Puppeteer and override the few fingerprints that still stand out: the Runtime.Enable CDP leak, the WebGL software-renderer string, and the default canvas signature. It comes to well under 100 lines of Node.js with rebrowser-puppeteer and a small init script, building on rebrowser-patches for the CDP fix and puppeteer-extra-plugin-stealth for the override patterns.

Key terms

  • Fingerprint. The set of values a site reads from the browser (User-Agent, WebGL strings, canvas hash) to decide whether a visitor looks like standard desktop Chrome or automation.
  • CDP. The Chrome DevTools Protocol Puppeteer uses to drive the browser; some of its calls are observable from inside the page and leak that automation is present.
  • Runtime.Enable leak. The CDP call stock Puppeteer makes to get an execution context per frame, which detectors like Cloudflare and DataDome read on the first navigation.
  • navigator.webdriver. A browser property that reports true under automation and false in normal interactive browsing, so it is overridden to false.
  • evaluateOnNewDocument. A Puppeteer method that registers code to run in every frame before the page's own scripts, so the overrides are in place before detection code reads them.
  • Canvas fingerprint. The hash a site derives from toDataURL output, byte-identical across default headless instances unless a small per-session offset is added.

Here is what the script does:

  • Launch Chrome through rebrowser-puppeteer, a drop-in replacement for Puppeteer at the import level. With REBROWSER_PATCHES_RUNTIME_FIX_MODE=addBinding set in the environment, it neutralizes the Runtime.Enable leak that Cloudflare and DataDome key on.
  • Inject an init script before any page JavaScript runs, so the overrides are in place by the time the site's detection code looks.
  • Override navigator.webdriver and the WebGL UNMASKED_VENDOR_WEBGL and UNMASKED_RENDERER_WEBGL strings, then add canvas noise so each session returns a stable but non-default fingerprint.
  • Verify the result against rebrowser-bot-detector, the open test page that flags the exact leaks above.

The complete script

```js
// patch-headless-chrome.mjs
import puppeteer from 'rebrowser-puppeteer'

// One small offset chosen per launch, so the canvas hash is stable within a session.
const canvasNoise = Math.floor(Math.random() * 3) + 1

/* The init script runs in every frame before the page's own scripts.
   Each override targets one signal a bot detector reads. */
const patchFingerprint = (noise) => {
  // navigator.webdriver is `true` under automation. Normal browsing reports `false`.
  Object.defineProperty(navigator, 'webdriver', { get: () => false })

  // Headless Chrome reports a software renderer ("SwiftShader" / "Google Inc.").
  // Spoof WebGL vendor/renderer strings for a desktop Windows GPU path,
  // on both WebGL1 and WebGL2 contexts so the two never disagree.
  const spoofWebGL = (proto) => {
    const getParameter = proto.getParameter
    proto.getParameter = function (parameter) {
      if (parameter === 37445) return 'Google Inc. (Intel)' // UNMASKED_VENDOR_WEBGL
      if (parameter === 37446) return 'ANGLE (Intel, Intel(R) UHD Graphics 630 Direct3D11 vs_5_0 ps_5_0, D3D11)' // UNMASKED_RENDERER_WEBGL
      return getParameter.call(this, parameter)
    }
  }
  spoofWebGL(WebGLRenderingContext.prototype)
  if (typeof WebGL2RenderingContext !== 'undefined') {
    spoofWebGL(WebGL2RenderingContext.prototype)
  }

  // A pristine canvas fingerprints identically across all headless instances.
  // Add a tiny per-session offset on a copy so repeated reads stay stable
  // and the visible canvas is never modified.
  const toDataURL = HTMLCanvasElement.prototype.toDataURL
  HTMLCanvasElement.prototype.toDataURL = function (...args) {
    const context = this.getContext('2d')
    const { width, height } = this
    // Non-2D (for example WebGL) or empty canvases fall through unchanged.
    if (!context || width === 0 || height === 0) {
      return toDataURL.apply(this, args)
    }
    const image = context.getImageData(0, 0, width, height)
    // Offset the red channel of each RGBA pixel.
    for (let i = 0; i < image.data.length; i += 4) {
      image.data[i] = Math.min(255, image.data[i] + noise)
    }
    const copy = document.createElement('canvas')
    copy.width = width
    copy.height = height
    copy.getContext('2d').putImageData(image, 0, 0)
    return toDataURL.apply(copy, args)
  }
}

const browser = await puppeteer.launch({
  headless: true,
  args: [
    '--disable-blink-features=AutomationControlled', // drops the automation flag
    // Disables Chrome's sandbox. Keep it only where the environment requires it
    // (for example, running as root in a container).
    '--no-sandbox'
  ]
})

const page = await browser.newPage()

// Desktop Chrome User-Agent; match the major version to the Chromium build you launch.
await page.setUserAgent(
  'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 ' +
  '(KHTML, like Gecko) Chrome/126.0.0.0 Safari/537.36'
)

// Register the patch before navigating, so it covers the first page load.
await page.evaluateOnNewDocument(patchFingerprint, canvasNoise)

await page.goto('https://bot-detector.rebrowser.net/', { waitUntil: 'networkidle2' })

// Print the detector's report so each check can be confirmed by eye.
const report = await page.evaluate(() => document.body.innerText)
console.log(report)

await browser.close()
```

```bash
npm install rebrowser-puppeteer
REBROWSER_PATCHES_RUNTIME_FIX_MODE=addBinding node patch-headless-chrome.mjs
```

What each step does

Launch through rebrowser-puppeteer. The import is the only code change from stock Puppeteer. The fork ships the same API and applies the Runtime.Enable patch when REBROWSER_PATCHES_RUNTIME_FIX_MODE=addBinding is set, so your existing page.goto and page.evaluate calls work unchanged while the CDP leak is closed.
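
Because the fix depends on an environment variable, a forgotten export fails silently. A minimal pre-launch guard might look like the sketch below. It assumes only the variable name and value used in this guide; the helper name `assertRuntimeFixMode` is hypothetical and is not part of rebrowser-puppeteer.

```javascript
// Hypothetical helper, not part of rebrowser-puppeteer.
// Throws if the Runtime.Enable fix mode is not the one this guide relies on.
function assertRuntimeFixMode(env) {
  const mode = env.REBROWSER_PATCHES_RUNTIME_FIX_MODE
  if (mode !== 'addBinding') {
    throw new Error(
      `REBROWSER_PATCHES_RUNTIME_FIX_MODE is ${JSON.stringify(mode)}, expected "addBinding"`
    )
  }
  return mode
}

// Usage: call once before puppeteer.launch(), for example
//   assertRuntimeFixMode(process.env)
```

Failing before launch is a design choice: it is cheaper to stop the script than to discover from the detector report that the leak was never closed.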

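Pass the automation-control flag. --disable-blink-features=AutomationControlled turns off the Blink feature that makes Chrome advertise itself as automated, which is what sets the native navigator.webdriver value to true. Because the flag takes effect at launch, the signal is gone before any page JavaScript runs. The navigator.webdriver override in the init script is a second layer in case the flag is dropped.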

Set a desktop Chrome User-Agent. Headless Chrome's default UA contains the literal token HeadlessChrome, which is the simplest possible signal for a site to block on. Replace it with a current desktop Chrome string and match the major version to the Chrome you are actually running.
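
Rather than hardcoding the major version, it can be derived from the launched browser. The sketch below assumes the string returned by Puppeteer's `browser.version()` has the form `HeadlessChrome/<version>` or `Chrome/<version>`; the helper name `desktopUserAgentFor` is hypothetical.

```javascript
// Hypothetical helper: build a Windows desktop UA whose major version
// matches the browser that was actually launched.
function desktopUserAgentFor(versionString) {
  // Assumed input shape: "HeadlessChrome/126.0.6478.126" or "Chrome/126.0.6478.126".
  const match = /Chrome\/(\d+)\./.exec(versionString)
  if (!match) {
    throw new Error(`Unrecognized browser version: ${versionString}`)
  }
  const major = match[1]
  return (
    'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 ' +
    `(KHTML, like Gecko) Chrome/${major}.0.0.0 Safari/537.36`
  )
}

// Usage inside the script, after launch:
//   await page.setUserAgent(desktopUserAgentFor(await browser.version()))
console.log(desktopUserAgentFor('HeadlessChrome/126.0.6478.126'))
```

This keeps the User-Agent in step with the bundled Chromium after an upgrade, so the version never drifts from what the browser's other features imply.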

Inject the patch with evaluateOnNewDocument. This registers the override to run before the page's own scripts on every navigation and every new frame. Running the same code with page.evaluate after goto is too late, because the detection script has already read the unpatched values.

Override WebGL and canvas. The WebGL parameter constants 37445 and 37446 are UNMASKED_VENDOR_WEBGL and UNMASKED_RENDERER_WEBGL. Returning Windows ANGLE GPU strings hides the software renderer. The canvas patch adds a small, consistent per-session offset to pixel values on a copy of the canvas, so the fingerprint differs from the default headless value but stays stable within the session.
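
The stability property is easy to check outside the browser. The sketch below mirrors the pixel loop from the init script on a plain RGBA byte array; the function name `withNoise` and the sample pixel values are illustrative, not part of the script.

```javascript
// Mirrors the loop in the init script: offset the red channel of each RGBA pixel.
// Returns a new array so the source pixels are never mutated.
function withNoise(pixels, noise) {
  const copy = new Uint8ClampedArray(pixels)
  for (let i = 0; i < copy.length; i += 4) {
    copy[i] = Math.min(255, copy[i] + noise)
  }
  return copy
}

// Two illustrative RGBA pixels; the second has a red value near the ceiling.
const samplePixels = new Uint8ClampedArray([10, 20, 30, 255, 254, 0, 0, 255])
const sessionNoise = 2 // chosen once per launch in the real script

const firstRead = withNoise(samplePixels, sessionNoise)
const secondRead = withNoise(samplePixels, sessionNoise)

console.log(Array.from(firstRead)) // [12, 20, 30, 255, 255, 0, 0, 255]
console.log(firstRead.join() === secondRead.join()) // true: stable within the session
console.log(samplePixels[0]) // 10: the source is unchanged
```

Because the offset is fixed for the session and applied to a copy, two reads of the same canvas produce the same bytes, which is the behavior a normal browser shows.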

Gotchas

  • The patch runs too late to cover the first page.

    • Issue: calling page.evaluate(patchFingerprint) after page.goto injects the overrides only after the detection script has already read navigator.webdriver and the raw canvas.
    • Fix: register the patch with page.evaluateOnNewDocument(patchFingerprint, canvasNoise) before goto, so it is in place for the first navigation and every subsequent frame.
  • rebrowser still needs its CDP fix turned on.

    • Issue: installing rebrowser-puppeteer is not enough on its own. The Runtime.Enable fix is controlled by a mode setting, and relying on the default can leave the leak partly open depending on version.
    • Fix: set REBROWSER_PATCHES_RUNTIME_FIX_MODE=addBinding in the environment before launch, then confirm with rebrowser-bot-detector that the Runtime.Enable check passes.
  • A randomized canvas fingerprint is its own signal.

    • Issue: adding fresh random noise on every toDataURL call makes the fingerprint change between two reads of the same canvas, which normal browsers do not do and which detectors test for.
    • Fix: seed one small offset per browser launch and apply it to a copy of the canvas, so the hash is consistent within a session and only differs from the default headless value.
  • WebGL vendor and User-Agent platform disagree.

    • Issue: spoofing an Apple GPU renderer while sending a Windows User-Agent, or a Direct3D ANGLE string with a macOS User-Agent, produces an internally inconsistent fingerprint that can score worse than the unpatched default.
    • Fix: pick one platform and keep the User-Agent, WebGL strings, navigator.platform, and timezone consistent with it.
  • The browser launches headful-looking but the IP gives it away.

    • Issue: a clean fingerprint from a datacenter IP range still trips Cloudflare and DataDome, because the network reputation is checked independently of the browser fingerprint.
    • Fix: route the browser through a residential proxy and authenticate with page.authenticate when the target allows that network path.
  • rebrowser tracks upstream Puppeteer but lags it.

    • Issue: pinning rebrowser-puppeteer against a Chrome version newer than the fork has caught up to causes a launch mismatch or a stale automation signal.
    • Fix: install the rebrowser version whose major tracks your target Puppeteer release, and re-run the detector after any upgrade rather than assuming the patch still holds.
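
The platform-mismatch gotcha above can be caught with a rough pre-flight check on the values you plan to spoof. The sketch below is a heuristic under stated assumptions: the helper names `platformOf` and `findPlatformMismatches`, the profile field names, and the keyword lists are all hypothetical, and timezone is not covered because it does not map to a platform.

```javascript
// Hypothetical heuristic: guess which platform a fingerprint string implies.
function platformOf(value) {
  if (/Windows|Win32|Direct3D|D3D11/.test(value)) return 'windows'
  if (/Linux|X11/.test(value)) return 'linux'
  // "AppleWebKit" appears in every Chrome UA, so it must not count as Apple.
  if (/Macintosh|MacIntel|Apple(?!WebKit)/.test(value)) return 'mac'
  return 'unknown'
}

// Returns the names of profile fields whose platform disagrees with the User-Agent.
function findPlatformMismatches(profile) {
  const expected = platformOf(profile.userAgent)
  return Object.entries(profile)
    .filter(([key]) => key !== 'userAgent')
    .filter(([, value]) => {
      const actual = platformOf(value)
      return actual !== 'unknown' && actual !== expected
    })
    .map(([key]) => key)
}

const windowsProfile = {
  userAgent:
    'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 ' +
    '(KHTML, like Gecko) Chrome/126.0.0.0 Safari/537.36',
  webglRenderer: 'ANGLE (Intel, Intel(R) UHD Graphics 630 Direct3D11 vs_5_0 ps_5_0, D3D11)',
  platform: 'Win32'
}

console.log(findPlatformMismatches(windowsProfile)) // []
console.log(findPlatformMismatches({ ...windowsProfile, platform: 'MacIntel' })) // ['platform']
```

A keyword check like this only catches obvious contradictions; the detector report remains the real test.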

Use this when

You run authorized, permitted data collection: your own sites, an API you are licensed to use, or a target whose terms and robots rules allow it, and a bot-detection layer is blocking a request you are entitled to make.

Respect the site's robots.txt, terms of service, and rate limits before reaching for any of this.

Skip this when

Skip it when a plain fetch already returns the page (you do not need a browser at all), when the data has an official API or export (use that), when only a single CAPTCHA stands in the way (solve that challenge directly rather than patching the whole browser), and when the block is a hard IP ban rather than a fingerprint check (rotate the network path instead).
