How to intercept and read network requests in Puppeteer

Updated 2026-06-25 · 6 min read

If you're scraping a page where the data shows up in the browser but never in the HTML you get back, you have probably already opened the Network tab and watched the real payload arrive as a separate XHR or fetch call returning JSON. The page renders from that response, not from the markup, so parsing the DOM gets you a loading spinner and selectors that resolve to nothing.

The fix is to listen to the browser's own traffic instead of the rendered output. We'll build a small script that launches headless Chrome, logs the method, URL, and resource type of every outgoing request, and reads the JSON body of the XHR and fetch responses that feed a single-page app as they arrive. For the response bodies the high-level events do not hand you directly, a Chrome DevTools Protocol session is the fallback, described under How it works. That gives you the same JSON the page consumes, in about 40 lines of Node.js with one library.

The complete script

// intercept-network.mjs
import puppeteer from 'puppeteer'

const targetUrl = 'https://httpbin.org/anything'

const browser = await puppeteer.launch({ headless: true })
const page = await browser.newPage()

// collected response payloads, one { url, status, body } entry per JSON response.
const captured = []

// fires for every outgoing request. read URL, method, type, and post body here.
page.on('request', request => {
  console.log(`[request] ${request.method()} ${request.resourceType()} ${request.url()}`)
})

// fires when a response arrives. read the body for the request types you care about.
page.on('response', async response => {
  const request = response.request()
  const type = request.resourceType()

  // the data feeding a single-page app almost always arrives as xhr or fetch.
  if (type === 'xhr' || type === 'fetch') {
    const contentType = response.headers()['content-type'] || ''

    if (contentType.includes('application/json')) {
      try {
        const body = await response.json()
        captured.push({ url: response.url(), status: response.status(), body })
        console.log(`[json] ${response.status()} ${response.url()}`)
      } catch {
        // a redirect or preflight has no readable body, and invalid JSON fails to parse. skip it.
      }
    }
  }
})

await page.goto(targetUrl, { waitUntil: 'networkidle0' })

console.log(`Captured ${captured.length} JSON responses`)
console.log(JSON.stringify(captured[0]?.body, null, 2))

await browser.close()

Install Puppeteer and run the script:

npm install puppeteer
node intercept-network.mjs

How it works

Launch with headless: true. The default headless mode runs the same Chromium build that Puppeteer drives in headed mode, so the requests the page fires are the requests a browser fires. The network listeners attach to the page, not the launch, so the order here is launch, new page, then wire the listeners before you navigate. If you wire them after page.goto returns, the requests fired during navigation never reach your handler, and captured ends up short or empty.

Listen on page.on('request'). This event fires once per outgoing request, before the response comes back. The request object carries method(), url(), resourceType(), headers(), and postData(), which is enough to reconstruct the call as a standalone fetch later. This listener does not block the request, because the script does not call page.setRequestInterception(true); it observes traffic rather than rewriting it. Only turn interception on when you need to rewrite or block requests, and when you do, make every path in the handler end in exactly one of request.continue(), request.abort(), or request.respond(), or the unanswered request hangs page.goto until it times out.
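For the case where you do need interception, here is a minimal sketch of a handler in which every path ends in exactly one answer. The BLOCKED_TYPES list and the function names handleRequest and blockHeavyResources are illustrative assumptions, not part of the script above; the Puppeteer calls used are page.setRequestInterception, request.isInterceptResolutionHandled, request.abort, and request.continue.

```javascript
// Resource types to drop. This list is an assumption for illustration.
const BLOCKED_TYPES = new Set(['image', 'font', 'media'])

// Answers one intercepted request. Every path ends in exactly one of
// abort() or continue(), so no request is left hanging.
function handleRequest(request) {
  // Skip requests that another handler has already answered.
  if (request.isInterceptResolutionHandled()) return 'skipped'

  if (BLOCKED_TYPES.has(request.resourceType())) {
    request.abort()
    return 'aborted'
  }

  request.continue()
  return 'continued'
}

// Turns interception on for an existing Puppeteer page and installs the handler.
// Call this before page.goto so the first requests are covered.
async function blockHeavyResources(page) {
  await page.setRequestInterception(true)
  page.on('request', handleRequest)
}
```

Returning a label from the handler is only there to make the decision easy to check with a stand-in request object; Puppeteer ignores the return value.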

Listen on page.on('response') and read the body. When a response arrives, response.json() parses a JSON body and response.text() returns the raw string. The filter on resourceType() narrows the flood of image, stylesheet, and font responses down to the xhr and fetch calls that carry a single-page app's data. The try/catch matters because response.json() throws on a redirect, which has no body, and on a body that is not valid JSON. For a body served from the disk cache or otherwise unavailable to this event, open a CDP session with const cdp = await page.createCDPSession(); await cdp.send('Network.enable') and call Network.getResponseBody against the request id, which reads straight from the browser's response buffer. With the default Chrome backend, response.json() waits for the body to finish loading before it resolves, so a chunked or streamed response keeps its handler pending until the last chunk arrives; when you only need one such call, grab it with page.waitForResponse and read it once.
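The CDP fallback can be sketched roughly as follows. It pairs Network.responseReceived with Network.loadingFinished by request id and reads the body only once loading has finished. The function names captureBodiesViaCDP and decodeBody and the urlFragment parameter are assumptions made for this example; the protocol events and the Network.getResponseBody method come from the Chrome DevTools Protocol.

```javascript
// Network.getResponseBody returns text as-is and binary content as base64.
function decodeBody(body, base64Encoded) {
  return base64Encoded ? Buffer.from(body, 'base64').toString('utf8') : body
}

// Collects bodies for responses whose URL contains urlFragment (hypothetical
// marker such as '/api/'). Call it before page.goto.
async function captureBodiesViaCDP(page, urlFragment) {
  const cdp = await page.createCDPSession()
  await cdp.send('Network.enable')

  const pending = new Map() // requestId -> response URL
  const bodies = []

  cdp.on('Network.responseReceived', event => {
    if (event.response.url.includes(urlFragment)) {
      pending.set(event.requestId, event.response.url)
    }
  })

  // The body is only guaranteed complete after loadingFinished.
  cdp.on('Network.loadingFinished', async event => {
    const url = pending.get(event.requestId)
    if (url === undefined) return
    pending.delete(event.requestId)
    try {
      const { body, base64Encoded } = await cdp.send('Network.getResponseBody', {
        requestId: event.requestId,
      })
      bodies.push({ url, text: decodeBody(body, base64Encoded) })
    } catch {
      // The browser may already have discarded the body, for example after a navigation.
    }
  })

  return bodies // filled in as responses finish loading
}
```

The returned array fills up over time, so read it after navigation has settled rather than immediately after the call.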

Wait for networkidle0. Passing waitUntil: 'networkidle0' to page.goto resolves once there have been no network connections for 500ms, which gives the deferred XHR and fetch calls time to fire and land in captured. Without it, goto resolves on the initial document load and the script closes the browser before the data requests run. A page that keeps a connection busy with a WebSocket, analytics heartbeat, or polling timer never reaches that idle window, so switch to waitUntil: 'networkidle2', which tolerates up to two open connections, or wait for the one call you want with page.waitForResponse(res => res.url().includes('/api/data')).
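For a page that never goes idle, waiting on one specific response might look like the sketch below. The helper names isTargetResponse and loadAndCapture, the urlFragment parameter, and the 30 second timeout are assumptions for illustration; page.waitForResponse and page.goto are the Puppeteer calls doing the work.

```javascript
// True for the one response we want. urlFragment is a hypothetical marker
// such as '/api/data'. Preflight OPTIONS responses are ignored because they
// carry no body.
function isTargetResponse(response, urlFragment) {
  return (
    response.url().includes(urlFragment) &&
    response.request().method() !== 'OPTIONS'
  )
}

// Navigates and resolves with the parsed JSON of the first matching response.
async function loadAndCapture(page, targetUrl, urlFragment) {
  // Start waiting before navigating so the response cannot slip past.
  const [response] = await Promise.all([
    page.waitForResponse(res => isTargetResponse(res, urlFragment), { timeout: 30_000 }),
    page.goto(targetUrl, { waitUntil: 'domcontentloaded' }),
  ])
  return response.json()
}
```

Using domcontentloaded here is a design choice: the wait for the data call is what decides when the script is done, so the navigation itself does not need to wait for idle.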

Use this when

You want the raw JSON a single-page app loads behind its UI, you are reverse-engineering a site's internal API to call it directly, or you are auditing which third-party endpoints a page contacts and what it sends them.

Skip this when

The data is already in the served HTML (parse it with cheerio instead); you need to block or rewrite requests rather than read them (enable page.setRequestInterception(true) and act in the handler); the endpoint is reachable without a browser (call it directly with fetch and the headers you captured); or you need a full session recording for replay (capture a HAR with a CDP listener and a HAR writer).
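Calling a captured endpoint without a browser could start from a helper like this one, which turns a captured Puppeteer request into arguments for fetch. The name toFetchArgs and the SKIPPED_HEADERS list are assumptions for this sketch; it relies only on request.url(), request.method(), request.headers(), and request.postData().

```javascript
// Headers that fetch sets itself or rejects. This list is an assumption.
const SKIPPED_HEADERS = new Set(['host', 'content-length', 'connection'])

// Builds [url, init] for fetch from a captured Puppeteer request.
function toFetchArgs(request) {
  const headers = {}
  for (const [name, value] of Object.entries(request.headers())) {
    const lower = name.toLowerCase()
    // Drop HTTP/2 pseudo-headers such as :authority, plus the skipped set.
    if (lower.startsWith(':') || SKIPPED_HEADERS.has(lower)) continue
    headers[lower] = value
  }

  const method = request.method()
  const init = { method, headers }

  // GET and HEAD requests must not carry a body.
  const body = request.postData()
  if (body !== undefined && method !== 'GET' && method !== 'HEAD') {
    init.body = body
  }

  return [request.url(), init]
}

// Usage inside the 'request' listener, then later outside the browser:
//   const [url, init] = toFetchArgs(request)
//   const data = await (await fetch(url, init)).json()
```

The captured headers may not include cookies or other credentials the browser attaches separately, so an endpoint that depends on a session can still reject the replayed call.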
