Simplescraper

How to solve a Turnstile / reCAPTCHA challenge programmatically

Updated 2026-06-25 · 6 min read

If your Puppeteer script reaches a form or a gate on a site you are authorized to automate and stalls on a Cloudflare Turnstile widget or a reCAPTCHA v2 checkbox, the page is waiting for a token your headless browser cannot produce on its own. The widget loads, the request to submit never completes, and the script either times out or loops on the same screen. Solving the puzzle in the browser is the hard path; the token the page actually wants can be obtained out of band.

The solution is to read the challenge's sitekey off the page, hand the sitekey and the page URL to a solver service over its HTTP API, poll until the service returns a token, then write that token into the page's hidden response field and continue the flow. It comes to about 90 lines of Node.js (version 18 or later, for the native fetch) with Puppeteer, calling the solver API directly so there is no SDK to pin. If you would rather use a client library, 2Captcha publishes official ones, including 2captcha-python.

Key terms

  • Sitekey. The public widget identifier the page exposes in the data-sitekey attribute of the Turnstile or reCAPTCHA element, which the solver service needs to fetch a matching token.
  • Token. The signed string the challenge produces when solved, posted back to the site to prove the visitor passed; for reCAPTCHA v2 it lands in g-recaptcha-response, for Turnstile in cf-turnstile-response.
  • Solver service. A paid API such as 2Captcha or CapMonster that accepts a sitekey plus URL and returns a token. The exchange is one createTask call followed by polling getTaskResult until the result is ready.
  • Proxyless task. A solver task type (TurnstileTaskProxyless, RecaptchaV2TaskProxyless) where the service uses its own network, so you do not pass it a proxy.
  • Callback. The JavaScript function the widget calls with the token once it has one, which some pages rely on instead of reading the hidden field directly.

Here is what the script does:

  • Launch Puppeteer and navigate to the page that carries the challenge widget.
  • Read the sitekey out of the Turnstile or reCAPTCHA element's data-sitekey attribute.
  • Send the sitekey and page URL to the solver service with a createTask call, then poll getTaskResult until the token comes back.
  • Write the token into the page's hidden response field, then submit the form.

The complete script

js
// solve-challenge.mjs
import puppeteer from 'puppeteer'

/* Solver credentials and target come from the environment, never hardcoded.
   SOLVER_KEY is your 2Captcha (or compatible) API key. */
const SOLVER_KEY = process.env.SOLVER_KEY
const TARGET_URL = process.env.TARGET_URL
const API = 'https://api.2captcha.com'

/* Submit one task and poll getTaskResult until it is ready. The service answers
   "processing" while it works, so this retries up to ~40 times at 5s apart. */
const solve = async (task) => {
  const created = await fetch(`${API}/createTask`, {
    method: 'POST',
    headers: { 'content-type': 'application/json' },
    body: JSON.stringify({ clientKey: SOLVER_KEY, task })
  }).then(r => r.json())

  if (created.errorId !== 0) throw new Error(`createTask: ${created.errorDescription}`)
  const taskId = created.taskId

  for (let attempt = 0; attempt < 40; attempt++) {
    await new Promise(r => setTimeout(r, 5000))
    const result = await fetch(`${API}/getTaskResult`, {
      method: 'POST',
      headers: { 'content-type': 'application/json' },
      body: JSON.stringify({ clientKey: SOLVER_KEY, taskId })
    }).then(r => r.json())

    if (result.errorId !== 0) throw new Error(`getTaskResult: ${result.errorDescription}`)
    if (result.status === 'ready') return result.solution
  }
  throw new Error('solver timed out')
}

const browser = await puppeteer.launch({ headless: true })
const page = await browser.newPage()
await page.goto(TARGET_URL, { waitUntil: 'networkidle2' })

/* Read the sitekey from whichever widget is present. Turnstile uses the
   .cf-turnstile container; reCAPTCHA v2 uses .g-recaptcha. Both expose it as
   data-sitekey. The widget type decides the solver task type and the field name. */
const challenge = await page.evaluate(() => {
  const turnstile = document.querySelector('.cf-turnstile')
  if (turnstile) return { kind: 'turnstile', sitekey: turnstile.dataset.sitekey }
  const recaptcha = document.querySelector('.g-recaptcha')
  if (recaptcha) return { kind: 'recaptcha', sitekey: recaptcha.dataset.sitekey }
  return null
})

if (!challenge) throw new Error('no Turnstile or reCAPTCHA widget found on page')

/* Proxyless task: the solver fetches the token over its own network. The two
   widget kinds map to two task types and two response field names. */
const taskMap = {
  turnstile: { type: 'TurnstileTaskProxyless', field: 'cf-turnstile-response' },
  recaptcha: { type: 'RecaptchaV2TaskProxyless', field: 'g-recaptcha-response' }
}
const { type, field } = taskMap[challenge.kind]

const solution = await solve({
  type,
  websiteURL: TARGET_URL,
  websiteKey: challenge.sitekey
})

/* Turnstile returns solution.token; reCAPTCHA returns solution.gRecaptchaResponse. */
const token = solution.token ?? solution.gRecaptchaResponse

/* Write the token into the hidden field the page submits, creating the field
   if it is missing. This does not fire the widget's data-callback; pages that
   wait on the callback need the extra step described under Gotchas. */
await page.evaluate((field, token) => {
  let input = document.querySelector(`[name="${field}"]`)
  if (!input) {
    input = document.createElement('textarea')
    input.name = field
    input.style.display = 'none'
    document.forms[0]?.appendChild(input)
  }
  input.value = token
}, field, token)

await Promise.all([
  page.waitForNavigation({ waitUntil: 'networkidle2' }).catch(() => {}),
  page.evaluate(() => document.forms[0]?.submit())
])

console.log('Submitted with token:', token.slice(0, 24) + '...')
await browser.close()

bash
npm install puppeteer
SOLVER_KEY=your_key TARGET_URL=https://example.com/login node solve-challenge.mjs

What each step does

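Read credentials and target from the environment. SOLVER_KEY and TARGET_URL come from process.env, so the API key never lands in the source file. The inline form in the run command above does record the key in your shell history, so for real use export it in the session or load it from an untracked file. The script does not check that either variable is set: a missing TARGET_URL fails at page.goto, and a missing SOLVER_KEY surfaces as a createTask error. The script throws on a missing widget rather than submitting an empty token.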
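An early check makes a missing setting fail before the browser starts. The sketch below assumes a hypothetical readConfig helper that is not part of the original script; it only illustrates one way to validate the two variables.

```javascript
// Hypothetical helper (not in the original script): validate settings up front.
const readConfig = (env = process.env) => {
  const missing = ['SOLVER_KEY', 'TARGET_URL'].filter(name => !env[name])
  if (missing.length > 0) {
    throw new Error(`missing environment variables: ${missing.join(', ')}`)
  }
  // new URL() throws a TypeError on a malformed address.
  return { solverKey: env.SOLVER_KEY, targetUrl: new URL(env.TARGET_URL).href }
}

// Usage: const { solverKey, targetUrl } = readConfig()
```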

Launch headless and load the page. puppeteer.launch({ headless: true }) starts Chrome with no window. waitUntil: 'networkidle2' resolves once there have been no more than two network connections for at least 500 ms, which usually, but not always, gives the challenge script time to inject its widget. Reading the sitekey before the widget mounts returns null.
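Because network idleness does not guarantee the widget exists, waiting on the element itself is more reliable. This is a minimal sketch, assuming a hypothetical readChallenge helper that takes the Puppeteer page; it uses page.waitForSelector and ElementHandle.evaluate, and the 15-second timeout is an arbitrary choice.

```javascript
// Hypothetical helper: wait for either widget, then read its kind and sitekey.
const readChallenge = async (page, timeout = 15000) => {
  // Rejects with a timeout error if neither widget appears in time.
  // With a combined selector, the first match in document order wins.
  const handle = await page.waitForSelector('.cf-turnstile, .g-recaptcha', { timeout })

  const challenge = await handle.evaluate(el => ({
    kind: el.classList.contains('cf-turnstile') ? 'turnstile' : 'recaptcha',
    sitekey: el.dataset.sitekey ?? null
  }))

  // Widgets rendered explicitly from JavaScript may carry no data-sitekey.
  if (!challenge.sitekey) {
    throw new Error(`${challenge.kind} widget has no data-sitekey attribute`)
  }
  return challenge
}

// Usage: const challenge = await readChallenge(page)
```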

Find the sitekey and pick the task type. The page.evaluate block checks for the Turnstile container first, then the reCAPTCHA one, and returns the data-sitekey from whichever is present. The widget kind selects the solver type and the response field name through a lookup table, so adding a third widget is one more map entry rather than a branch.
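To illustrate the "one more map entry" point, here is a sketch of the lookup table with a third kind. The buildTask helper, the recaptchaV3 key, and the optional extra function are hypothetical names; the pageAction and minScore fields are assumptions about the solver's v3 task format and should be checked against its documentation.

```javascript
// Lookup table from widget kind to solver task type and response field name.
const taskMap = {
  turnstile: { type: 'TurnstileTaskProxyless', field: 'cf-turnstile-response' },
  recaptcha: { type: 'RecaptchaV2TaskProxyless', field: 'g-recaptcha-response' },
  // Third entry: extra() adds the fields only this kind needs (assumed names).
  recaptchaV3: {
    type: 'RecaptchaV3TaskProxyless',
    field: 'g-recaptcha-response',
    extra: ({ action, minScore = 0.3 }) => ({ pageAction: action, minScore })
  }
}

// Hypothetical helper: turn a detected challenge into a task payload.
const buildTask = (challenge, websiteURL) => {
  const entry = taskMap[challenge.kind]
  if (!entry) throw new Error(`unsupported widget kind: ${challenge.kind}`)
  return {
    field: entry.field,
    task: {
      type: entry.type,
      websiteURL,
      websiteKey: challenge.sitekey,
      ...(entry.extra ? entry.extra(challenge) : {})
    }
  }
}
```

An unknown kind throws instead of sending a task with an undefined type, which keeps a detection bug from costing a paid solve.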

Run createTask, then poll getTaskResult. createTask registers the sitekey and URL and returns a taskId. The service answers status: 'processing' while it works, so the loop waits five seconds between reads and gives up after forty attempts. A non-zero errorId surfaces the service's own error string instead of failing silently.
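The polling half can also be written as a standalone function, which makes the interval and attempt cap adjustable and lets you test it without the network. This is a sketch under assumptions: pollTask and its option names are hypothetical, and the HTTP status check is an addition the original script does not make.

```javascript
// Hypothetical helper: poll getTaskResult until the task is ready.
const pollTask = async (taskId, {
  clientKey,
  api = 'https://api.2captcha.com',
  fetchImpl = fetch,      // injectable so tests can pass a stub
  intervalMs = 5000,
  maxAttempts = 40
} = {}) => {
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    await new Promise(resolve => setTimeout(resolve, intervalMs))

    const response = await fetchImpl(`${api}/getTaskResult`, {
      method: 'POST',
      headers: { 'content-type': 'application/json' },
      body: JSON.stringify({ clientKey, taskId })
    })
    // Surface transport failures before trying to parse the body.
    if (!response.ok) throw new Error(`getTaskResult: HTTP ${response.status}`)

    const result = await response.json()
    if (result.errorId !== 0) throw new Error(`getTaskResult: ${result.errorDescription}`)
    if (result.status === 'ready') return result.solution
  }
  throw new Error(`solver timed out after ${maxAttempts} attempts`)
}

// Usage: const solution = await pollTask(taskId, { clientKey: SOLVER_KEY })
```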

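Inject the token and submit. The token goes into the hidden response field by name, creating the field if the page has not rendered it yet, then document.forms[0].submit() posts the form. Note that form.submit() does not fire the form's submit event, so a page that sends the form from a JavaScript handler needs a click on its submit button instead. The expression solution.token ?? solution.gRecaptchaResponse covers the two property names the solver returns: token for Turnstile and gRecaptchaResponse for reCAPTCHA.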
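A more defensive injection step targets a specific form and fires the widget's data-callback when one is configured. The sketch below is one possible shape, not the script's own code: injectToken and its argument names are hypothetical, and 'form#login' in the usage line is a placeholder selector.

```javascript
// Runs inside the page: pass it to page.evaluate with one serializable argument.
// It must not reference variables from the Node side.
const injectToken = ({ formSelector, widgetSelector, field, token }) => {
  const form = document.querySelector(formSelector)
  if (!form) return { ok: false, reason: `no form matches ${formSelector}` }

  // Reuse the hidden field if the widget rendered it, otherwise create one.
  let input = form.querySelector(`[name="${field}"]`)
  if (!input) {
    input = document.createElement('textarea')
    input.name = field
    input.style.display = 'none'
    form.appendChild(input)
  }
  input.value = token

  // data-callback names a global function the widget would call with the token.
  const name = document.querySelector(widgetSelector)?.dataset.callback
  const fired = typeof window[name] === 'function'
  if (fired) window[name](token)
  return { ok: true, fired }
}

// Usage, assuming `page`, `field`, and `token` from the script above:
// const outcome = await page.evaluate(injectToken, {
//   formSelector: 'form#login',
//   widgetSelector: '.cf-turnstile, .g-recaptcha',
//   field,
//   token
// })
```

Returning a small result object lets the Node side decide whether to submit, instead of failing silently when the form selector matches nothing.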

Gotchas

  • The sitekey read runs before the widget mounts.

    • Issue: querying .cf-turnstile or .g-recaptcha right after goto returns null on pages that inject the widget from a later script, so challenge is empty and the script throws.
    • Fix: wait for the element with await page.waitForSelector('.cf-turnstile, .g-recaptcha', { timeout: 15000 }) before the evaluate, so the read happens after the widget exists.
  • The token expires before you use it.

    • Issue: a reCAPTCHA v2 token is valid for about two minutes and a Turnstile token for about five, and each can be verified only once. A slow solve plus extra page steps can push submission past that window, so the site rejects a token that was good when it arrived.
    • Fix: submit the form in the same run right after injection, and if the page does other work first, request the token last rather than up front.
  • The page waits on a callback, not the hidden field.

    • Issue: some widgets are configured with a data-callback, and the page only enables submit when that function fires, so writing the field alone leaves the button disabled.
    • Fix: read the callback name from the widget's data-callback attribute and invoke window[name](token) inside page.evaluate after setting the field.
  • A datacenter IP gets a harder challenge than the solver was given.

    • Issue: the proxyless task solves against the solver's network, but your browser submits the token from your IP, and a mismatch in network reputation can make the site re-challenge or reject the token.
    • Fix: use the proxied task type (TurnstileTask, RecaptchaV2Task) and pass the same residential proxy your browser uses, so the solve and the submission share a network path.
  • reCAPTCHA v3 returns a score, not a checkbox token.

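    • Issue: this flow targets the v2 checkbox and Turnstile. A reCAPTCHA v3 page typically renders no .g-recaptcha element, so the sitekey lookup fails, and forcing the v2 task type against a v3 sitekey requests the wrong kind of token. Even with the right task type, a solver-produced v3 token can score low because it carries no real interaction history.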
    • Fix: detect v3 by its grecaptcha.execute action call and switch to the RecaptchaV3TaskProxyless task with the page's action name, accepting that low scores still get blocked.
  • The solver succeeds but the form selector is wrong.

    • Issue: document.forms[0] grabs the first form on the page, which on a page with a search box or newsletter form is not the one carrying the challenge, so the token is appended to the wrong form.
    • Fix: target the form explicitly, for example document.querySelector('form#login'), and append the token field to that node rather than to forms[0].
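The proxied-task fix from the list above can be sketched as a small payload builder. The helper name buildProxiedTask is hypothetical, and the proxy field names (proxyType, proxyAddress, proxyPort, proxyLogin, proxyPassword) are assumptions about the solver's task format that you should confirm in its documentation before relying on them.

```javascript
// Proxied task types named in the text; proxy field names are assumptions.
const proxiedTypes = { turnstile: 'TurnstileTask', recaptcha: 'RecaptchaV2Task' }

// Hypothetical helper: build a proxied task from a proxy URL such as
// http://user:pass@10.0.0.1:8080
const buildProxiedTask = (kind, websiteURL, websiteKey, proxyUrl) => {
  const type = proxiedTypes[kind]
  if (!type) throw new Error(`unsupported widget kind: ${kind}`)

  const proxy = new URL(proxyUrl)
  // URL omits a port that equals the scheme default (80 for http),
  // so this sketch requires a non-default port to be spelled out.
  if (!proxy.port) throw new Error('proxy URL needs an explicit port')

  return {
    type,
    websiteURL,
    websiteKey,
    proxyType: proxy.protocol.replace(':', ''),
    proxyAddress: proxy.hostname,
    proxyPort: Number(proxy.port),
    // URL keeps credentials percent-encoded, so decode them.
    proxyLogin: decodeURIComponent(proxy.username),
    proxyPassword: decodeURIComponent(proxy.password)
  }
}
```

For the solve and the submission to share a network path, the browser has to use the same proxy, for example through Chrome's --proxy-server launch argument and page.authenticate for the credentials.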

Use this when

You are authorized to automate the target (your own property, a client engagement with permission, or a site whose terms allow it) and a Turnstile or reCAPTCHA v2 gate stands between your script and a request you are entitled to make. Confirm first that the site's terms permit automated access: where a site uses these challenges specifically to forbid automation, solving the challenge may breach those terms. Respect the site's robots.txt and rate limits regardless.

Skip this when

  • The data has an official API or export: use that and skip the browser entirely.
  • The challenge is reCAPTCHA v3 or Enterprise: switch to the matching scored task type rather than this v2 flow.
  • The gate is a one-off login you can perform by hand: capture the session cookie once and reuse it.
  • The target's terms prohibit automated access: do not scrape it at all.
