Supported CAPTCHAs

reCaptcha

The Scrapeless crawler browser only solves reCaptcha v2 automatically; any subsequent steps must be implemented in your own code.

Cloudflare

  • Cloudflare Turnstile
  • Cloudflare Challenge

The Scrapeless crawler browser only solves the Turnstile or Challenge automatically; any subsequent steps must be implemented in your own code. For detailed practices on handling Cloudflare challenges (including obtaining cf_clearance), please refer to: https://www.scrapeless.com/en/blog/cloudflare-challenge-bypass
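
As a rough illustration of the cf_clearance step mentioned above, the sketch below picks that cookie out of the list returned by Puppeteer's page.cookies() once the challenge has been passed. The helper names findCfClearance and getCfClearance are hypothetical and not part of the Scrapeless API; the sketch assumes only that Cloudflare stores its clearance token in a cookie named cf_clearance.

```javascript
// Hypothetical helper (not part of Scrapeless or Puppeteer): returns the
// cf_clearance cookie from a Puppeteer-style cookie array, or null if absent.
function findCfClearance(cookies) {
  const cookie = cookies.find((c) => c.name === "cf_clearance");
  if (!cookie) return null;
  return { value: cookie.value, domain: cookie.domain, expires: cookie.expires };
}

// Reads the cookies visible to the current page and extracts cf_clearance.
// `page` is assumed to be a Puppeteer Page that has already passed the challenge.
async function getCfClearance(page) {
  const cookies = await page.cookies();
  return findCfClearance(cookies);
}
```

In a session this would be called after the challenge wait, for example `const clearance = await getCfClearance(page);`, and a null result suggests the challenge was never presented or has not finished yet.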

Solution Example

When you connect to the browser and open the target site, Scrapeless solves the CAPTCHA automatically, but your code still needs to confirm that solving has finished. The simple example below opens the target site, waits for the Captcha.solveFinished CDP event to confirm that the CAPTCHA was solved, and then takes a screenshot of the page for verification.

This example defines two main methods:

  • addCaptchaListener: Used to listen for CAPTCHA events in the browser session
  • onCaptchaFinished: Used to wait for CAPTCHA solving to complete

Supported CAPTCHA list

  • reCaptcha v2
  • Cloudflare Turnstile
  • Cloudflare 5s Challenge
  • AWS Challenge

import puppeteer from "puppeteer-core";
import EventEmitter from 'events';
const emitter = new EventEmitter()
const scrapelessUrl = 'wss://browser.scrapeless.com/browser?token=your_api_key&session_ttl=180&proxy_country=ANY';
 
export async function example(url) {
  const browser = await puppeteer.connect({
    browserWSEndpoint: scrapelessUrl,
    defaultViewport: null
  });
  console.log("Connected to Scrapeless browser");
  try {
    const page = await browser.newPage();
    // Listen for captcha events
    console.debug("addCaptchaListener: Start listening for captcha events");
    await addCaptchaListener(page);
    console.log("Navigating to URL:", url);
    await page.goto(url, { waitUntil: "domcontentloaded", timeout: 30000 });
    console.log("onCaptchaFinished: Waiting for captcha solving to finish...");
    await onCaptchaFinished()
    // Screenshot for debugging
    console.debug("Taking screenshot of the final page...");
    await page.screenshot({ path: 'screenshot.png', fullPage: true });
  } catch (error) {
    console.error(error);
  } finally {
    await browser.close();
    console.log("Browser closed");
  }
}
 
async function addCaptchaListener(page) {
  const client = await page.createCDPSession();
  client.on("Captcha.detected", (msg) => {
    console.debug("Captcha.detected: ", msg);
  });
  client.on("Captcha.solveFinished", async (msg) => {
    console.debug("Captcha.solveFinished: ", msg);
    emitter.emit("Captcha.solveFinished", msg);
    client.removeAllListeners()
  });
}
 
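async function onCaptchaFinished(timeout = 60_000) {
  let timer;
  return Promise.race([
    new Promise((resolve) => {
      // once() removes the listener after the first event
      emitter.once("Captcha.solveFinished", (msg) => {
        resolve(msg);
      });
    }),
    new Promise((_, reject) => {
      timer = setTimeout(() => reject(new Error('Timed out waiting for Captcha.solveFinished')), timeout);
    })
  ]).finally(() => clearTimeout(timer)); // stop the timer once the race is settled
}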
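
One thing the example leaves open is what happens if Captcha.solveFinished fires before onCaptchaFinished starts listening. A minimal sketch of one way to handle that is to start waiting on the CDP session before navigating and to await the result afterwards. The helper names waitForEvent and gotoAndWaitForSolve are hypothetical; the sketch assumes only that the source object exposes on() and off(), as Puppeteer's CDPSession and Node's EventEmitter do.

```javascript
// Hypothetical helper: resolves with the payload of the first `eventName`
// event, or rejects once `timeoutMs` elapses. Works with any object that
// has on()/off(), such as a Puppeteer CDPSession.
function waitForEvent(source, eventName, timeoutMs = 60000) {
  return new Promise((resolve, reject) => {
    const onEvent = (payload) => {
      clearTimeout(timer);
      source.off(eventName, onEvent);
      resolve(payload);
    };
    const timer = setTimeout(() => {
      source.off(eventName, onEvent);
      reject(new Error(`Timed out waiting for ${eventName}`));
    }, timeoutMs);
    source.on(eventName, onEvent);
  });
}

// Arms the wait before navigation so an early event is not missed.
async function gotoAndWaitForSolve(page, url, timeoutMs = 60000) {
  const client = await page.createCDPSession();
  const solved = waitForEvent(client, "Captcha.solveFinished", timeoutMs);
  solved.catch(() => {}); // avoid an unhandled rejection if goto() throws first
  await page.goto(url, { waitUntil: "domcontentloaded", timeout: 30000 });
  return solved; // the caller still sees a rejection if the wait times out
}
```

With this shape, the body of example() could call `await gotoAndWaitForSolve(page, url)` in place of the separate listener, navigation, and wait steps.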

reCaptcha Example

Call the example function defined above to verify automatic reCaptcha solving.

 example('https://recaptcha-demo.appspot.com/recaptcha-v2-checkbox-explicit.php');
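
Since the steps after solving are left to your own code, a common first step is to read the response token from the page. The sketch below is an assumption-based illustration: it relies on the standard reCAPTCHA v2 widget behaviour of exposing grecaptcha.getResponse() and a g-recaptcha-response textarea, and the helper name readRecaptchaToken is hypothetical. Whether the token is visible there after Scrapeless solves the CAPTCHA is worth verifying against your target site.

```javascript
// Hypothetical helper: returns the reCAPTCHA v2 response token from the page,
// or an empty string if no token is available yet.
// `page` is assumed to be a Puppeteer Page.
async function readRecaptchaToken(page) {
  return page.evaluate(() => {
    // Prefer the widget API when it is loaded; it can throw if no widget exists.
    try {
      if (window.grecaptcha && typeof window.grecaptcha.getResponse === "function") {
        const token = window.grecaptcha.getResponse();
        if (token) return token;
      }
    } catch (error) {
      // fall through to the textarea lookup
    }
    // Fall back to the hidden textarea the widget writes the token into.
    const field = document.querySelector('textarea[name="g-recaptcha-response"]');
    return field ? field.value : "";
  });
}
```

A non-empty return value can then be used as the signal to submit the form or continue with the next step of the scrape.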

Cloudflare Turnstile Example

Call the example function defined above to verify automatic Cloudflare Turnstile solving.

 example('https://www.scrapingcourse.com/login/cf-turnstile');

A successful Cloudflare Turnstile solve can be confirmed not only by listening for the Captcha.solveFinished CDP event, but also by polling window.turnstile.getResponse() until it returns a token. This is a complete example:

import puppeteer from "puppeteer-core";
const scrapelessUrl = 'wss://browser.scrapeless.com/browser?token=your_api_key&session_ttl=180&proxy_country=ANY';
 
export async function turnstileExample(url) {
  const browser = await puppeteer.connect({
    browserWSEndpoint: scrapelessUrl,
    defaultViewport: null
  });
  console.log("Connected to Scrapeless browser");
  try {
    const page = await browser.newPage();
    console.log("Navigating to URL:", url);
    await page.goto(url, { waitUntil: "domcontentloaded", timeout: 30000 });
    console.log("waitTurnstile: Waiting for the Turnstile token...");
    await waitTurnstile(page)
    // Screenshot for debugging
    console.debug("Taking screenshot of the final page...");
    await page.screenshot({ path: 'screenshot.png', fullPage: true });
  } catch (error) {
    console.error(error);
  } finally {
    await browser.close();
    console.log("Browser closed");
  }
}
 
async function waitTurnstile(page) {
    await page.waitForFunction(() => {
        return window.turnstile && window.turnstile.getResponse();
    });
    const token = await page.evaluate(() => {
        return window.turnstile.getResponse();
    });
    console.log("Cloudflare Turnstile token:", token);
}
 
turnstileExample('https://www.scrapingcourse.com/login/cf-turnstile');
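
A slightly more compact variant is sketched below. In Puppeteer, waitForFunction resolves to a handle for the truthy value it observed, so the token can be read from that handle instead of evaluating a second time, and an explicit timeout makes the failure mode easier to see. The name waitTurnstileToken, the 60-second default, and the 500 ms polling interval are illustrative choices, not Scrapeless recommendations.

```javascript
// Hypothetical helper: waits until window.turnstile.getResponse() returns a
// non-empty token and returns that token as a string.
// `page` is assumed to be a Puppeteer Page.
async function waitTurnstileToken(page, timeoutMs = 60000) {
  const handle = await page.waitForFunction(
    () => window.turnstile && window.turnstile.getResponse(),
    { timeout: timeoutMs, polling: 500 } // re-check every 500 ms
  );
  try {
    // The handle wraps the truthy value seen in the page, i.e. the token.
    return await handle.jsonValue();
  } finally {
    await handle.dispose();
  }
}
```

If the token never appears, waitForFunction rejects with a timeout error, which the surrounding try/catch in turnstileExample would log.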

Cloudflare Challenge Example

The Cloudflare Challenge is a special case: the site does not always present a challenge, and when it does not, waiting for the CDP event to confirm a successful solve will time out. Waiting for an element that only appears on the page after the challenge has been passed is therefore a more reliable check. This is a complete example:

import puppeteer from "puppeteer-core";
const scrapelessUrl = 'wss://browser.scrapeless.com/browser?token=your_api_key&session_ttl=180&proxy_country=ANY';
 
export async function challengeExample(url) {
  const browser = await puppeteer.connect({
    browserWSEndpoint: scrapelessUrl,
    defaultViewport: null
  });
  console.log("Connected to Scrapeless browser");
  try {
    const page = await browser.newPage();
    console.log("Navigating to URL:", url);
    await page.goto(url, { waitUntil: "domcontentloaded", timeout: 30000 });
    console.log("waitChallenge: Waiting for the post-challenge content...");
    await waitChallenge(page, 'main.page-content .challenge-info')
    // Screenshot for debugging
    console.debug("Taking screenshot of the final page...");
    await page.screenshot({ path: 'screenshot.png', fullPage: true });
  } catch (error) {
    console.error(error);
  } finally {
    await browser.close();
    console.log("Browser closed");
  }
}
 
async function waitChallenge(page, selector) {
    await page.waitForSelector(selector);
    console.log("Cloudflare Challenge completed");
}
 
challengeExample('https://www.scrapingcourse.com/cloudflare-challenge');
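
The selector wait above throws when the element never appears. Where a yes/no answer is more convenient, a wrapper along the following lines could be used. This is only a sketch: the name waitForContent and the 60-second default are hypothetical, and it assumes Puppeteer's timeout errors carry the name "TimeoutError".

```javascript
// Hypothetical helper: resolves to true once `selector` appears, or false if
// it does not appear within `timeoutMs`. Any other error is rethrown.
// `page` is assumed to be a Puppeteer Page.
async function waitForContent(page, selector, timeoutMs = 60000) {
  try {
    await page.waitForSelector(selector, { timeout: timeoutMs });
    return true;
  } catch (error) {
    // Assumption: Puppeteer timeout errors are named "TimeoutError".
    if (error && error.name === "TimeoutError") return false;
    throw error;
  }
}
```

A caller could then branch on the result, for example taking the screenshot only when `await waitForContent(page, 'main.page-content .challenge-info')` returns true.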