CDP API

Scrapeless Scraping Browser extends the standard Chrome DevTools Protocol (CDP) with custom functions for browser automation. This document primarily covers the CDP functions related to CAPTCHA handling.

CAPTCHA Solver Features

Feature Overview

Scraping Browser includes built-in CAPTCHA solving and automatically handles the common CAPTCHA types found on web pages.

Supported CAPTCHA Types

  • reCAPTCHA
  • Cloudflare Turnstile
  • Cloudflare 5s Challenge
  • AWS WAF

Event Monitoring Mechanism

Core Events

Scraping Browser provides three core events to monitor the CAPTCHA solving process:

Event Name              Description
Captcha.detected        CAPTCHA detected
Captcha.solveFinished   CAPTCHA solving finished
Captcha.solveFailed     CAPTCHA solving failed

Event Response Data Structure

Field     Type      Description
type      string    CAPTCHA type: recaptcha, turnstile
success   boolean   Solving result
message   string    Status message: "NOT_DETECTED", "SOLVE_FINISHED", "SOLVE_FAILED", "INVALID"
token?    string    Token returned on success (optional)
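
As a minimal sketch of how a handler might consume this payload, the helper below checks an event result against the documented fields and builds a one-line summary for logging. The function name summarizeCaptchaResult and its return shape are hypothetical and not part of the Scrapeless API; only the field names and message values come from the table above.

```javascript
// Allowed `message` values, taken from the table above.
const KNOWN_MESSAGES = ['NOT_DETECTED', 'SOLVE_FINISHED', 'SOLVE_FAILED', 'INVALID'];

// Hypothetical helper (not part of the Scrapeless API): turns an event
// payload into a one-line summary and flags payloads that do not match
// the documented shape.
function summarizeCaptchaResult(result) {
  if (result === null || typeof result !== 'object') {
    return { ok: false, summary: 'payload is not an object' };
  }
  const { type, success, message, token } = result;
  if (typeof success !== 'boolean' || !KNOWN_MESSAGES.includes(message)) {
    return { ok: false, summary: `unexpected payload: ${JSON.stringify(result)}` };
  }
  // `token` is optional, so only report whether one was supplied.
  const tokenNote = typeof token === 'string' ? 'token present' : 'no token';
  return {
    ok: true,
    summary: `${type || 'unknown'}: ${message} (success=${success}, ${tokenNote})`,
  };
}
```

A handler could call this inside any of the three event listeners and log the summary field, falling back to the raw payload when ok is false.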

Implementation Example

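// Listen for CAPTCHA solving events on a CDP session for the current page
const client = await page.createCDPSession();

// Log every CAPTCHA that is detected
client.on('Captcha.detected', (result) => {
  console.log('Captcha detected:', result);
});

// Wait until solving succeeds, fails, or the 5-minute timeout elapses
await new Promise((resolve, reject) => {
  const timer = setTimeout(
    () => reject(new Error('Captcha solve timeout')),
    5 * 60 * 1000
  );
  client.on('Captcha.solveFinished', (result) => {
    if (result.success) {
      clearTimeout(timer); // stop the pending timeout once solved
      resolve();
    }
  });
  client.on('Captcha.solveFailed', () => {
    clearTimeout(timer);
    reject(new Error('Captcha solve failed'));
  });
});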
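
When the same wait is needed on several pages, the pattern above can be wrapped in a reusable function. The following is a sketch only: waitForCaptcha is a hypothetical name, and it assumes the client object exposes on() and off(), as a Puppeteer CDPSession or a Node.js EventEmitter does. It resolves with the event payload so the caller can read the optional token, and it removes its listeners and timer on every exit path.

```javascript
// Hypothetical helper: resolves with the event payload once solving
// succeeds, rejects on failure or timeout, and always removes its
// listeners and timer. `client` must provide on() and off().
function waitForCaptcha(client, timeoutMs = 5 * 60 * 1000) {
  return new Promise((resolve, reject) => {
    const cleanup = () => {
      clearTimeout(timer);
      client.off('Captcha.solveFinished', onFinished);
      client.off('Captcha.solveFailed', onFailed);
    };
    const onFinished = (result) => {
      // Like the example above, ignore finish events that report no success.
      if (!result || !result.success) return;
      cleanup();
      resolve(result);
    };
    const onFailed = (result) => {
      cleanup();
      const detail = result && result.message ? `: ${result.message}` : '';
      reject(new Error(`Captcha solve failed${detail}`));
    };
    const timer = setTimeout(() => {
      cleanup();
      reject(new Error('Captcha solve timeout'));
    }, timeoutMs);
    client.on('Captcha.solveFinished', onFinished);
    client.on('Captcha.solveFailed', onFailed);
  });
}
```

Removing the listeners matters when the function is called repeatedly on one session; otherwise each call would leave two more handlers attached.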

Advanced Configuration API

Scraping Browser provides advanced APIs for fine-grained control over the CAPTCHA solver and for simulating user input. The following APIs are supported:

API Name               Description
Captcha.setAutoSolve   Controls automatic CAPTCHA solving behavior
Captcha.setToken       Sets the authentication token for the CAPTCHA service
Captcha.setConfig      Configures all CAPTCHA solver parameters
Captcha.solve          Manually triggers the CAPTCHA solving process
Captcha.imageToText    Solves an image CAPTCHA
Agent.click            Simulates a mouse click
Agent.type             Simulates keyboard input
Agent.liveURL          Gets the live URL of the current session page

Detailed API Description

1. Captcha.setConfig

Configures all parameters for the CAPTCHA solver.

await client.send('Captcha.setConfig', {
    config: JSON.stringify(
        {
            apiKey: "your-token",
            autoSolve: true,
            enabledForRecaptcha: true,
            enabledForRecaptchaV3: true,
            enabledForTurnstile: true
        }
    )
});
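
Because the config is sent as a JSON string rather than a nested object, it can help to build the parameters in one place. The sketch below uses a hypothetical helper, buildCaptchaConfigParams, that is not part of the Scrapeless API. The field names are the ones used in the example above; the default values are assumptions chosen to match that example.

```javascript
// Hypothetical helper: merges caller overrides onto defaults and returns
// the params object for client.send('Captcha.setConfig', params).
// Field names come from the example above; the defaults are assumptions.
function buildCaptchaConfigParams(apiKey, overrides = {}) {
  if (typeof apiKey !== 'string' || apiKey.length === 0) {
    throw new TypeError('apiKey must be a non-empty string');
  }
  const config = {
    apiKey,
    autoSolve: true,
    enabledForRecaptcha: true,
    enabledForRecaptchaV3: true,
    enabledForTurnstile: true,
    ...overrides,
  };
  // The config travels as a JSON string, not as a nested object.
  return { config: JSON.stringify(config) };
}
```

The returned object can be passed directly as the second argument to client.send('Captcha.setConfig', ...), for example with { enabledForTurnstile: false } as the overrides to leave Turnstile challenges unsolved.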

2. Captcha.solve

Manually triggers the CAPTCHA solving process.

const { Puppeteer, createPuppeteerCDPSession } = require('@scrapeless-ai/sdk');
 
(async () => {
    const browser = await Puppeteer.connect({
      session_name: 'sdk_test',
      session_ttl: 180,
      proxy_country: 'US',
      session_recording: true,
      defaultViewport: null
    });
    const page = await browser.newPage();
    await page.goto('https://www.scrapeless.com');
    const cdpSession = await createPuppeteerCDPSession(page);
 
    await cdpSession.solveCaptcha({ timeout: 30000 });
})();
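
A manual solve can fail transiently, so callers may want to retry it. The following is a rough sketch, assuming that cdpSession.solveCaptcha() returns a promise that rejects when solving fails; that behavior and the return value are not documented here. The name solveCaptchaWithRetry and the attempts and delayMs options are hypothetical.

```javascript
// Hypothetical helper: calls cdpSession.solveCaptcha() up to `attempts`
// times, treating a rejected promise as a failed attempt (an assumption;
// the SDK's failure behavior is not documented here).
async function solveCaptchaWithRetry(
  cdpSession,
  { attempts = 3, timeout = 30000, delayMs = 1000 } = {}
) {
  if (!Number.isInteger(attempts) || attempts < 1) {
    throw new RangeError('attempts must be a positive integer');
  }
  let lastError;
  for (let attempt = 1; attempt <= attempts; attempt += 1) {
    try {
      // Return whatever the SDK call resolves with on the first success.
      return await cdpSession.solveCaptcha({ timeout });
    } catch (error) {
      lastError = error;
      if (attempt < attempts) {
        // Pause before the next attempt.
        await new Promise((resolve) => setTimeout(resolve, delayMs));
      }
    }
  }
  // Every attempt failed: surface the most recent error.
  throw lastError;
}
```

In the example above, the call to cdpSession.solveCaptcha({ timeout: 30000 }) could be replaced with await solveCaptchaWithRetry(cdpSession, { attempts: 3, timeout: 30000 }).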

3. Agent.click

Simulates a mouse click.

const { Puppeteer, createPuppeteerCDPSession } = require('@scrapeless-ai/sdk');
 
(async () => {
    const browser = await Puppeteer.connect({
      session_name: 'sdk_test',
      session_ttl: 180,
      proxy_country: 'US',
      session_recording: true,
      defaultViewport: null
    });
    const page = await browser.newPage();
    await page.goto('https://www.scrapeless.com');
    const cdpSession = await createPuppeteerCDPSession(page);
 
    await cdpSession.realClick('button');
})();

4. Agent.type

Simulates keyboard input.

const { Puppeteer, createPuppeteerCDPSession } = require('@scrapeless-ai/sdk');
 
(async () => {
    const browser = await Puppeteer.connect({
      session_name: 'sdk_test',
      session_ttl: 180,
      proxy_country: 'US',
      session_recording: true,
      defaultViewport: null
    });
    const page = await browser.newPage();
    await page.goto('https://www.scrapeless.com');
    const cdpSession = await createPuppeteerCDPSession(page);
 
    await cdpSession.realFill('input', 'Hello, Scrapeless!');
})();
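
Click and type calls are often combined to complete a form. The sketch below chains the two SDK methods shown in the examples above, realFill(selector, text) and realClick(selector). The helper name fillAndSubmit and the shape of its fields argument are hypothetical.

```javascript
// Hypothetical helper: fills each field in order, then clicks the submit
// control. `fields` maps CSS selectors to the text to type into them.
async function fillAndSubmit(cdpSession, fields, submitSelector) {
  for (const [selector, text] of Object.entries(fields)) {
    // Await each call so the inputs are filled one after another.
    await cdpSession.realFill(selector, text);
  }
  await cdpSession.realClick(submitSelector);
}
```

For instance, await fillAndSubmit(cdpSession, { 'input[name="q"]': 'Hello, Scrapeless!' }, 'button') would type into one input and then click the first matching button; both selectors here are placeholders.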

5. Agent.liveURL

Gets the live URL of the current session page.

const { Puppeteer, log as Log, createPuppeteerCDPSession } = require('@scrapeless-ai/sdk');
const logger = Log.withPrefix('puppeteer-example');
 
(async () => {
    const browser = await Puppeteer.connect({
      session_name: 'sdk_test',
      session_ttl: 180,
      proxy_country: 'US',
      session_recording: true,
      defaultViewport: null
    });
    const page = await browser.newPage();
    await page.goto('https://www.scrapeless.com');
    const cdpSession = await createPuppeteerCDPSession(page);
 
    const { error, liveURL } = await cdpSession.liveURL();
    if (error) {
      logger.error('Failed to get live URL:', error);
    } else {
      logger.info('Live URL:', liveURL);
    }
    await browser.close();
})();

6. Captcha.imageToText

Solves an image CAPTCHA.

const { Puppeteer, createPuppeteerCDPSession } = require('@scrapeless-ai/sdk');
 
// Wrap in an async function: top-level await is not available with require()
(async () => {
    const browser = await Puppeteer.connect({
      session_name: 'sdk_test',
      session_ttl: 180,
      proxy_country: 'US',
      session_recording: true,
      defaultViewport: null
    });
    const page = await browser.newPage();
    await page.goto('https://www.example.com');
    const cdpSession = await createPuppeteerCDPSession(page);
 
    // Read the CAPTCHA image and fill the recognized text into the input
    await cdpSession.imageToText({
      imageSelector: '.captcha__image',
      inputSelector: 'input[name="captcha"]',
      timeout: 30000,
    });
})();