CDP API
Scrapeless Scraping Browser extends the standard Chrome DevTools Protocol (CDP) with a set of custom functions for browser automation. This document primarily covers the CDP functions related to CAPTCHA handling.
CAPTCHA Solver Features
Feature Overview
Scraping Browser includes a built-in CAPTCHA solver that automatically handles the common CAPTCHA types found on web pages.
Supported CAPTCHA Types
- reCAPTCHA
- Cloudflare Turnstile
- Cloudflare 5s Challenge
- AWS WAF
Event Monitoring Mechanism
Core Events
Scraping Browser provides three core events to monitor the CAPTCHA solving process:
Event Name | Description |
---|---|
Captcha.detected | CAPTCHA detected |
Captcha.solveFinished | CAPTCHA solving finished |
Captcha.solveFailed | CAPTCHA solving failed |
Event Response Data Structure
Field | Type | Description |
---|---|---|
type | string | CAPTCHA type: "recaptcha" or "turnstile" |
success | boolean | Solving result |
message | string | Status message: "NOT_DETECTED", "SOLVE_FINISHED", "SOLVE_FAILED", or "INVALID" |
token? | string | Token returned on success (optional) |
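As a rough illustration of how these fields might be consumed, the sketch below turns an event payload into a one-line log message. The function name `summarizeCaptchaEvent` is hypothetical and not part of the Scrapeless API; it only relies on the `type`, `success`, `message`, and `token` fields listed in the table above.

```javascript
// Hypothetical helper (not part of the Scrapeless API): formats an event
// payload with the fields documented above into a short log message.
function summarizeCaptchaEvent(result) {
  const type = result.type || 'unknown';
  if (result.success) {
    // token is optional, so only report whether one was returned.
    const tokenNote = result.token ? 'token received' : 'no token returned';
    return `${type}: solved (${tokenNote})`;
  }
  return `${type}: not solved (${result.message})`;
}
```

A handler such as `client.on('Captcha.solveFinished', (r) => console.log(summarizeCaptchaEvent(r)))` could then log results without printing the token itself.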
Implementation Example
// Listen for CAPTCHA solving events
const client = await page.createCDPSession();
client.on('Captcha.detected', (result) => {
  console.log('Captcha detected:', result);
});
// Wait until the CAPTCHA is solved, solving fails, or 5 minutes pass
await new Promise((resolve, reject) => {
  const timer = setTimeout(
    () => reject(new Error('Captcha solve timeout')),
    5 * 60 * 1000
  );
  client.on('Captcha.solveFinished', (result) => {
    if (result.success) {
      clearTimeout(timer); // stop the timer so it cannot keep the process alive
      resolve();
    }
  });
  client.on('Captcha.solveFailed', () => {
    clearTimeout(timer);
    reject(new Error('Captcha solve failed'));
  });
});
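If the same wait is needed in several places, it could be wrapped in a reusable function. The sketch below is one possible shape, not an official helper: `waitForCaptchaSolve` is a hypothetical name, and it assumes only that `client` exposes `on()` and `off()`, as Puppeteer's CDP session and Node's `EventEmitter` do. Unlike the example above, it treats a `Captcha.solveFinished` event with `success: false` as a failure instead of waiting for the timeout.

```javascript
// Hypothetical helper: resolves with the Captcha.solveFinished payload and
// rejects on Captcha.solveFailed, an unsuccessful result, or a timeout.
// `client` is any object with on()/off() methods, e.g. a Puppeteer CDP session.
function waitForCaptchaSolve(client, timeoutMs = 5 * 60 * 1000) {
  return new Promise((resolve, reject) => {
    // Remove the timer and both listeners so nothing lingers after settling.
    const cleanup = () => {
      clearTimeout(timer);
      client.off('Captcha.solveFinished', onFinished);
      client.off('Captcha.solveFailed', onFailed);
    };
    const onFinished = (result) => {
      cleanup();
      if (result && result.success) resolve(result);
      else reject(new Error('Captcha solve finished without success'));
    };
    const onFailed = () => {
      cleanup();
      reject(new Error('Captcha solve failed'));
    };
    const timer = setTimeout(() => {
      cleanup();
      reject(new Error('Captcha solve timeout'));
    }, timeoutMs);
    client.on('Captcha.solveFinished', onFinished);
    client.on('Captcha.solveFailed', onFailed);
  });
}
```

Usage would look like `const result = await waitForCaptchaSolve(client, 60 * 1000);`, wrapped in `try`/`catch` to handle the failure and timeout cases.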
Advanced Configuration API
Scraping Browser provides a series of advanced APIs for fine-grained control over the behavior of the CAPTCHA solver. The following APIs are supported:
API Name | Description |
---|---|
Captcha.setAutoSolve | Controls automatic CAPTCHA solving behavior |
Captcha.setToken | Sets the authentication token for the CAPTCHA service |
Captcha.setConfig | Configures all CAPTCHA solver parameters |
Captcha.solve | Manually triggers the CAPTCHA solving process |
Agent.click | Simulates a mouse click |
Agent.type | Simulates keyboard input |
Agent.liveURL | Gets the live URL of the current session page |
Captcha.imageToText | Solves an image CAPTCHA |
Detailed API Description
1. Captcha.setConfig
Configures all parameters for the CAPTCHA solver.
await client.send('Captcha.setConfig', {
  config: JSON.stringify({
    apiKey: "your-token",
    autoSolve: true,
    enabledForRecaptcha: true,
    enabledForRecaptchaV3: true,
    enabledForTurnstile: true
  })
});
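Because the `config` parameter is sent as a JSON string and not as an object, it can help to build the payload in one place. The following is a minimal sketch under that assumption; `buildCaptchaConfigParams` is a hypothetical name, and it uses only the option keys shown in the example above.

```javascript
// Hypothetical helper: merges overrides into the defaults shown above and
// returns the params object for client.send('Captcha.setConfig', params).
function buildCaptchaConfigParams(apiKey, overrides = {}) {
  const config = {
    apiKey,
    autoSolve: true,
    enabledForRecaptcha: true,
    enabledForRecaptchaV3: true,
    enabledForTurnstile: true,
    ...overrides, // later keys win, so callers can switch options off
  };
  // Serialize the whole configuration into the single `config` string field.
  return { config: JSON.stringify(config) };
}
```

For example, `await client.send('Captcha.setConfig', buildCaptchaConfigParams('your-token', { autoSolve: false }));` would send the same configuration with automatic solving turned off.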
2. Captcha.solve
Manually triggers the CAPTCHA solving process.
const { Puppeteer, createPuppeteerCDPSession } = require('@scrapeless-ai/sdk');
(async () => {
  // Connect to a Scraping Browser session
  const browser = await Puppeteer.connect({
    session_name: 'sdk_test',
    session_ttl: 180,
    proxy_country: 'US',
    session_recording: true,
    defaultViewport: null
  });
  const page = await browser.newPage();
  await page.goto('https://www.scrapeless.com');
  const cdpSession = await createPuppeteerCDPSession(page);
  // Manually trigger CAPTCHA solving with the given timeout
  await cdpSession.solveCaptcha({ timeout: 30000 });
  await browser.close();
})();
3. Agent.click
Simulates a mouse click.
const { Puppeteer, createPuppeteerCDPSession } = require('@scrapeless-ai/sdk');
(async () => {
  // Connect to a Scraping Browser session
  const browser = await Puppeteer.connect({
    session_name: 'sdk_test',
    session_ttl: 180,
    proxy_country: 'US',
    session_recording: true,
    defaultViewport: null
  });
  const page = await browser.newPage();
  await page.goto('https://www.scrapeless.com');
  const cdpSession = await createPuppeteerCDPSession(page);
  // Click the first element matching the 'button' selector
  await cdpSession.realClick('button');
  await browser.close();
})();
4. Agent.type
Simulates keyboard input.
const { Puppeteer, createPuppeteerCDPSession } = require('@scrapeless-ai/sdk');
(async () => {
  // Connect to a Scraping Browser session
  const browser = await Puppeteer.connect({
    session_name: 'sdk_test',
    session_ttl: 180,
    proxy_country: 'US',
    session_recording: true,
    defaultViewport: null
  });
  const page = await browser.newPage();
  await page.goto('https://www.scrapeless.com');
  const cdpSession = await createPuppeteerCDPSession(page);
  // Type the text into the element matching the 'input' selector
  await cdpSession.realFill('input', 'Hello, Scrapeless!');
  await browser.close();
})();
5. Agent.liveURL
Gets the live URL of the current session page.
const { Puppeteer, log as Log, createPuppeteerCDPSession } = require('@scrapeless-ai/sdk');
const logger = Log.withPrefix('puppeteer-example');
(async () => {
  // Connect to a Scraping Browser session
  const browser = await Puppeteer.connect({
    session_name: 'sdk_test',
    session_ttl: 180,
    proxy_country: 'US',
    session_recording: true,
    defaultViewport: null
  });
  const page = await browser.newPage();
  await page.goto('https://www.scrapeless.com');
  const cdpSession = await createPuppeteerCDPSession(page);
  // Request the live URL of the current session page
  const { error, liveURL } = await cdpSession.liveURL();
  if (error) {
    logger.error('Failed to get live URL:', error);
  } else {
    logger.info('Live URL:', liveURL);
  }
  await browser.close();
})();
6. Captcha.imageToText
Solves an image CAPTCHA.
const { Puppeteer, createPuppeteerCDPSession } = require('@scrapeless-ai/sdk');
// Wrapped in an async function because top-level await is not available with require()
(async () => {
  const browser = await Puppeteer.connect({
    session_name: 'sdk_test',
    session_ttl: 180,
    proxy_country: 'US',
    session_recording: true,
    defaultViewport: null
  });
  const page = await browser.newPage();
  await page.goto('https://www.example.com');
  const cdpSession = await createPuppeteerCDPSession(page);
  // Read the CAPTCHA image and fill the recognized text into the input field
  await cdpSession.imageToText({
    imageSelector: '.captcha__image',
    inputSelector: 'input[name="captcha"]',
    timeout: 30000,
  });
  await browser.close();
})();