Node.js SDK

Installation

Install the Scrapeless Node SDK with npm:

npm install @scrapeless-ai/sdk

Quick Start

  1. Log in to the Scrapeless dashboard and get your API key.
  2. Set the API key as an environment variable named SCRAPELESS_API_KEY, or pass it as a parameter to the Scrapeless class.

Here’s a basic example:

import { Scrapeless } from '@scrapeless-ai/sdk';
 
const client = new Scrapeless({
  apiKey: 'YOUR_API_KEY' // or use SCRAPELESS_API_KEY env variable
});

Available Services

1. Scraping Browser (Browser Automation Wrapper)

The Scraping Browser module provides a high-level, unified API for browser automation, built on top of the Scrapeless Browser API. It supports both Puppeteer and Playwright, and extends the standard page object with advanced methods such as realClick, realFill, and liveURL for more human-like automation.

import { Puppeteer } from '@scrapeless-ai/sdk';
 
const browser = await Puppeteer.connect({
  session_name: 'my-session',
  session_ttl: 180,
  proxy_country: 'US'
});
const page = await browser.newPage();
 
await page.goto('https://example.com');
await page.realClick('#login-btn');
await page.realFill('#username', 'myuser');
const urlInfo = await page.liveURL();
console.log('Current page URL:', urlInfo.liveURL);
 
await browser.close();

Key Features:

  • Unified API for Puppeteer and Playwright
  • Human-like automation: realClick, realFill, liveURL
  • Easy integration with Scrapeless browser sessions

2. Browser API

Directly manage browser sessions for advanced automation scenarios.

// Create a browser session
const session = await client.browser.create({
  session_name: 'api-session',
  session_ttl: 120,
  proxy_country: 'US'
});
console.log('Browser session info:', session);

3. Scraping API

Scrape web pages and extract content in various formats.

const result = await client.scraping.scrape({
  actor: 'scraper.shopee',
  input: {
    url: 'https://shopee.tw/product/58418206/7180456348'
  }
});
 
console.log('result: ', result);

4. Deep SerpApi

Extract structured results from search engines such as Google.

const searchResults = await client.deepserp.scrape({
  actor: 'scraper.google.search',
  input: {
    q: 'nike site:www.nike.com'
  }
});
console.log('Search results:', searchResults);

5. Universal API

General-purpose scraping for flexible data extraction.

const universalResult = await client.universal.scrape({
  url: 'https://example.com',
  options: {
    javascript: true,
    screenshot: true,
    extractMetadata: true
  }
});
console.log('Universal scraping result:', universalResult);

6. Proxy API

Manage and configure proxies for your scraping and automation tasks.

// Get proxy URL
const proxy_url = await client.proxies.proxy({
  session_name: 'session_name',
  session_ttl: 180,
  proxy_country: 'US',
  session_recording: true,
  defaultViewport: null
});
console.log('Proxy URL:', proxy_url);

Error Handling

The SDK provides comprehensive error handling:

import { ScrapelessError } from '@scrapeless-ai/sdk';
 
try {
  const result = await client.scraping.scrape({
    actor: 'scraper.shopee',
    input: {
      url: 'https://shopee.tw/product/58418206/7180456348'
    }
  });
} catch (error) {
  if (error instanceof ScrapelessError) {
    console.error('Scrapeless error:', error.message);
    console.error('Status code:', error.statusCode);
  } else {
    console.error('Unexpected error:', error);
  }
}

Configuration

The SDK supports various configuration options:

const client = new Scrapeless({
  apiKey: 'YOUR_API_KEY',
  timeout: 30000, // request timeout in milliseconds
  baseApiUrl: 'https://api.scrapeless.com',
  browserApiUrl: 'https://browser.scrapeless.com'
});

Environment Variables

  • SCRAPELESS_API_KEY - Your API key
  • SCRAPELESS_BASE_API_URL - Base API URL
  • SCRAPELESS_BROWSER_API_URL - Browser API URL
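As a sketch of how explicit options and environment variables can be combined, the fallback pattern below illustrates the precedence described above (explicit option first, then environment variable, then default). The `resolveConfig` helper is illustrative only, not the SDK's actual internals:

```javascript
// Illustrative sketch: explicit options take precedence over
// environment variables, which take precedence over defaults.
function resolveConfig(options = {}) {
  return {
    apiKey: options.apiKey ?? process.env.SCRAPELESS_API_KEY,
    baseApiUrl:
      options.baseApiUrl ??
      process.env.SCRAPELESS_BASE_API_URL ??
      'https://api.scrapeless.com',
    browserApiUrl:
      options.browserApiUrl ??
      process.env.SCRAPELESS_BROWSER_API_URL ??
      'https://browser.scrapeless.com'
  };
}
```

With this precedence, `new Scrapeless()` with no arguments can pick up SCRAPELESS_API_KEY from the environment, which keeps credentials out of source code.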

Best Practices

  1. API Key Security: Never hardcode your API key. Use environment variables.
  2. Error Handling: Always wrap API calls in try-catch blocks.
  3. Resource Cleanup: Always close browser connections when done.
  4. Rate Limiting: Be mindful of API rate limits.
  5. Timeout Configuration: Set appropriate timeouts for long-running operations.
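Points 2 and 3 above can be combined into one reusable pattern: acquire the browser, run the task inside try, and release the resource in finally so cleanup happens even when the task throws. The `withBrowser` helper and the fake resource are illustrative, not part of the SDK:

```javascript
// Illustrative helper: guarantees browser.close() runs whether
// the task succeeds or throws (try/finally cleanup pattern).
async function withBrowser(connect, task) {
  const browser = await connect();
  try {
    return await task(browser);
  } finally {
    await browser.close(); // always runs, even on error
  }
}
```

For example, `withBrowser(() => Puppeteer.connect({ session_name: 'my-session' }), page => …)` would ensure the session is closed without repeating cleanup code at every call site.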

Support

For support, documentation, and more examples, visit: