
JS Render

The Universal Scraping API is a web content retrieval service that supports complex page rendering and interaction scenarios.

Refer to our API documentation for detailed content.

Basic Request Structure

{
  "actor": "unlocker.webunlocker",
  "input": {
    "url": "https://example.com",
    "js_render": false,
    "headless": false
  },
  "proxy": {
    "country": "US"
  }
}
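As a rough sketch, the request body above could be assembled and posted from Node.js as shown here. The endpoint and the `x-api-token` header are the ones used in the code samples further down this page; the helper names `buildRequest` and `sendRequest` are hypothetical, and the global `fetch` assumes Node.js 18 or newer.

```javascript
// Hypothetical helper: assemble the request body shown above.
function buildRequest(url, { jsRender = false, headless = false, country = "US" } = {}) {
  if (typeof url !== "string" || !/^https?:\/\//.test(url)) {
    throw new TypeError("url must be an absolute http(s) URL");
  }
  return {
    actor: "unlocker.webunlocker",
    input: { url, js_render: jsRender, headless },
    proxy: { country },
  };
}

// Hypothetical helper: POST the body. Assumes Node.js 18+ (global fetch).
async function sendRequest(body, apiKey) {
  const res = await fetch("https://api.scrapeless.com/api/v1/unlocker/request", {
    method: "POST",
    headers: { "x-api-token": apiKey, "Content-Type": "application/json" },
    body: JSON.stringify(body),
  });
  if (!res.ok) throw new Error(`HTTP Error: ${res.status}`);
  return res.json();
}

// Build (but do not send) the basic request.
const body = buildRequest("https://example.com");
console.log(JSON.stringify(body, null, 2));
```

Calling `sendRequest(body, "API Key")` would perform the actual request; it is left out of the demo so the snippet runs without network access.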

Core Features

JavaScript Rendering

JavaScript rendering handles dynamically loaded content and SPAs (Single Page Applications). It runs the request in a complete browser environment, which supports more complex page interactions and rendering requirements.

When js_render=true, the request is made through a browser.

{
  "actor": "unlocker.webunlocker",
  "input": {
    "url": "https://www.google.com/",
    "js_render": true
  },
  "proxy": {
    "country": "US"
  }
}

JavaScript Instructions

JavaScript Instructions are a set of directives that let you interact with a web page dynamically.

These directives let you click elements, fill out and submit forms, or wait for specific elements to appear, which is useful for tasks such as clicking a “read more” button or submitting a form.

{
  "actor": "unlocker.webunlocker",
  "input": {
    "url": "https://example.com",
    "js_render": true,
    "js_instructions": [
      {
        "wait_for": [
          ".dynamic-content",
          30000
        ]
      },
      {
        "click": [
          "#load-more",
          1000
        ]
      },
      {
        "fill": [
          "#search-input",
          "search term"
        ]
      },
      {
        "keyboard": [
          "press",
          "Enter"
        ]
      },
      {
        "evaluate": "window.scrollTo(0, document.body.scrollHeight)"
      }
    ]
  }
}

Here are some common actions you can perform with JavaScript Instructions:

JavaScript Instructions Reference

| Instruction | Syntax | Description | Example |
| --- | --- | --- | --- |
| wait_for | [selector, timeout] | Wait for element to appear | {"wait_for": [".content", 30000]} |
| click | [selector, delay] | Click element | {"click": [".button", 1000]} |
| fill | [selector, value] | Fill form | {"fill": ["#input", "text"]} |
| wait | milliseconds | Fixed wait time | {"wait": 2000} |
| evaluate | javascript_code | Execute JS code | {"evaluate": "console.log('test')"} |
| keyboard | [action, value, delay?] | Keyboard operation | See keyboard operations table below |

Keyboard Operations

| Operation | Syntax | Description | Example |
| --- | --- | --- | --- |
| Press key | ["press", keyInput] | Press a specific key (keyInput) | {"keyboard": ["press", "Enter"]} |
| Type text | ["type", text, delay?] | Type text with optional delay | {"keyboard": ["type", "Hello", 20]} |
| Key down | ["down", key] | Hold down a key | {"keyboard": ["down", "Shift"]} |
| Key up | ["up", key] | Release a key | {"keyboard": ["up", "Shift"]} |

Supported Special KeyInput types: https://pptr.dev/api/puppeteer.keyinput
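The instruction shapes in the two tables above can be generated with small builder functions, which helps avoid mistakes in argument order. This is a minimal sketch: the `instructions` object and `checkInstructions` are hypothetical helpers, not part of the API, and the check only verifies that each step has exactly one of the instruction keys listed above.

```javascript
// Hypothetical builders for the instruction shapes in the reference tables.
const instructions = {
  waitFor: (selector, timeoutMs) => ({ wait_for: [selector, timeoutMs] }),
  click: (selector, delayMs) => ({ click: [selector, delayMs] }),
  fill: (selector, value) => ({ fill: [selector, value] }),
  wait: (ms) => ({ wait: ms }),
  evaluate: (code) => ({ evaluate: code }),
  press: (key) => ({ keyboard: ["press", key] }),
  type: (text, delayMs) => ({
    keyboard: delayMs === undefined ? ["type", text] : ["type", text, delayMs],
  }),
};

// Instruction keys taken from the reference table.
const KNOWN_KEYS = new Set(["wait_for", "click", "fill", "wait", "evaluate", "keyboard"]);

// Reject steps that do not consist of exactly one known instruction key.
function checkInstructions(list) {
  list.forEach((step, i) => {
    const keys = Object.keys(step);
    if (keys.length !== 1 || !KNOWN_KEYS.has(keys[0])) {
      throw new Error(`instruction ${i} must have exactly one known key, got: ${keys.join(", ")}`);
    }
  });
  return list;
}

const jsInstructions = checkInstructions([
  instructions.waitFor(".dynamic-content", 30000),
  instructions.fill("#search-input", "search term"),
  instructions.press("Enter"),
  instructions.evaluate("window.scrollTo(0, document.body.scrollHeight)"),
]);

console.log(JSON.stringify(jsInstructions, null, 2));
```

The resulting array can be assigned to `input.js_instructions` in the request body.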

Response Type

You can filter the extracted values with the outputs parameter, in which case the result is returned as a JSON string. You can also request other return types, such as html or markdown, with the response_type parameter.

Output Filters

You can filter JSON-formatted data using the outputs parameter. Once it is specified, the response is always returned as a JSON string. This parameter lets you define exactly which data types to extract from the scraped HTML, so that only the required information is retrieved. This reduces processing time and keeps the result focused on the data relevant to your use case.

This parameter accepts a comma-separated list of filter types and returns results in structured JSON string format. The allowable filter types include:

phone_numbers, headings, images, audios, videos, links, menus, hashtags, emails, metadata, tables, favicon.

For detailed usage, check the code below.

const axios = require('axios');
const fs = require('fs');

(async () => {
    // Configuration
    const url = "https://api.scrapeless.com/api/v1/unlocker/request";
    const token = "API Key"; // replace with your API key

    const headers = {"x-api-token": token, "Content-Type": "application/json"};

    const payload = {
        actor: "unlocker.webunlocker",
        input: {
            url: "https://www.example.com",
            js_render: true, // must be true
            outputs: "phone_numbers, headings, images, audios, videos, links, menus, hashtags, emails, metadata, tables, favicon" // outputs filter
        },
        proxy: {
            country: "ANY"
        }
    };

    try {
        const response = await axios.post(url, payload, {headers, timeout: 60000});

        if (response.status !== 200) {
            throw new Error(`HTTP Error: ${response.status}`);
        }

        const data = response.data;
        if (data.code !== 200) {
            // Serialize the body so the error message is readable
            throw new Error(`API Error: ${JSON.stringify(data)}`);
        }

        // data.data is a JSON string containing the filtered values
        const content = data.data || '';

        // Save and return result
        fs.writeFileSync('response.json', content, 'utf8');
        console.log('✅ Success! Content saved as response.json');

        return JSON.parse(content);

    } catch (error) {
        console.error('❌ Error:', error.message);
        throw error;
    }
})();
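The following is a small sketch of how the outputs value might be assembled and how the JSON-string result might be decoded. The allowed filter names are the ones listed above; the helper names `buildOutputs` and `parseOutputs` are hypothetical, and joining the names with a bare comma is an assumption.

```javascript
// Filter names listed in this section.
const ALLOWED_OUTPUTS = [
  "phone_numbers", "headings", "images", "audios", "videos", "links",
  "menus", "hashtags", "emails", "metadata", "tables", "favicon",
];

// Hypothetical helper: validate filter names and build the outputs string.
function buildOutputs(filters) {
  const unknown = filters.filter((f) => !ALLOWED_OUTPUTS.includes(f));
  if (unknown.length > 0) {
    throw new Error(`Unknown output filter(s): ${unknown.join(", ")}`);
  }
  // Remove duplicates while keeping the original order.
  return [...new Set(filters)].join(",");
}

// Hypothetical helper: the "data" field is itself a JSON string,
// so it has to be parsed a second time.
function parseOutputs(apiResponse) {
  if (apiResponse.code !== 200) {
    throw new Error(`API Error: ${JSON.stringify(apiResponse)}`);
  }
  return JSON.parse(apiResponse.data);
}

const outputs = buildOutputs(["emails", "links", "emails"]);
console.log(outputs);

// Sample response in the shape documented in this section.
const sample = { code: 200, data: JSON.stringify({ emails: ["market@scrapeless.com"] }) };
console.log(parseOutputs(sample).emails);
```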
 

Here are some examples:

Emails

Use CSS selectors and regular expressions to extract standard-format email addresses, such as example@example.com.

{
    "code": 200,
    "data": "{\"emails\":[\"market@scrapeless.com\"]}"
}
 
Phone Numbers

Use CSS selectors and regular expressions to extract phone numbers, with a focus on links containing the tel: protocol.

example:outputs=phone_numbers

{
    "code": 200,
    "data": "{ \"phone_numbers\": [ \"+1-111-111-111\" ] }"
}
 
Headings

Extract heading texts from H1 to H6 in HTML.

example:outputs=headings

{
    "code": 200,
    "data": "{\"headings\":[\"Example Domain\"]}"
}
 
Images

Extract image sources from img tags. Only return the src attribute.

example:outputs=images

{
    "code": 200,
    "data": "{\"images\":[\"https://www.scrapeless.com/_next/image?url=%2Fassets%2Fimages%2Ftoolkit%2Flight%2Fimg-2.png&w=750&q=100\"]}"
}
 
Audios

Extract audio sources from source elements within audio tags. Only return the src attribute.

example:outputs=audios

{
    "code": 200,
    "data": "{\"audios\":[\"https://example.com/audio.mp3\"]}"
}
 
Videos

Extract video sources from source elements within video tags. Only return the src attribute.

example:outputs=videos

{
    "code": 200,
    "data": "{\"videos\":[\"https://example.com/video.mp4\"]}"
}
 

Links

Extract URLs from a tags. Only return the href attribute.

example:outputs=links

{
    "code": 200,
    "data": "{\"links\":[\"https://app.scrapeless.com/landing/guide\",\"https://www.scrapeless.com/en\",\"https://www.scrapeless.com/en/pricing\",\"https://docs.scrapeless.com/\",\"https://backend.scrapeless.com/app/api\",\"https://www.producthunt.com/posts/scrapeless-deep-serpapi\",\"https://www.g2.com/products/scrapeless/reviews\",\"https://www.trustpilot.com/review/scrapeless.com\",\"https://slashdot.org/software/p/Scrapeless/\",\"https://tekpon.com/software/scrapeless/reviews/\",\"https://www.scrapeless.com/en/product/deep-serp-api\",\"https://www.scrapeless.com/en/product/scraping-browser\",\"https://www.scrapeless.com/en/product/scraping-api\",\"https://www.scrapeless.com/en/product/universal-scraping-api\",\"https://www.scrapeless.com/en/solutions/e-commerce\",\"https://www.scrapeless.com/en/solutions/seo\",\"https://www.scrapeless.com/en/solutions/real-estate\",\"https://www.scrapeless.com/en/solutions/travel-hotel-airline\",\"https://www.scrapeless.com/en/solutions/social-media\",\"https://www.scrapeless.com/en/solutions/market-research\",\"https://www.scrapeless.com/en/blog\",\"https://www.scrapeless.com/en/blog/deep-serp-api-online\",\"https://www.scrapeless.com/en/blog/scrapeless-web-scraping-toolkit\",\"https://www.scrapeless.com/en/blog/google-shopping-scrape\",\"https://backend.scrapeless.com/app/api/v1/public/links/github\",\"https://backend.scrapeless.com/app/api/v1/public/links/youtube\",\"mailto:market@scrapeless.com\",\"https://www.scrapeless.com/en/ai-agent\",\"https://browserless.scrapeless.com/\",\"https://www.scrapeless.com/en/solutions/temu\",\"https://www.scrapeless.com/en/solutions/walmart\",\"https://www.scrapeless.com/en/solutions/shopee\",\"https://www.scrapeless.com/en/solutions/lazada\",\"https://www.scrapeless.com/en/solutions/amazon\",\"https://www.scrapeless.com/en/solutions/google-trends\",\"https://www.scrapeless.com/en/solutions/google-search\",\"https://www.scrapeless.com/en/solutions/airbnb\",\"https://www.scrapeless.com/en/solutions/scoot\",\"https://www.scrapeless.com/en/solutions/latam\",\"https://www.scrapeless.com/en/solutions/localiza\",\"https://www.scrapeless.com/en/solutions/tiktok\",\"https://www.scrapeless.com/en/solutions/instagram\",\"https://www.scrapeless.com/en/integration\",\"https://www.scrapeless.com/en/faq\",\"https://www.scrapeless.com/en/glossary\",\"https://www.scrapeless.com/en/legal/privacy-policy\",\"https://www.scrapeless.com/en/legal/terms\",\"https://www.scrapeless.com/en/legal/terms#refund-policy\",\"https://www.scrapeless.com/en/legal/check-your-data\",\"https://backend.scrapeless.com/app/api/v1/public/links/discord\"]}"
}
 

Menus

Extract menu items from li elements within menu tags.

example:outputs=menus

{
    "code": 200,
    "data": "{\"menus\":[ \"Coffee\", \"Tea\", \"Milk\" ]}"
}
 
Hashtags

Extract hashtag formats using regular expressions to match typical hashtag patterns, such as #example.

example:outputs=hashtags

{
    "code": 200,
    "data": "{\"hashtags\":[\"#docsearch\",\"#search\"]}"
}
 
Metadata

Extract meta information from meta tags in the head section, returning the name and content attributes in the format name: content.

example:outputs=metadata

{
    "code": 200,
    "data": "{\"metadata\":[\"viewport: width=device-width, initial-scale=1\",\"description: Scrapeless is the best full-stack web scraping toolkit offering Scraping API, Scraping Browser\"]}"
}
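Because each metadata entry is a single "name: content" string, it can be convenient to turn the list into a lookup object. The sketch below assumes every entry is split at the first ": " separator; `parseMetadata` is a hypothetical helper, and when a name occurs more than once the last entry wins.

```javascript
// Hypothetical helper: convert ["name: content", ...] into { name: content }.
function parseMetadata(entries) {
  const result = {};
  for (const entry of entries) {
    const idx = entry.indexOf(": ");
    if (idx === -1) continue; // skip entries that do not follow "name: content"
    result[entry.slice(0, idx)] = entry.slice(idx + 2);
  }
  return result;
}

// Sample response in the shape shown above (description shortened).
const sample = {
  code: 200,
  data: JSON.stringify({
    metadata: [
      "viewport: width=device-width, initial-scale=1",
      "description: Scrapeless is the best full-stack web scraping toolkit",
    ],
  }),
};

const meta = parseMetadata(JSON.parse(sample.data).metadata);
console.log(meta.viewport); // width=device-width, initial-scale=1
```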
 
Tables

Extract data from table elements and return the table data in JSON format, including dimensions, headings, and content.

example:outputs=tables

{
    "code": 200,
    "data": "{\"tables\":[{\"dimensions\":{\"rows\":7,\"columns\":3,\"heading\":true},\"heading\":[\"Company\",\"Contact\",\"Country\"],\"content\":[{\"Company\":\"Alfreds Futterkiste\",\"Contact\":\"Maria Anders\",\"Country\":\"Germany\"},{\"Company\":\"Centro comercial Moctezuma\",\"Contact\":\"Francisco Chang\",\"Country\":\"Mexico\"},{\"Company\":\"Ernst Handel\",\"Contact\":\"Roland Mendel\",\"Country\":\"Austria\"},{\"Company\":\"Island Trading\",\"Contact\":\"Helen Bennett\",\"Country\":\"UK\"},{\"Company\":\"Laughing Bacchus Winecellars\",\"Contact\":\"Yoshi Tannamuri\",\"Country\":\"Canada\"},{\"Company\":\"Magazzini Alimentari Riuniti\",\"Contact\":\"Giovanni Rovelli\",\"Country\":\"Italy\"}]},{\"dimensions\":{\"rows\":11,\"columns\":2,\"heading\":true},\"heading\":[\"Tag\",\"Description\"],\"content\":[{\"Tag\":\"<table>\",\"Description\":\"Defines a table\"},{\"Tag\":\"<th>\",\"Description\":\"Defines a header cell in a table\"},{\"Tag\":\"<tr>\",\"Description\":\"Defines a row in a table\"},{\"Tag\":\"<td>\",\"Description\":\"Defines a cell in a table\"},{\"Tag\":\"<caption>\",\"Description\":\"Defines a table caption\"},{\"Tag\":\"<colgroup>\",\"Description\":\"Specifies a group of one or more columns in a table for formatting\"},{\"Tag\":\"<col>\",\"Description\":\"Specifies column properties for each column within a <colgroup> element\"},{\"Tag\":\"<thead>\",\"Description\":\"Groups the header content in a table\"},{\"Tag\":\"<tbody>\",\"Description\":\"Groups the body content in a table\"},{\"Tag\":\"<tfoot>\",\"Description\":\"Groups the footer content in a table\"}]}]}"
}
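A table in this format can be flattened into rows for further processing, for example before writing CSV. This is a minimal sketch based only on the heading and content fields shown above; `tableToRows` is a hypothetical helper, and cells missing from a record are filled with an empty string.

```javascript
// Hypothetical helper: turn one extracted table into an array of rows,
// with the heading as the first row.
function tableToRows(table) {
  const header = table.heading;
  const rows = table.content.map((record) => header.map((col) => record[col] ?? ""));
  return [header, ...rows];
}

// Shortened version of the first table in the sample response above.
const table = {
  dimensions: { rows: 3, columns: 3, heading: true },
  heading: ["Company", "Contact", "Country"],
  content: [
    { Company: "Alfreds Futterkiste", Contact: "Maria Anders", Country: "Germany" },
    { Company: "Island Trading", Contact: "Helen Bennett", Country: "UK" },
  ],
};

for (const row of tableToRows(table)) {
  console.log(row.join(" | "));
}
```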
 
Favicon

Extract the favicon URL from the link element in the HTML head section.

example:outputs=favicon

{
    "code": 200,
    "data": "{\"favicon\":\"https://www.scrapeless.com/favicon.ico\"}"
}
 

Other formats

In addition to filtering JSON data through the outputs parameter, you can also specify more return value types by designating the response_type parameter. The optional values are: html | plaintext | markdown | png/jpeg, with the default value being html. The details are as follows:

HTML

Returns the HTML content of the page as an escaped HTML string. This works best for purely static pages.

Add response_type=html to the request:

const axios = require('axios');
const fs = require('fs');
 
(async () => {
    const payload = {
        actor: "unlocker.webunlocker",
        input: {
            url: "https://www.example.com",
            js_render: true,
            response_type: "html"
        },
        proxy: {
            country: "ANY"
        }
    };
 
    const response = await axios.post("https://api.scrapeless.com/api/v1/unlocker/request", payload, {
        headers: {
            "x-api-token": "API Key",
            "Content-Type": "application/json"
        },
        timeout: 60000
    });
 
    if (response.data?.code === 200) {
        fs.writeFileSync('response.html', response.data.data, 'utf8');
    }
})();

Returns text content in HTML format.

{
    "code": 200,
    "data": "<!DOCTYPE html><html><head>\n    <title>Example Domain</title>\n\n    <meta charset=\"utf-8\">\n    <meta http-equiv=\"Content-type\" content=\"text/html; charset=utf-8\">\n    <meta name=\"viewport\" content=\"width=device-width, initial-scale=1\">\n    <style type=\"text/css\">\n    body {\n        background-color: #f0f0f2;\n        margin: 0;\n        padding: 0;\n        font-family: -apple-system, system-ui, BlinkMacSystemFont, \"Segoe UI\", \"Open Sans\", \"Helvetica Neue\", Helvetica, Arial, sans-serif;\n        \n    }\n    div {\n        width: 600px;\n        margin: 5em auto;\n        padding: 2em;\n        background-color: #fdfdff;\n        border-radius: 0.5em;\n        box-shadow: 2px 3px 7px 2px rgba(0,0,0,0.02);\n    }\n    a:link, a:visited {\n        color: #38488f;\n        text-decoration: none;\n    }\n    @media (max-width: 700px) {\n        div {\n            margin: 0 auto;\n            width: auto;\n        }\n    }\n    </style>    \n</head>\n\n<body>\n<div>\n    <h1>Example Domain</h1>\n    <p>This domain is for use in illustrative examples in documents. You may use this\n    domain in literature without prior coordination or asking for permission.</p>\n    <p><a href=\"https://www.iana.org/domains/example\">More information...</a></p>\n</div>\n\n\n</body></html>"
}

The example content of the HTML file after saving:

<!DOCTYPE html><html><head>
    <title>Example Domain</title>
 
    <meta charset="utf-8">
    <meta http-equiv="Content-type" content="text/html; charset=utf-8">
    <meta name="viewport" content="width=device-width, initial-scale=1">
    <style type="text/css">
    body {
        background-color: #f0f0f2;
        margin: 0;
        padding: 0;
        font-family: -apple-system, system-ui, BlinkMacSystemFont, "Segoe UI", "Open Sans", "Helvetica Neue", Helvetica, Arial, sans-serif;
 
    }
    div {
        width: 600px;
        margin: 5em auto;
        padding: 2em;
        background-color: #fdfdff;
        border-radius: 0.5em;
        box-shadow: 2px 3px 7px 2px rgba(0,0,0,0.02);
    }
    a:link, a:visited {
        color: #38488f;
        text-decoration: none;
    }
    @media (max-width: 700px) {
        div {
            margin: 0 auto;
            width: auto;
        }
    }
    </style>
</head>
 
<body>
<div>
    <h1>Example Domain</h1>
    <p>This domain is for use in illustrative examples in documents. You may use this
    domain in literature without prior coordination or asking for permission.</p>
    <p><a href="https://www.iana.org/domains/example">More information...</a></p>
</div>
 
</body></html>

Plaintext

The plain text feature is an output option that returns scraped content in plain text format instead of HTML or Markdown. This feature is highly practical when a clean, unformatted version of the content (free of any HTML tags or Markdown formatting) is required. It streamlines the content extraction process, making text processing or analysis more convenient.

Add response_type=plaintext to the request:

const axios = require('axios');
const fs = require('fs');
 
(async () => {
    const payload = {
        actor: "unlocker.webunlocker",
        input: {
            url: "https://www.example.com",
            js_render: true,
            response_type: "plaintext"
        },
        proxy: {
            country: "ANY"
        }
    };
 
    const response = await axios.post("https://api.scrapeless.com/api/v1/unlocker/request", payload, {
        headers: {
            "x-api-token": "API Key",
            "Content-Type": "application/json"
        },
        timeout: 60000
    });
 
    if (response.data?.code === 200) {
        fs.writeFileSync('response.txt', response.data.data, 'utf8');
    }
})();

Returns the plain text content of the page as a string. See the example below.

{
    "code": 200,
    "data": "Example Domain\n\nThis domain is for use in illustrative examples in documents. You may use this domain in literature without prior coordination or asking for permission.\n\nMore information..."
}

The example content of the txt file after saving:

Example Domain

This domain is for use in illustrative examples in documents. You may use this domain in literature without prior coordination or asking for permission.

More information...

Markdown

Extracts page content in Markdown format; this works best for purely static pages. By adding response_type=markdown to the request parameters, the Universal Scraping API returns content in Markdown format, making it more readable and easier to process.

Add response_type=markdown to the request:

const axios = require('axios');
const fs = require('fs');
 
(async () => {
    const payload = {
        actor: "unlocker.webunlocker",
        input: {
            url: "https://www.example.com",
            js_render: true,
            response_type: "markdown"
        },
        proxy: {
            country: "ANY"
        }
    };
 
    const response = await axios.post("https://api.scrapeless.com/api/v1/unlocker/request", payload, {
        headers: {
            "x-api-token": "API Key",
            "Content-Type": "application/json"
        },
        timeout: 60000
    });
 
    if (response.data?.code === 200) {
        fs.writeFileSync('response.md', response.data.data, 'utf8');
    }
})();
 

Returns text content in Markdown format.

{
    "code": 200,
    "data": "# Example Domain\n\nThis domain is for use in illustrative examples in documents. You may use this domain in literature without prior coordination or asking for permission.\n\n[More information...](https://www.iana.org/domains/example)"
}

The example content after saving a Markdown file:

# Example Domain
 
This domain is for use in illustrative examples in documents. You may use this domain in literature without prior coordination or asking for permission.
 
[More information...](https://www.iana.org/domains/example)
 
PNG/JPEG

By adding response_type=png or response_type=jpeg to the request, you can capture a screenshot of the target page and receive it as a PNG or JPEG image. In either case, set response_image_full_page=true to capture the full page instead of only the visible area. The default value of response_image_full_page is false.

Add response_type=png to the request:

const axios = require('axios');
const fs = require('fs');
 
(async () => {
    const payload = {
        actor: "unlocker.webunlocker",
        input: {
            url: "https://www.example.com",
            js_render: true,
            response_type: "png", // png or jpeg
            response_image_full_page: true
        },
        proxy: {
            country: "ANY"
        }
    };
 
    const response = await axios.post("https://api.scrapeless.com/api/v1/unlocker/request", payload, {
        headers: {
            "x-api-token": "API Key",
            "Content-Type": "application/json"
        },
        timeout: 60000
    });
 
    if (response.data?.code === 200) {
        fs.writeFileSync('response.png', Buffer.from(response.data.data, 'base64'));
    }
})();

Returns the PNG or JPEG image as a base64-encoded string.

{
    "code": 200,
    "data": "JVBERi0xLjQKJdPr6eEKM..."
}
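The four response types differ only in file extension and in how the data string is decoded, so the save step can be shared. The sketch below is one possible way to do that: `toFile` and `sniffImageType` are hypothetical helpers, the .html/.txt/.md/.png names follow the examples above, and the .jpeg extension is an assumption. The signature check relies on the standard PNG and JPEG file headers.

```javascript
// Hypothetical mapping from response_type to file extension and encoding.
const FORMATS = new Map([
  ["html", { ext: "html", encoding: "utf8" }],
  ["plaintext", { ext: "txt", encoding: "utf8" }],
  ["markdown", { ext: "md", encoding: "utf8" }],
  ["png", { ext: "png", encoding: "base64" }],
  ["jpeg", { ext: "jpeg", encoding: "base64" }], // extension is an assumption
]);

// Decode the "data" string of a response into a file name plus bytes.
function toFile(responseType, data) {
  const format = FORMATS.get(responseType);
  if (!format) throw new Error(`Unsupported response_type: ${responseType}`);
  return {
    name: `response.${format.ext}`,
    contents: Buffer.from(data, format.encoding),
  };
}

// Identify an image by its leading bytes (PNG and JPEG file signatures).
function sniffImageType(buffer) {
  const png = [0x89, 0x50, 0x4e, 0x47, 0x0d, 0x0a, 0x1a, 0x0a];
  if (png.every((byte, i) => buffer[i] === byte)) return "png";
  if (buffer[0] === 0xff && buffer[1] === 0xd8 && buffer[2] === 0xff) return "jpeg";
  return "unknown";
}

// Demo with a fabricated payload that contains only the PNG signature.
const fakePng = Buffer.from([0x89, 0x50, 0x4e, 0x47, 0x0d, 0x0a, 0x1a, 0x0a]).toString("base64");
const file = toFile("png", fakePng);
console.log(file.name, sniffImageType(file.contents)); // response.png png
```

Writing the result to disk is then a single call such as `fs.writeFileSync(file.name, file.contents)`. Checking the signature before saving guards against storing an error body under an image file name.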

[Image: the example file after saving as PNG/JPEG]

Resource Control

Resource control lets you block selected resources from loading in order to optimize performance and bandwidth usage. The resources list blocks by resource type; the optional urls list blocks by URL pattern.

{
  "actor": "unlocker.webunlocker",
  "input": {
    "url": "https://example.com",
    "js_render": true,
    "block": {
      "resources": [
        "Image",
        "Font",
        "Stylesheet",
        "Script"
      ],
      "urls": [
        "*.analytics.com/*",
        "*/ads/*"
      ]
    }
  }
}

Complete Resource Types Reference:

| Resource Type | Description | Impact |
| --- | --- | --- |
| Document | Main document and iframes | Core page content |
| Stylesheet | CSS files | Page styling and layout |
| Image | Images and icons | Visual content |
| Media | Audio and video resources | Multimedia content |
| Font | Web fonts | Text rendering |
| Script | JavaScript files | Page functionality |
| TextTrack | Video subtitles and captions | Media accessibility |
| XHR | XMLHttpRequest calls | Legacy async requests |
| Fetch | Fetch API requests | Modern async requests |
| Prefetch | Prefetched resources | Performance optimization |
| EventSource | Server-sent events | Real-time updates |
| WebSocket | WebSocket connections | Bidirectional communication |
| Manifest | Web app manifests | PWA configuration |
| SignedExchange | Signed HTTP exchanges | Content authenticity |
| Ping | Ping requests | Analytics and tracking |
| CSPViolationReport | CSP violation reports | Security monitoring |
| Preflight | CORS preflight requests | Cross-origin security |
| Other | Unclassified resources | Miscellaneous |

Usage Example:

{
  "actor": "unlocker.webunlocker",
  "input": {
    "url": "https://example.com",
    "js_render": true,
    "block": {
      "resources": [
        "Image",
        "Font",
        "Stylesheet",
        "Script",
        "Media",
        "Ping",
        "Prefetch"
      ]
    }
  }
}

Best Practices for Resource Blocking:

  1. Performance Optimization

    • Enable js_render only when necessary
    • Use resource blocking wisely: block non-essential resources for faster loading
    • Consider blocking Prefetch and Ping for reduced network usage
    • Keep Document and critical Script resources unblocked
  2. Bandwidth Management

    • Block Image and Media for bandwidth-intensive pages
    • Consider blocking Font to use system fonts instead
  3. Stability Enhancement

    • Implement request retry mechanisms
    • Add error handling logic
    • Use wait_for instead of fixed wait
  4. Resource Efficiency

    • Load resources on demand
    • Close unnecessary connections promptly
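The retry and error-handling advice above can be sketched as a small wrapper with exponential backoff. This is an illustration under stated assumptions, not the service's own retry logic: `withRetry`, its option names, and the default values are hypothetical, and in practice the task would be the function that posts the request.

```javascript
// Hypothetical helper: run an async task, retrying on failure with
// exponential backoff (baseDelayMs, 2x, 4x, ...). The sleep function is
// injectable so the delay can be replaced in tests.
async function withRetry(task, options = {}) {
  const {
    retries = 3,
    baseDelayMs = 500,
    sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms)),
  } = options;

  let lastError;
  for (let attempt = 0; attempt <= retries; attempt++) {
    try {
      return await task(attempt);
    } catch (error) {
      lastError = error;
      // Wait before the next attempt, but not after the final one.
      if (attempt < retries) await sleep(baseDelayMs * 2 ** attempt);
    }
  }
  throw lastError;
}

// Demo: a stand-in task that fails twice and then succeeds.
const flakyTask = async (attempt) => {
  if (attempt < 2) throw new Error(`attempt ${attempt + 1} failed`);
  return `succeeded on attempt ${attempt + 1}`;
};

withRetry(flakyTask, { baseDelayMs: 10 })
  .then((result) => console.log(result)) // succeeded on attempt 3
  .catch((error) => console.error("giving up:", error.message));
```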

Note: Resource type strings are case-sensitive. Use exact matches as shown in the reference table.
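Since the type names are case-sensitive, a client-side check can catch a misspelled or lowercased name before the request is sent. The sketch below uses the names from the reference table; `buildBlock` is a hypothetical helper, and omitting `urls` when no patterns are given is an assumption based on that field being optional.

```javascript
// Resource type names from the reference table (case-sensitive).
const RESOURCE_TYPES = new Set([
  "Document", "Stylesheet", "Image", "Media", "Font", "Script", "TextTrack",
  "XHR", "Fetch", "Prefetch", "EventSource", "WebSocket", "Manifest",
  "SignedExchange", "Ping", "CSPViolationReport", "Preflight", "Other",
]);

// Hypothetical helper: build the "block" object after validating type names.
function buildBlock(resources, urls = []) {
  const invalid = resources.filter((name) => !RESOURCE_TYPES.has(name));
  if (invalid.length > 0) {
    throw new Error(`Unknown resource type(s): ${invalid.join(", ")} (names are case-sensitive)`);
  }
  const block = { resources };
  if (urls.length > 0) block.urls = urls; // URL patterns are optional
  return block;
}

console.log(JSON.stringify(buildBlock(["Image", "Font"], ["*/ads/*"])));

try {
  buildBlock(["image"]); // lowercase name is rejected
} catch (error) {
  console.error(error.message);
}
```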