Timeout Policy

The Universal Scraping API uses a two-tiered timeout policy to keep request execution bounded, maintain system stability, and manage resources efficiently. The two timeouts apply independently, so the API stays robust in complex network environments and dynamic page parsing scenarios while avoiding failures caused by resource exhaustion or long waits.

1. Global Execution Timeout

Definition: The global execution timeout limits the cumulative execution time of all instructions in an API request.

Timeout Threshold: 180 seconds

Scope:

  • All wait_xxx operations (such as wait_for_selector or wait_for_event) in the js_instructions instruction set.
  • The threshold also covers time spent waiting while an instruction executes, so long-running tasks cannot occupy system resources indefinitely.

Timeout Behavior:

  • When the cumulative execution time reaches 180 seconds, the system forcibly terminates the entire API request and returns a timeout error response.
  • This policy caps the runtime of every request, preventing resource abuse caused by complex instructions or misconfiguration.
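
The cumulative behavior described above can be sketched as a single time budget shared by every instruction. This is a minimal simulation, not the API's actual implementation: the names run_instructions and GlobalTimeoutError are hypothetical, and each instruction is represented only by the number of seconds it would need.

```python
GLOBAL_TIMEOUT_S = 180.0  # threshold stated in this section


class GlobalTimeoutError(Exception):
    """Hypothetical error raised when the shared budget is used up."""


def run_instructions(durations_s, limit_s=GLOBAL_TIMEOUT_S):
    """Simulate wait-style instructions drawing from one cumulative budget.

    durations_s: assumed time each instruction would need, in seconds.
    Returns the total elapsed time if everything finishes under the limit.
    """
    elapsed = 0.0
    for index, needed in enumerate(durations_s):
        remaining = limit_s - elapsed
        # The request ends as soon as cumulative time would reach the limit.
        if needed >= remaining:
            raise GlobalTimeoutError(
                f"instruction {index} hit the {limit_s:.0f}s global limit"
            )
        elapsed += needed
    return elapsed
```

With these assumptions, run_instructions([60, 60, 30]) completes with 150 seconds used, while run_instructions([100, 100]) raises on the second instruction because the two waits together exceed the 180-second budget. The point of the sketch is that no individual wait gets its own 180 seconds; each one only has whatever the earlier instructions left over.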

2. Page Load Timeout

Definition: The page load timeout limits the time spent on browser initialization and page resource loading.

Timeout Threshold: 30 seconds (fixed value)

Scope:

  • The initialization process of the browser instance (such as Puppeteer or other browser drivers).
  • Page resource loading, including HTML, CSS, JavaScript, and other network resources.

Timeout Behavior:

  • If the URL cannot be accessed or page resource loading takes longer than 30 seconds, the system immediately returns an error response without waiting for the global timeout.
  • This policy aims to quickly identify inaccessible target pages and avoid long waits for invalid resources.
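
A fixed page load limit of this kind is commonly enforced by wrapping the load step in a deadline. The sketch below illustrates the idea with Python's asyncio.wait_for; it is an assumption about how such a limit could work, not a description of the API's internals. The names load_with_timeout, PageLoadTimeoutError, and the load_page callable are hypothetical.

```python
import asyncio

PAGE_LOAD_TIMEOUT_S = 30.0  # fixed threshold stated in this section


class PageLoadTimeoutError(Exception):
    """Hypothetical error for a page that did not load in time."""


async def load_with_timeout(load_page, limit_s=PAGE_LOAD_TIMEOUT_S):
    """Await a page load coroutine, failing fast once the limit passes.

    load_page: hypothetical zero-argument coroutine function standing in
    for browser startup plus resource loading.
    """
    try:
        return await asyncio.wait_for(load_page(), timeout=limit_s)
    except asyncio.TimeoutError:
        # Report the failure right away instead of waiting any longer.
        raise PageLoadTimeoutError(
            f"page did not load within {limit_s}s"
        ) from None
```

A caller would run it as asyncio.run(load_with_timeout(some_loader)). A load that finishes in time returns its result unchanged; a load that overruns is cancelled and surfaces as a single error, which mirrors the fail-fast behavior described above.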

3. Timeout Priority Rules

  • The page load timeout takes priority: it can interrupt a request before the global timeout is reached.
  • When a timeout occurs during the page load phase, the system immediately terminates the request and never enters the subsequent instruction execution phase.
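
The priority rules can be expressed as a small decision function. This is an illustrative sketch only: resolve_outcome and its return labels are hypothetical, and it assumes the 180-second budget counts instruction time only, which this page does not state explicitly.

```python
PAGE_LOAD_TIMEOUT_S = 30.0
GLOBAL_TIMEOUT_S = 180.0


def resolve_outcome(page_load_s, instruction_times_s):
    """Return which outcome applies to a request under the two-tier policy.

    page_load_s: assumed seconds spent on browser startup and page loading.
    instruction_times_s: assumed seconds needed by each instruction.
    """
    # Page load is checked first; a failure here ends the request
    # before any instruction runs.
    if page_load_s > PAGE_LOAD_TIMEOUT_S:
        return "page_load_timeout"
    # Assumption: only instruction time counts toward the global budget.
    if sum(instruction_times_s) >= GLOBAL_TIMEOUT_S:
        return "global_timeout"
    return "ok"
```

For example, resolve_outcome(45, [200]) returns "page_load_timeout" even though the instructions alone would also exceed the global limit, because the request never reaches the instruction phase. resolve_outcome(5, [100, 90]) returns "global_timeout", and resolve_outcome(5, [10, 20]) returns "ok".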