Timeout Policy
The Universal Scraping API employs a two-tiered timeout policy designed to keep request execution controllable, the system stable, and resource use efficient. The two independent timeouts let the API perform robustly in complex network environments and dynamic page parsing scenarios, while avoiding system failures caused by resource exhaustion or long waits.
1. Global Execution Timeout
Definition: The global execution timeout is a policy that limits the cumulative execution time of all instructions in an API request.
Timeout Threshold: 180 seconds
Scope:
- All wait_xxx series operations (such as wait_for_selector or wait_for_event) in the js_instructions instruction set.
- This threshold covers the potential waiting time during instruction execution, ensuring that long-running tasks do not indefinitely occupy system resources.
Timeout Behavior:
- When the cumulative execution time reaches 180 seconds, the system will forcibly terminate the entire API request process and return a timeout error response.
- This policy caps the runtime of every API request, preventing resource abuse caused by complex instructions or misconfiguration.
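Because the 180-second limit applies to the cumulative time of all wait_xxx operations, it can help to estimate the worst-case wait budget of an instruction list before sending it. The following is a minimal client-side sketch, not part of the API. The instruction shape (a list of single-key objects whose wait_xxx parameters carry a "timeout" in milliseconds) and the helper names are assumptions made for illustration; check the js_instructions reference for the actual schema.

```python
GLOBAL_EXECUTION_TIMEOUT_S = 180  # threshold stated in this section


def worst_case_wait_seconds(js_instructions):
    """Sum the declared timeouts of all wait_xxx instructions.

    Assumes a hypothetical schema: each instruction is a dict with one key,
    and wait_xxx instructions map to a dict with a "timeout" in milliseconds.
    """
    total_ms = 0
    for instruction in js_instructions:
        for name, params in instruction.items():
            if name.startswith("wait_") and isinstance(params, dict):
                total_ms += params.get("timeout", 0)
    return total_ms / 1000


def fits_global_budget(js_instructions):
    """True if the worst-case wait time stays below the global limit."""
    return worst_case_wait_seconds(js_instructions) < GLOBAL_EXECUTION_TIMEOUT_S


# Hypothetical instruction list: two waits of 60 s each, plus a click.
example = [
    {"wait_for_selector": {"selector": "#results", "timeout": 60000}},
    {"click": "#next"},
    {"wait_for_selector": {"selector": "#page-2", "timeout": 60000}},
]

print(worst_case_wait_seconds(example))  # 120.0
print(fits_global_budget(example))       # True
```

A list whose declared waits add up to 180 seconds or more could hit the global timeout in the worst case, so it is a candidate for shorter waits or for splitting across several requests.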
2. Page Load Timeout
Definition: The page load timeout limits the time allowed for the browser initialization and page resource loading phases.
Timeout Threshold: 30 seconds (fixed value)
Scope:
- The initialization process of the browser instance (such as Puppeteer or other browser drivers).
- Page resource loading, including HTML, CSS, JavaScript, and other network resources.
Timeout Behavior:
- If the URL cannot be accessed or page resource loading takes longer than 30 seconds, the system will immediately return an error response without waiting for the global timeout.
- This policy aims to quickly identify inaccessible target pages and avoid long waits for invalid resources.
3. Timeout Priority Rules
- The page load timeout has higher priority and can interrupt request execution before the global timeout.
- When a timeout occurs during the page load phase, the system will immediately terminate the request process without entering the subsequent instruction execution phase.
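The priority rules above can be sketched as a small decision function. This is only an illustration of the ordering, not the service's implementation. The outcome labels are hypothetical and are not the API's actual error codes, and the sketch assumes that the global budget is measured over instruction execution time only.

```python
PAGE_LOAD_TIMEOUT_S = 30          # fixed threshold from this section
GLOBAL_EXECUTION_TIMEOUT_S = 180  # threshold from this section


def resolve_outcome(page_load_s, instruction_s, url_reachable=True):
    """Return the outcome the priority rules imply for one request.

    page_load_s:   time spent on browser initialization and page loading
    instruction_s: cumulative time the instructions would need
    """
    # The page load phase is checked first: a failure here ends the request
    # before any instruction is executed.
    if not url_reachable or page_load_s > PAGE_LOAD_TIMEOUT_S:
        return "page_load_timeout"
    # Assumption: only instruction execution counts toward the global limit.
    if instruction_s >= GLOBAL_EXECUTION_TIMEOUT_S:
        return "global_timeout"
    return "ok"


# A slow page fails in the load phase even if the instructions would also
# have exceeded the global limit.
print(resolve_outcome(page_load_s=45, instruction_s=500))  # page_load_timeout
print(resolve_outcome(page_load_s=5, instruction_s=200))   # global_timeout
print(resolve_outcome(page_load_s=5, instruction_s=20))    # ok
```

In practice this means a request that fails during page load returns after roughly 30 seconds at most, while a request that loads successfully can run until the global limit is reached.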