Scraping Shopee with the Scraping API

Shopee website data scraping

Parameters

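| Parameter | Type | Value | Description |
| --- | --- | --- | --- |
| actor | string | scraper.shopee | Fixed value. |
| input.action | string | shopee.product | Three actions are supported: (1) shopee.product returns product detail data; (2) shopee.search returns keyword search data; (3) shopee.live returns live-stream data. |
| input.url | string | URL | Four kinds of URL are supported: (1) the URL of a product detail page; (2) the product detail API URL (/api/v4/pdp/get_pc); (3) the product search API URL (/api/v4/search/search_items); (4) the live-stream API URL (/api/v1/session/{sessionId}/more_items). |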
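The parameters above combine into one task payload. As a minimal sketch, the helper below assembles that payload and rejects actions that are not in the table. The name `build_task` is hypothetical and is not part of the Scrapeless SDK; only the `actor`, `input.action`, and `input.url` fields come from this page.

```python
# Hypothetical helper for illustration; not part of the Scrapeless SDK.
SUPPORTED_ACTIONS = {"shopee.product", "shopee.search", "shopee.live"}


def build_task(action: str, url: str) -> dict:
    """Return a payload with the actor/input shape described in the table."""
    if action not in SUPPORTED_ACTIONS:
        raise ValueError(f"unsupported action: {action!r}")
    if not url:
        raise ValueError("url must not be empty")
    # "scraper.shopee" is the fixed actor value for Shopee tasks.
    return {"actor": "scraper.shopee", "input": {"action": action, "url": url}}


if __name__ == "__main__":
    print(build_task("shopee.product", "https://shopee.tw/2312312.10228173.24803858474"))
```

Validating the action locally catches a mistyped value before a request is sent.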

Product detail data

import { Scrapeless } from '@scrapeless-ai/sdk';
 
const client = new Scrapeless({
  apiKey: 'YOUR_API_KEY'
});
 
client.scraping.createTask({
  actor: "scraper.shopee",
  input: {
    action: "shopee.product",
    url: "https://shopee.tw/2312312.10228173.24803858474"
  }
})
  .then((result) => {
    console.log(result);
  })
  .catch((error) => {
    console.error('Error:', error);
  });

Product search data

import { Scrapeless } from '@scrapeless-ai/sdk';
 
const client = new Scrapeless({
  apiKey: 'YOUR_API_KEY'
});
 
client.scraping.createTask({
  actor: "scraper.shopee",
  input: {
    action: "shopee.search",
    url: "https://shopee.co.th/api/v4/search/search_items?by=sales&keyword=baby%20pants&limit=30&newest=0&order=desc&page_type=search"
  }
})
  .then((result) => {
    console.log(result);
  })
  .catch((error) => {
    console.error('Error:', error);
  });

Live-stream data

import { Scrapeless } from '@scrapeless-ai/sdk';
 
const client = new Scrapeless({
  apiKey: 'YOUR_API_KEY'
});
 
// Fill in the ID of the live session before running; an empty value produces an invalid URL.
const sessionId = "";
client.scraping.createTask({
  actor: "scraper.shopee",
  input: {
    action: "shopee.live",
    url: `https://live.shopee.co.th/api/v1/session/${sessionId}/more_items?offset=0&limit=10`
  }
})
  .then((result) => {
    console.log(result);
  })
  .catch((error) => {
    console.error('Error:', error);
  });
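The live-stream URL is built from a session ID plus `offset` and `limit` query parameters. A small sketch of that construction in Python follows. The function name `live_items_url` is hypothetical, the host `live.shopee.co.th` is simply the one used in the example above, and the check for an empty session ID is an added safeguard rather than documented behavior.

```python
from urllib.parse import quote, urlencode


def live_items_url(session_id: str, offset: int = 0, limit: int = 10,
                   host: str = "live.shopee.co.th") -> str:
    """Build the more_items URL for one live session (hypothetical helper)."""
    if not session_id:
        raise ValueError("session_id is required")
    # Percent-encode the session ID so it is safe as a single path segment.
    path = f"/api/v1/session/{quote(str(session_id), safe='')}/more_items"
    query = urlencode({"offset": offset, "limit": limit})
    return f"https://{host}{path}?{query}"


if __name__ == "__main__":
    print(live_items_url("123456"))
```

The resulting string can be passed as `input.url` with the `shopee.live` action.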

How to build the API URLs

Product detail API

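# The API URL is built from three parts:
# 1. region
# 2. item_id
# 3. shop_id

# Supported regions:
# ["shopee.co.id", "shopee.vn", "shopee.co.th", "shopee.ph", "shopee.com.my", "shopee.sg", "shopee.com.co", "shopee.cl", "shopee.com.mx", "shopee.com.br", "shopee.tw"]

region = "shopee.tw"  # any entry from the list above
item_id = ""          # fill in the product's item ID
shop_id = ""          # fill in the seller's shop ID

url = f"https://{region}/api/v4/pdp/get_pc?item_id={item_id}&shop_id={shop_id}"
print(url)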
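The same construction can be wrapped in a function that checks the region against the supported list before building the URL. This is only a sketch: `product_api_url` is a hypothetical name, and the region list is copied from the comment above.

```python
from urllib.parse import urlencode

# Region list copied from this page.
SUPPORTED_REGIONS = [
    "shopee.co.id", "shopee.vn", "shopee.co.th", "shopee.ph",
    "shopee.com.my", "shopee.sg", "shopee.com.co", "shopee.cl",
    "shopee.com.mx", "shopee.com.br", "shopee.tw",
]


def product_api_url(region: str, item_id, shop_id) -> str:
    """Build the product detail API URL (hypothetical helper)."""
    if region not in SUPPORTED_REGIONS:
        raise ValueError(f"unsupported region: {region!r}")
    query = urlencode({"item_id": item_id, "shop_id": shop_id})
    return f"https://{region}/api/v4/pdp/get_pc?{query}"


if __name__ == "__main__":
    print(product_api_url("shopee.tw", 24803858474, 10228173))
```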

Product search API

from urllib.parse import quote

limit = 20  # allowed values: 10, 20, 30, 40
order = "desc"
page_type = "search"
keyword = "keyword"   # change this to your search term
region = "shopee.co.id"

# Supported regions:
# ["shopee.co.id", "shopee.vn", "shopee.co.th", "shopee.ph", "shopee.com.my", "shopee.sg", "shopee.com.co", "shopee.cl", "shopee.com.mx", "shopee.com.br", "shopee.tw"]

# URL-encode the keyword so spaces and non-ASCII characters are safe in the query string.
url = f"https://{region}/api/v4/search/search_items?limit={limit}&newest=0&by=sales&keyword={quote(keyword)}&order={order}&page_type={page_type}&scenario=PAGE_OTHERS&version=2"
print(url)
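A function form of the search URL makes the allowed `limit` values explicit and handles keyword encoding in one place. The sketch below is illustrative: `search_api_url` is a hypothetical name, and `newest` is exposed as a parameter only because it appears in the URL above; this page does not describe what it means, so it is left at 0 by default.

```python
from urllib.parse import quote, urlencode

ALLOWED_LIMITS = (10, 20, 30, 40)  # values listed in the snippet above


def search_api_url(region: str, keyword: str, limit: int = 20,
                   newest: int = 0, order: str = "desc") -> str:
    """Build the product search API URL (hypothetical helper)."""
    if limit not in ALLOWED_LIMITS:
        raise ValueError(f"limit must be one of {ALLOWED_LIMITS}")
    params = {
        "limit": limit,
        "newest": newest,  # meaning not documented on this page
        "by": "sales",
        "keyword": keyword,
        "order": order,
        "page_type": "search",
        "scenario": "PAGE_OTHERS",
        "version": 2,
    }
    # quote_via=quote encodes spaces as %20, matching the example URL on this page.
    return f"https://{region}/api/v4/search/search_items?" + urlencode(params, quote_via=quote)


if __name__ == "__main__":
    print(search_api_url("shopee.co.th", "baby pants", limit=30))
```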

Retrieving task results by task ID

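import requests

API_KEY = ""   # your Scrapeless API key
host = "api.scrapeless.com"
task_id = ""   # ID of the task whose result you want

url = f"https://{host}/api/v1/scraper/result/{task_id}"

headers = {
    "x-api-token": API_KEY
}

# A timeout keeps the call from hanging indefinitely on network problems.
response = requests.get(url, headers=headers, timeout=30)

print(response.text)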
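If a result is not available on the first request, the lookup may need to be repeated. The sketch below shows one possible retry loop. It rests on an assumption that this page does not confirm: that the result endpoint answers with HTTP 200 once the result is ready and with some other status before that. The function `wait_for_result` and its `fetch` callback are hypothetical names; the callback keeps the loop independent of any particular HTTP client.

```python
import time
from typing import Callable, Optional, Tuple


def wait_for_result(fetch: Callable[[], Tuple[int, str]],
                    attempts: int = 5, delay: float = 2.0) -> Optional[str]:
    """Call fetch() until it reports status 200 or the attempts run out.

    ASSUMPTION: status 200 means the result is ready. This convention is
    not documented on this page and should be checked against the API.
    """
    for attempt in range(attempts):
        status, body = fetch()
        if status == 200:
            return body
        if attempt < attempts - 1:
            time.sleep(delay)  # wait before the next attempt
    return None


if __name__ == "__main__":
    # Stand-in for a real request: two non-200 responses, then a ready result.
    fake = iter([(202, ""), (202, ""), (200, '{"ok": true}')])
    print(wait_for_result(lambda: next(fake), delay=0))
```

With the `requests` snippet above, `fetch` could be a small function that performs the GET and returns `(response.status_code, response.text)`.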