Scrape API User Guide
This guide is intended for developers who want to integrate the scraped data and connect to the API quickly.
API Name
Amazon Page Scraping API
API Description
This API is used to scrape any page from the Amazon front end and supports scraping with a specified postal code to obtain page data consistent with what Amazon presents to consumers. The API returns data asynchronously, so developers need to deploy a simple HTTP service to receive the data. We will push the scraping results to you via an HTTP request. At the end of this document, you will find the code for a Java Spring Boot-based receiving service for reference.
Request URL
http://scrape.pangolinfo.com/api/task/receive/v1
Request Method
POST
Parameters
Query Parameters
Parameter Name | Parameter Type | Description |
---|---|---|
token | String | User authentication token. Contact the administrator to obtain one. |
Request Body
{
  "url": "https://www.amazon.com/s?k=baby", // Required. The Amazon page URL to scrape
  "callbackUrl": "http://xxx/xxx", // Required. The developer's service address for receiving data (the page data is pushed to this address after a successful scrape)
  "proxySession": "0502f0d18e034e72bd14b026a3964f54", // 32-character UUID that pins scraping to a particular IP. The IP is kept for the day and expires after midnight
  "callbackHeaders": "k1:v1|k2:v2", // Optional. Data to include in the request headers of the callback. Make sure each value is correctly encoded
  "bizContext": { // Optional
    "zipcode": "90001" // Optional. Amazon postal code; this example is the postal code for Los Angeles, USA
  }
}
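The field rules above can be sketched as a small Python helper that assembles the request body. This is a minimal sketch, not part of the API: the function names `build_payload` and `new_proxy_session` are hypothetical, and because the guide does not say how header values must be encoded, the helper joins the pairs in the documented `k1:v1|k2:v2` form without encoding them.

```python
import uuid


def new_proxy_session() -> str:
    """Return a 32-character hex string, the same shape as the example proxySession."""
    return uuid.uuid4().hex


def build_payload(url, callback_url, zipcode=None, proxy_session=None, callback_headers=None):
    """Assemble the request body; only url and callbackUrl are mandatory."""
    if not url or not callback_url:
        raise ValueError("url and callbackUrl are required")
    payload = {"url": url, "callbackUrl": callback_url}
    if proxy_session is not None:
        if len(proxy_session) != 32:
            raise ValueError("proxySession must be 32 characters long")
        payload["proxySession"] = proxy_session
    if callback_headers:
        # Join header pairs in the documented "k1:v1|k2:v2" form (no encoding applied here).
        payload["callbackHeaders"] = "|".join(f"{k}:{v}" for k, v in callback_headers.items())
    if zipcode is not None:
        payload["bizContext"] = {"zipcode": zipcode}
    return payload
```

Calling `build_payload("https://www.amazon.com/s?k=baby", "http://xxx/xxx", zipcode="90001")` yields a dictionary that can be serialized with `json.dumps` and sent as the POST body. Header values that contain `:` or `|` would need encoding before being passed in.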
Note: The following postal codes are currently supported:
United States:
"10041", "90001", "60601", "84104"
Germany:
"80331", "10115", "20095", "60306"
United Kingdom:
"W1S 3AS", "EH15 1LR", "M13 9PL", "M2 5BQ"
Japan:
"100-0004", "060-8588", "163-8001", "900-8570"
France:
"75000", "69001", "06000", "13000"
Italy:
"20019", "50121", "00042", "30100"
Spain:
"41001", "28001", "08001", "46001"
Canada:
"M4C 4Y4", "V6E 1N2", "H3G 2K8", "T2R 0G5"
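A client can check a postal code against this list before submitting a task. The following is a minimal sketch: the `SUPPORTED_ZIPCODES` table and the `is_supported_zipcode` helper are illustrative names, not part of the API, and the table only mirrors the list as published here.

```python
# Supported postal codes, copied from the list above.
# Codes are kept as strings so leading zeros ("00042", "06000") survive.
SUPPORTED_ZIPCODES = {
    "United States": {"10041", "90001", "60601", "84104"},
    "Germany": {"80331", "10115", "20095", "60306"},
    "United Kingdom": {"W1S 3AS", "EH15 1LR", "M13 9PL", "M2 5BQ"},
    "Japan": {"100-0004", "060-8588", "163-8001", "900-8570"},
    "France": {"75000", "69001", "06000", "13000"},
    "Italy": {"20019", "50121", "00042", "30100"},
    "Spain": {"41001", "28001", "08001", "46001"},
    "Canada": {"M4C 4Y4", "V6E 1N2", "H3G 2K8", "T2R 0G5"},
}


def is_supported_zipcode(zipcode, country=None):
    """Return True if the code is listed, optionally restricted to one country."""
    if country is not None:
        return zipcode in SUPPORTED_ZIPCODES.get(country, set())
    return any(zipcode in codes for codes in SUPPORTED_ZIPCODES.values())
```

Checking locally avoids spending a request on a postal code the service does not list.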
Response Parameters:
{
  "code": 0, // System status code
  "message": "ok",
  "data": {
    "data": "57b049c3fdf24e309043f28139b44d05", // Returns the spider task ID; this ID along with the page data will be pushed to the receiving service upon successful scraping
    "bizCode": 0, // Business status code
    "bizMsg": "ok" // Business status message
  }
}
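Since the task ID is what later ties a callback to the original request, it is worth extracting it as soon as the response arrives. The sketch below assumes that `0` means success for both `code` and `bizCode`, as the sample response suggests; the function name `extract_task_id` is hypothetical.

```python
import json


def extract_task_id(response_text):
    """Parse the submit response and return the spider task ID."""
    body = json.loads(response_text)
    # Assumption: a system status code of 0 signals success.
    if body.get("code") != 0:
        raise RuntimeError(f"system error {body.get('code')}: {body.get('message')}")
    data = body.get("data") or {}
    # Assumption: a business status code of 0 signals success.
    if data.get("bizCode") != 0:
        raise RuntimeError(f"business error {data.get('bizCode')}: {data.get('bizMsg')}")
    return data["data"]
```

The returned ID can be stored next to the submitted URL so that the receiving service can match pushed page data to its request.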
Error Codes
1001
- Meaning: Parameter is empty / Parameter is incorrect
- Solution: Check if the request parameters are correct
1004
- Meaning: Access denied / Token is incorrect / Exceeded trial limit
- Solution: Please check the Token
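The two documented error codes can be turned into readable messages on the client side. This is a minimal sketch: the `ERROR_HINTS` table and `describe_error` helper are illustrative names, and the guide does not state which response field carries these codes, so the helper takes the numeric code alone.

```python
# Meanings and solutions copied from the error code list above.
ERROR_HINTS = {
    1001: ("Parameter is empty or incorrect", "Check that the request parameters are correct"),
    1004: ("Access denied, incorrect token, or trial limit exceeded", "Check the token"),
}


def describe_error(code):
    """Return a one-line description for a status code."""
    if code == 0:
        return "ok"
    # Codes not listed in this guide fall back to a generic hint.
    meaning, solution = ERROR_HINTS.get(code, ("Unknown error code", "Contact support"))
    return f"{code}: {meaning}. {solution}."
```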
Example Request
Curl Example
# Request
curl --location 'http://scrape.pangolinfo.com/api/task/receive/v1?token=xxx' \
  --header 'Content-Type: application/json' \
  --data '{
    "url": "https://www.amazon.com/s?k=baby",
    "callbackUrl": "http://***.***.***.***/callback/data",
    "bizContext": {
      "zipcode": "90001"
    }
  }'
# Response
{
  "code": 0, // System status code
  "message": "ok",
  "data": {
    "data": "57b049c3fdf24e309043f28139b44d05", // Returns the spider task ID; this ID along with the page data will be pushed to the receiving service upon successful scraping
    "bizCode": 0, // Business status code
    "bizMsg": "ok" // Business status message
  }
}
Java – OkHttp Example
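// Request
import java.io.IOException;
import okhttp3.*;

public class ScrapeRequestExample {
    public static void main(String[] args) throws IOException {
        OkHttpClient client = new OkHttpClient.Builder().build();
        MediaType mediaType = MediaType.parse("application/json");
        // JSON request body: target URL, callback address, and optional postal code
        RequestBody body = RequestBody.create(mediaType, "{\"url\":\"https://www.amazon.com/s?k=baby\",\"callbackUrl\":\"http://***.***.***.***/callback/data\",\"bizContext\":{\"zipcode\":\"90001\"}}");
        Request request = new Request.Builder()
                .url("http://scrape.pangolinfo.com/api/task/receive/v1?token=xxx")
                .method("POST", body)
                .addHeader("Content-Type", "application/json")
                .build();
        // try-with-resources closes the response once the body has been read
        try (Response response = client.newCall(request).execute()) {
            System.out.println(response.body().string());
        }
    }
}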
// Response
{
  "code": 0, // System status code
  "message": "ok",
  "data": {
    "data": "57b049c3fdf24e309043f28139b44d05", // Returns the spider task ID; this ID along with the page data will be pushed to the receiving service upon successful scraping
    "bizCode": 0, // Business status code
    "bizMsg": "ok" // Business status message
  }
}
Python – Requests Example
# Request
import requests
import json

url = "http://scrape.pangolinfo.com/api/task/receive/v1?token=xxx"
payload = json.dumps({
    "url": "https://www.amazon.com/s?k=baby",
    "callbackUrl": "http://***.***.***.***/callback/data",
    "bizContext": {
        "zipcode": "90001"
    }
})
headers = {
    'Content-Type': 'application/json'
}
response = requests.request("POST", url, headers=headers, data=payload)
print(response.text)
# Response
{
  "code": 0, // System status code
  "message": "ok",
  "data": {
    "data": "57b049c3fdf24e309043f28139b44d05", // Returns the spider task ID; this ID along with the page data will be pushed to the receiving service upon successful scraping
    "bizCode": 0, // Business status code
    "bizMsg": "ok" // Business status message
  }
}
Receiving Service Example
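As an illustration of what a receiving service has to do, here is a minimal sketch in Python using only the standard library. It is a stand-in written for this guide, not the Java Spring Boot reference service. It assumes the results arrive as an HTTP POST to the `callbackUrl`. The layout of the pushed body is not specified in this guide, so the handler stores the raw bytes rather than parsing particular fields. The names `CallbackHandler`, `received`, and `start_receiver` are hypothetical.

```python
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

received = []  # raw callbacks, in arrival order


class CallbackHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        # Read exactly the number of bytes announced by the sender.
        length = int(self.headers.get("Content-Length", 0))
        body = self.rfile.read(length)
        # Keep path, headers (including any callbackHeaders), and the raw body.
        received.append({"path": self.path, "headers": dict(self.headers), "body": body})
        self.send_response(200)
        self.send_header("Content-Type", "text/plain")
        self.end_headers()
        self.wfile.write(b"ok")

    def log_message(self, format, *args):
        pass  # silence the default per-request logging


def start_receiver(host="127.0.0.1", port=0):
    """Start the receiver in a background thread; port 0 picks a free port."""
    server = HTTPServer((host, port), CallbackHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server
```

A deployment would call `start_receiver("0.0.0.0", 8080)` or similar on a publicly reachable host and pass that address as `callbackUrl`. The host and port here are placeholders. Calling `server.shutdown()` stops the receiver.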
Need help?
We are devoted to your success, so don't hesitate to contact us with any questions.
Our team of experts is committed to helping you troubleshoot and fix any issue you might experience with our products.
To file a bug report or request technical assistance, email our support team or consult the technical documentation.