Access the Data Extracted in the Cloud via API
Monday, March 7, 2022
Cloud Extraction is one of Octoparse's outstanding features.
Octoparse provides multiple cloud servers that extract web data simultaneously. You can access the data extracted in the cloud via API.
(Note: Here is a video about Cloud Extraction. Cloud Extraction is not available in the Free Edition.)
You can access the extracted data and download it via a URL.
To get this URL, click the “Create an API” option, as shown below.
The URL is then displayed in a pop-up window.
Click your rule file and choose the “Start Cloud Extract” option.
The cloud management system immediately reads the rule and assigns cloud servers to extract data according to it.
During the process, you can see the data extraction progress clearly.
If you are using the Standard Edition, you get 4 cloud servers that extract data at the same time.
If you are using the Professional Edition, you get 10 cloud servers.
(If you want more cloud servers, please feel free to contact our support team: firstname.lastname@example.org)
During data extraction, you can choose the “View Data” option to check the data extracted so far.
And of course, you can also access them via API as described above.
An API URL includes three parameters: KEY, BeginTime, and EndTime.
The API key is the unique identifier of each task and is used to verify identity and authorization. (Don’t leak or modify the key.)
You set BeginTime and EndTime yourself. Both times must be given in the UTC-05:00 time zone.
The API returns the data in XML format.
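Because the data comes back as XML, it can be read with any standard XML parser. The sketch below is only an illustration: the element names (Rows, Row, Title, Price) are hypothetical, since the real ones depend on the fields defined in your task, and the helper parse_rows is not part of Octoparse.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample response. The real element names depend on the
# fields defined in your extraction task.
SAMPLE_XML = """<Rows>
  <Row><Title>First item</Title><Price>9.99</Price></Row>
  <Row><Title>Second item</Title><Price>19.50</Price></Row>
</Rows>"""


def parse_rows(xml_text):
    """Turn each child of the root element into a {tag: text} dict."""
    root = ET.fromstring(xml_text)
    rows = []
    for record in root:
        # Each sub-element of a record is treated as one extracted field.
        rows.append({field.tag: (field.text or "").strip() for field in record})
    return rows


rows = parse_rows(SAMPLE_XML)
for row in rows:
    print(row)
```

Treating every child of the root as one record keeps the helper independent of the actual tag names, so it should need little change once you see the real response.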
Before you use the API URL, you need to manually replace the content of the two brackets (BeginTime and EndTime).
BeginTime: the starting datetime of the extraction. Format: yyyy-mm-dd hh:mm:ss
EndTime: the ending datetime of the extraction. Format: mm-dd-yyyy hh:mm:ss
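Filling in the two time values by hand is error-prone, so here is a minimal sketch of how they could be generated. It rests on several assumptions: API_BASE_URL is a placeholder for the URL shown in the “Create an API” pop-up, the three parameters are assumed to be passed as a query string, and build_api_url is a hypothetical helper. The two format constants simply follow the formats listed above and can be changed if your URL expects something else.

```python
from datetime import datetime, timedelta, timezone
from urllib.parse import urlencode

# Assumption: placeholder for the URL shown in the "Create an API" pop-up.
API_BASE_URL = "https://example.invalid/api/data"

# The article states that both times use UTC-05:00.
API_TZ = timezone(timedelta(hours=-5))

# Formats as listed in the article for each parameter.
BEGIN_FORMAT = "%Y-%m-%d %H:%M:%S"  # yyyy-mm-dd hh:mm:ss
END_FORMAT = "%m-%d-%Y %H:%M:%S"    # mm-dd-yyyy hh:mm:ss


def build_api_url(key, begin, end, base_url=API_BASE_URL):
    """Build an API URL with KEY, BeginTime and EndTime filled in.

    begin and end must be timezone-aware datetimes; they are converted
    to UTC-05:00 before formatting.
    """
    if begin.tzinfo is None or end.tzinfo is None:
        raise ValueError("begin and end must be timezone-aware")
    if end < begin:
        raise ValueError("end must not be earlier than begin")
    params = {
        "KEY": key,
        "BeginTime": begin.astimezone(API_TZ).strftime(BEGIN_FORMAT),
        "EndTime": end.astimezone(API_TZ).strftime(END_FORMAT),
    }
    # urlencode escapes the spaces and colons in the time values.
    return f"{base_url}?{urlencode(params)}"


example_url = build_api_url(
    "YOUR_API_KEY",
    datetime(2022, 3, 7, 12, 0, 0, tzinfo=timezone.utc),
    datetime(2022, 3, 8, 0, 30, 0, tzinfo=timezone.utc),
)
print(example_url)
```

Requiring timezone-aware datetimes avoids silently mixing local time with UTC-05:00, which is the easiest mistake to make with these two parameters.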
Author: The Octoparse Team