Scheduled Data Extraction - Octoparse Cloud Web Scraping Service
Monday, November 28, 2016 3:44 AM
Octoparse Cloud Service provides options for scheduling data extraction for a task. You can set up a schedule so that your scraping tasks run on the Octoparse cloud platform at the times you choose.
Regardless of your location, the Octoparse server clock is set to UTC-4, four hours behind Coordinated Universal Time (UTC). The current version of Octoparse does not support changing the time zone, so you need to adjust the schedule time accordingly when you set up a schedule.
(For more information about UTC, please check out the website: http://en.wikipedia.org/wiki/Coordinated_Universal_Time)
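As a rough illustration of the adjustment described above, the following Python sketch converts a desired local run time into server time. It relies only on the fixed UTC-4 offset stated in this article; the names SERVER_TZ and to_server_time are hypothetical helpers for this example and are not part of Octoparse.

```python
from datetime import datetime, timedelta, timezone

# Assumption taken from this article: the server clock is fixed at UTC-4.
SERVER_TZ = timezone(timedelta(hours=-4))


def to_server_time(local_dt: datetime) -> datetime:
    """Convert a timezone-aware local datetime to server (UTC-4) time."""
    if local_dt.tzinfo is None:
        raise ValueError("local_dt must be timezone-aware")
    return local_dt.astimezone(SERVER_TZ)


# Example: you want the task to run at 09:00 in a UTC+1 location.
local_tz = timezone(timedelta(hours=1))
wanted = datetime(2016, 11, 28, 9, 0, tzinfo=local_tz)
print(to_server_time(wanted).strftime("%Y-%m-%d %H:%M"))  # 2016-11-28 04:00
```

In this example you would enter 04:00 in the schedule dialog. Note that the conversion can also shift the date, so check the day as well as the time.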
The following steps describe how to schedule a cloud scraping task, starting from the point at which you have finished creating a task that is ready to scrape web data.
After you complete configuring your task, select the option “Schedule Cloud Extraction Settings” to begin the scheduling process.
1. Set the Parameters
In the “Schedule Cloud Extraction Settings” dialog box, you can set the Periods of Availability for the extraction and the Run mode, which determines how often the task runs to collect data.
· Periods of Availability - The period during which data extraction may run, defined by a Start date and an End date.
· Run Mode - Once, Weekly, Monthly, or Real Time
There are four Run modes for setting the schedule:
· Run Mode - Once
To run the task at specific times on a selected day: Select the day within the period of availability, and then select the times of day from the lists. (Octoparse uses 00:00 by default if you skip this part.)
· Run Mode - Weekly
To run the task at specific times on selected days each week: Specify the days of the week, and then select the times of day from the lists. (Octoparse uses 00:00 by default if you skip this part.)
For example, select Monday, Thursday and Friday of each week, with run times of 9:00 and 16:00.
· Run Mode - Monthly
To run the task at specific times on selected days each month: Specify the days of the month, and then select the times of day from the lists. (Octoparse uses 00:00 by default if you skip this part.)
For example, select the 11th, 16th, 17th, 18th and 23rd of each month, with run times of 9:00 and 16:00.
· Run Mode - Real Time
To run the task at a fixed interval, starting now: Specify the interval between runs.
For example, run the task once every 30 minutes. If the current time is 9:00 a.m. when you click the Start button, the task first runs at 9:30 a.m. and then every 30 minutes after that.
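To make the Weekly, Monthly and Real Time examples above concrete, here is a minimal Python sketch that lists the run times each mode would produce. It is only a model of the behavior described in this article, not Octoparse's actual scheduler. The function names are hypothetical, and the sketch assumes that the End date is inclusive and that all times are server times.

```python
from __future__ import annotations

from datetime import date, datetime, time, timedelta


def _runs(start: date, end: date, day_matches, times: list[time]) -> list[datetime]:
    """Collect run times for every matching day in [start, end] (end inclusive: assumption)."""
    # The article says 00:00 is used when no time of day is selected.
    times = sorted(times) or [time(0, 0)]
    runs = []
    day = start
    while day <= end:
        if day_matches(day):
            runs.extend(datetime.combine(day, t) for t in times)
        day += timedelta(days=1)
    return runs


def weekly_runs(start: date, end: date, weekdays: set[int], times: list[time]) -> list[datetime]:
    """Weekly mode. weekdays uses Python's convention: Monday=0 ... Sunday=6."""
    return _runs(start, end, lambda d: d.weekday() in weekdays, times)


def monthly_runs(start: date, end: date, days_of_month: set[int], times: list[time]) -> list[datetime]:
    """Monthly mode. days_of_month holds calendar days such as 11 or 23."""
    return _runs(start, end, lambda d: d.day in days_of_month, times)


def interval_runs(now: datetime, every: timedelta, count: int) -> list[datetime]:
    """Real Time mode: the first run happens one interval after `now`."""
    return [now + every * i for i in range(1, count + 1)]


times = [time(9, 0), time(16, 0)]

# Monday, Thursday and Friday in the week starting Monday, November 28, 2016.
for run in weekly_runs(date(2016, 11, 28), date(2016, 12, 4), {0, 3, 4}, times):
    print("weekly  ", run)

# The 11th, 16th, 17th, 18th and 23rd of December 2016.
for run in monthly_runs(date(2016, 12, 1), date(2016, 12, 31), {11, 16, 17, 18, 23}, times):
    print("monthly ", run)

# Every 30 minutes, starting from 9:00 a.m.: 9:30, 10:00, 10:30.
for run in interval_runs(datetime(2016, 11, 28, 9, 0), timedelta(minutes=30), 3):
    print("interval", run)
```

Listing the expected runs this way can help you check a schedule before you click Start, especially when the period of availability is short.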
2. Manage your schedules
Before starting your schedule, you can save or cancel your settings.
Start your schedule
After you set the parameters for your task's schedule, you can:
1. Click the Start button to start the schedule.
2. Click the OK button, then check the “Cloud: Waiting” list to see whether the task is scheduled.
Disable your schedule
Double-click a scheduled scraping task in the “My Task” list to open it, or find the task in "Cloud: Waiting" or "Cloud: Stopped" under "Task Status" in Octoparse.
All tasks in "Cloud: Waiting", and the tasks in "Cloud: Stopped" that have a "Scheduled Execution Time", are scheduled scraping tasks. Double-click a task in these lists to open it.
After you open the task, go to the last step - Done.
You will see the scheduled status of your task in the “Schedule Cloud Extraction Settings” option.
1. Click this option to edit your schedule.
2. Click the Stop button.
3. Click OK to disable the schedule.
Edit an existing Schedule
To edit an existing schedule, find the task in the “My Task” list and open it. Go to the last step - Done. You will see the scheduled status of your task in the “Schedule Cloud Extraction Settings” option. Click this option to reschedule the task.
Check the status of your scheduled tasks
1. Find the “Task Status” option in the left pane
2. Select the “Cloud: Waiting” option
You will see all the waiting tasks here.