Lesson 5: Get your data
Updated over a week ago

Now that your first scraping task is built and fully tested, you can go ahead and run it to extract some real data.



1. Two ways to get data

There are two ways you can run the task:

  • Run on your device (also known as local extraction/local run)

  • Run in the Cloud (also known as Cloud extraction/Cloud run)

If you run a task on your device, you will need to have the Octoparse App open during the extraction process. There will be an extraction window running on your PC, and you can watch the data getting extracted and wait for it to complete.


On the other hand, when you run a task in the Cloud, it runs on the Octoparse Cloud Platform, so you can close the Octoparse App or even shut down your computer and come back for your data when the job is done. Tasks running in the Cloud generally run 4x to 12x faster than local extractions. Depending on your project requirements, you can choose the plan that works for you.
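To get a feel for what the 4x to 12x figure could mean for a specific task, the arithmetic can be sketched in a few lines of Python. This is only a rough estimate: the function name `estimate_cloud_minutes` is made up for illustration, and it assumes the general speedup range quoted above applies to your task, which will not always be the case.

```python
def estimate_cloud_minutes(local_minutes, min_speedup=4.0, max_speedup=12.0):
    """Return (best_case, worst_case) Cloud run time in minutes.

    Assumes the Cloud run is between min_speedup and max_speedup times
    faster than the local run. The defaults are the general 4x to 12x
    range quoted above, not a guarantee for any particular task.
    """
    if local_minutes < 0:
        raise ValueError("local_minutes must not be negative")
    return local_minutes / max_speedup, local_minutes / min_speedup


# A task that takes 2 hours locally:
best, worst = estimate_cloud_minutes(120)
print(f"Estimated Cloud run time: {best:.0f} to {worst:.0f} minutes")
# prints "Estimated Cloud run time: 10 to 30 minutes"
```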

Note: Some tasks have Standard Mode and Boost Mode options for both local and Cloud runs. To see how the two modes differ, check Standard Mode vs Boost Mode.


2. Start a run

Once you are done building a task, you can click the "Run" button to start a run.


Alternatively, you can find the task on the Dashboard and use the Run/Stop buttons to start or stop it.


3. Check your data

Now that your run is complete, you can go ahead and check your data.

Go to the Dashboard and find your task. Hover over the number of lines scraped and click on it to check the data scraped from the latest run. Click All Data to check the data scraped from all the runs.


You can also check all the data by clicking the ... icon on the Dashboard, selecting View data, and then choosing whether to view Cloud data or Local data.


4. Export your data

If the data looks good, you can export it directly by clicking Export Data in the lower right corner of the Data View tab. Octoparse supports exporting data to Excel, CSV, or HTML files, to a database, or to Google Sheets.

TIPS:

  • Data extracted in Cloud runs can be accessed on any device as long as you log into your account.

  • Cloud data is only saved for 3 months, after which it will be removed from the Cloud space. Please remember to export the data before it gets removed.

  • If the data exceeds 20K lines, it will be exported to multiple data files (20K lines per file).

  • Local data can only be accessed on the device on which the local extraction was executed.

  • Cloud data of one task will be stored together to remove duplicates. If you run the same task a second time, you will probably see duplicates scraped on the second run.

  • Cloud duplicates will be removed automatically.
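If an export is split into several files, the short Python sketch below shows one way to work out how many files to expect and to stitch CSV parts back together afterwards. It is not an Octoparse feature: the helper names and the example file names are hypothetical, and it assumes every part is a CSV file with the same header row. It also drops rows that are exact duplicates while merging, which may be handy when combining data from repeated runs.

```python
import csv

LINES_PER_FILE = 20_000  # per-file limit mentioned in the tips above


def export_file_count(total_lines, lines_per_file=LINES_PER_FILE):
    """Return how many export files total_lines data lines would need."""
    if total_lines <= 0:
        return 0
    # Integer ceiling division: a partly filled last file still counts.
    return (total_lines + lines_per_file - 1) // lines_per_file


def merge_csv_parts(part_paths, merged_path):
    """Merge CSV parts that share a header into one file.

    Rows that are exact duplicates of an earlier row are skipped.
    Returns the number of data rows written (header not counted).
    """
    seen = set()
    header = None
    written = 0
    with open(merged_path, "w", newline="", encoding="utf-8") as out:
        writer = csv.writer(out)
        for path in part_paths:
            # utf-8-sig reads files with or without a byte order mark.
            with open(path, newline="", encoding="utf-8-sig") as part:
                reader = csv.reader(part)
                part_header = next(reader, None)
                if part_header is None:
                    continue  # empty file, nothing to merge
                if header is None:
                    header = part_header
                    writer.writerow(header)
                elif part_header != header:
                    raise ValueError(f"{path} has a different header row")
                for row in reader:
                    key = tuple(row)
                    if key in seen:
                        continue  # exact duplicate of an earlier row
                    seen.add(key)
                    writer.writerow(row)
                    written += 1
    return written


print(export_file_count(45_000))  # prints 3

# Hypothetical usage with made-up file names:
# merge_csv_parts(["task_part1.csv", "task_part2.csv"], "task_all.csv")
```

Duplicates are detected by comparing whole rows, so two rows that differ in any field (a timestamp, for example) are both kept.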
