
Octoparse 8.5: Empowering Local Scraping and More

Wednesday, February 16, 2022

Here is the exciting news: Octoparse 8.5 has been released with game-changing new features and major improvements. Cloud scraping has long been the option to count on for scraping fast at scale. With this release, we want to make local scraping just as competitive.

 

 

 

What's new in Octoparse 8.5?

Scraping speed, ease of use, and secure data storage are essential to a web scraping tool and its users. These are what Octoparse 8.5 focuses on.

For this update, most of the work went into Local Run/Local Scraping (as opposed to Cloud Scraping) and dashboard task management, along with some smaller optimizations such as Switch Cloud IP for a task and Time Zone Conversion.

 

💡Tips:
  1. Though the main updates are covered in this article, there is more to explore. Here is a comprehensive version of Octoparse 8.5 updates plus technical guides.
  2. Why do we focus on local scraping? Cloud scraping is powerful, but it is not the answer to every situation. Making local scraping just as flexible and powerful complements cloud scraping. Together they make Octoparse a much more powerful web scraping tool and create a seamless scraping experience for Octoparse users like you.

   

So there's a new release. What's in it for me?

If any of the statements below resonate with you, you will find the Octoparse 8.5 updates extremely helpful.

  • Cloud scraping is cool, but I rely more on local runs to get the data.
  • I need the local scraping to go faster!
  • I want the local run data to be sent to my database automatically just like the cloud run data.
  • I need individual batches of data for all my runs.

 

>> Check these updates

 

  • I get frustrated when I don’t know why my task doesn't work and I have no idea how to fix it.
  • I'd like to pause the task for a while just to check on things and see if the data has been extracted accurately.
  • I wish there was a way to manage my tasks more efficiently.

 

>> Check these updates

 

The rest of this article will walk you through the new 8.5 features and help you get the hang of them faster. Let’s dive right in!

 

  

Live Logs for troubleshooting local runs

With Octoparse 8.5, you can now

  • Check real-time logs for local runs (for task inspection)
  • Pause & resume a local run when needed

 

Whether you are new to Octoparse or have been using it for a while, it can be difficult to find out why a task is not working as expected. And without knowing the cause, fixing it can be a nightmare. Octoparse 8.5 now provides an Error Log that tells you what went wrong and where the task got stuck, so the problem is much easier to fix once it has been spotted. No more guesswork.

If your task fails, tick "show error logs only". The logs will show why the scraper got stuck and what went wrong during the scraping process, which points you to how to fix your scraper and make it work again.

 

[Image: error log]

 

 

Now that you know what the problem is, you can fix it.
Here are a few errors you may encounter and some ways to fix them.

  • A certain element not found - time to check your XPath!
  • Fail to load the webpage - check whether anything is wrong with your network or IP
  • AJAX timeout - increase your timeout limit
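
When reviewing exported logs, the error-to-fix pairs above can be kept as a small lookup table. The following is a minimal sketch in Python. The keyword strings and the `suggest_fix` helper are hypothetical and chosen for illustration only; the actual wording of Octoparse log messages may differ.

```python
# Hypothetical keywords mapped to the fixes listed above.
# Real Octoparse log messages may be worded differently.
SUGGESTED_FIXES = {
    "element not found": "Check the XPath of the element.",
    "load the webpage": "Check your network connection or IP.",
    "ajax timeout": "Increase the AJAX timeout limit.",
}


def suggest_fix(log_line: str) -> str:
    """Return the first suggested fix whose keyword appears in the log line."""
    lowered = log_line.lower()
    for keyword, fix in SUGGESTED_FIXES.items():
        if keyword in lowered:
            return fix
    return "No known fix; inspect the full log."
```

Calling `suggest_fix` on each line of an exported log gives a quick first pass before you open the task and inspect the failing step yourself.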

 

💡Tips:

The logs will no longer be accessible if you close the local run window after the task is completed. If you need a second look at the logs or the errors, don’t forget to export the logs.

 

Boost mode for 3X faster local runs

Yes, cloud scraping is fast and efficient. But now that "Boost Mode" is available for local scraping, speed is no longer a privilege of cloud scraping! Octoparse 8.5 introduces "Boost Mode" for local extraction, which makes extraction up to 3X faster by splitting the task into multiple subtasks that run concurrently. As a result, you'll get your data much faster.

 

[Image: boost mode]

 

 

There are a few things to note about "Boost Mode".

  • Boost Mode is only applicable to tasks that are built with a "splittable" loop, such as a list of URLs, a list of text items, or a fixed list of page elements.
  • The exact number of tasks that you can run on your desktop in Boost Mode depends heavily on the capacity of your device.
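
The general idea of splitting a loop into subtasks that run concurrently can be sketched as below. This illustrates the technique only and is not Octoparse's implementation; `split_into_subtasks`, `scrape_one`, `run_subtask`, `run_boosted`, and the default of three subtasks are all assumptions made for the example.

```python
from concurrent.futures import ThreadPoolExecutor


def split_into_subtasks(urls, n_subtasks):
    """Split a list of URLs into at most n_subtasks non-empty chunks."""
    chunks = [urls[i::n_subtasks] for i in range(n_subtasks)]
    return [chunk for chunk in chunks if chunk]


def scrape_one(url):
    """Placeholder for the real extraction step (hypothetical)."""
    return {"url": url, "length": len(url)}


def run_subtask(chunk):
    """Process one chunk sequentially, as a single subtask would."""
    return [scrape_one(url) for url in chunk]


def run_boosted(urls, n_subtasks=3):
    """Run all subtasks concurrently and merge their results."""
    chunks = split_into_subtasks(urls, n_subtasks)
    if not chunks:
        return []
    with ThreadPoolExecutor(max_workers=len(chunks)) as pool:
        results = list(pool.map(run_subtask, chunks))
    return [row for rows in results for row in rows]
```

A list of URLs is "splittable" in this sense because each item can be processed independently of the others. The number of subtasks worth running at once is bounded by the machine, which matches the note above about device capacity.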


If local extraction is what you use, "Boost Mode" can take your web scraping experience to the next level. To some extent, it closes the gap between local runs and cloud runs by making a local run as fast and scalable as a cloud run can be.

 

Read related tutorial: What's the difference between Standard Mode and Boost Mode?

 

Auto-backup local data to the Cloud

With Octoparse 8.5, you can now

  • Access historical data for each run on your local device
  • Automatically back up local run data to the cloud

 

[Image: local scraping automation]

(If you are interested in setting up a local scraping automation process with Octoparse 8.5, contact support@octoparse.com for a free trial and more details.)

 

In previous versions, Octoparse only kept the last set of data for any local run. Now that Local Run History is live, you can access every batch of data you have scraped with the same task. For example, if you run task A four times a week, all four batches of data will be stored individually and accessible in your account.

  

[Image: history run]

 

 

Additionally, you can turn on Auto Backup so that Octoparse stores your data in the Cloud after each run is completed. This is extremely helpful if you are using the API to send data to your database. In this way, you will be able to process not only cloud-run data but also local-run data on your side.

 

💡Tips

Switching on Auto Backup will not back up the data from any previous runs to the Cloud; only data extracted in later runs is backed up automatically. If a run has completed and its batch of data hasn't been backed up to the Cloud yet, you can still back up that data to the Cloud manually.
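
The backup rule described in this tip can be modeled in a few lines of Python. This is a toy model of the behavior as stated in this article, not Octoparse code; the `Run` and `Task` classes and their method names are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Run:
    run_id: int
    rows: list
    backed_up: bool = False


@dataclass
class Task:
    auto_backup: bool = False
    runs: list = field(default_factory=list)

    def complete_run(self, rows):
        """Store a finished local run as its own batch.

        The batch is backed up only if Auto Backup is on at completion time.
        """
        run = Run(run_id=len(self.runs) + 1, rows=rows, backed_up=self.auto_backup)
        self.runs.append(run)
        return run

    def backup_manually(self, run_id):
        """Manually back up one earlier batch."""
        for run in self.runs:
            if run.run_id == run_id:
                run.backed_up = True
                return run
        raise KeyError(run_id)


task = Task()
task.complete_run([{"title": "first batch"}])   # Auto Backup is off
task.auto_backup = True                         # does not touch earlier runs
task.complete_run([{"title": "second batch"}])  # backed up automatically
```

In this model the first batch stays local until `task.backup_manually(1)` is called, while every batch remains individually accessible in `task.runs`.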

 

 

Manage your tasks with batch actions

This update to the Dashboard aims to cut down on repetitive work and make task management easier, especially for those who have a long list of tasks to take care of.

 

With Octoparse 8.5, you can now

  • Manage multiple tasks at once using batch actions, such as duplicating tasks, stopping cloud runs, scheduling local runs, and more.
  • Sort/filter your tasks more efficiently using the new parameters included in the filters. You can even save the filter settings for later use. 

 

[Image: dashboard filter]

 

💡Tips

While the main updates are included in this article, there is more to explore. Here is a comprehensive version of Octoparse 8.5 updates plus technical guides.

 

Summary and further help

Besides all the above, there are still improvements to be discovered as you try out the brand new 8.5 version yourself. If you have any problems with or feedback on Octoparse 8.5 and would like to talk to us, feel free to contact us at support@octoparse.com.

 

More step-by-step tutorials (for Octoparse 8.5 updates) are coming up.

 

Author: Cici

Editor: Isabella 

Related resources

Run tasks on your device

Why cloud run gets no data

Cloud-based web scraping applications

Access to cloud data via API

 
