
Easily Extract Data from the Web | Web Crawler Software Review


Octoparse User Review

By Adriano from Italy – Basic Plan User

We use Octoparse to scrape pages, and we find it extremely powerful. The free version is good for users who don’t need many functions; once you reach its limits, you can buy an upgrade. The Wizard Mode is simple, while the Advanced Mode is a little difficult to use if you are new to it. We would love to see some of the advanced functionality moved to the simple side, considering that the scraper is intended mostly for non-developers.

The Wizard Mode gives you the possibility to choose between:

  1. List or Table Extraction
  2. List and Detail Extraction
  3. URL List Extraction
  4. Single Page Extraction

Depending on your needs, you can extract all the fields from a single page (option 4, Single Page Extraction) or extract data in the form of a table (option 1, List or Table Extraction).

We mainly use List and Detail Extraction. In the first step, you provide the result page of a query and select the list of similar URLs you need to extract; Octoparse detects the list automatically right after your second selection. In the next step, you tell Octoparse which fields you wish to extract.
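To make the "list" step more concrete, here is a minimal sketch of what collecting detail URLs from a result page amounts to. It is not how Octoparse works internally; the page snippet, the `example.com` address, and the names `LinkCollector` and `detail_urls` are all made up for illustration, and only the Python standard library is used.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkCollector(HTMLParser):
    """Collects the absolute URL of every <a href=...> on a page."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                # Resolve relative links against the result page address.
                self.links.append(urljoin(self.base_url, href))


# Hypothetical result page standing in for a real search result.
RESULT_PAGE = """
<ul>
  <li><a href="/item/1">First item</a></li>
  <li><a href="/item/2">Second item</a></li>
</ul>
"""


def detail_urls(html, base_url):
    """Return the list of detail-page URLs found in a result page."""
    collector = LinkCollector(base_url)
    collector.feed(html)
    return collector.links


urls = detail_urls(RESULT_PAGE, "https://example.com/search?q=test")
for url in urls:
    print(url)
```

Each URL in the list would then be visited in the "detail" step, where the chosen fields are extracted.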

For each field, you can decide to extract the text, the inner HTML, the outer HTML, or the link behind the text (e.g. an email address or a web address). We find this functionality extremely good. In the simple mode (free), the speed depends on your computer and on your Internet connection.
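If the difference between these four options is not obvious, the following small sketch may help. It assumes a made-up, well-formed snippet for one selected field and uses Python's standard `xml.etree.ElementTree`; the function name `field_values` is hypothetical, and real web pages are usually messier than this.

```python
import xml.etree.ElementTree as ET

# Hypothetical, well-formed markup standing in for one selected field.
SNIPPET = '<a href="mailto:info@example.com">Contact <b>us</b></a>'


def field_values(markup):
    """Return the four kinds of value that can be taken from one element."""
    el = ET.fromstring(markup)
    # Text: everything a reader sees, with the tags stripped.
    text = "".join(el.itertext())
    # Inner HTML: the content between the opening and closing tag.
    inner = (el.text or "") + "".join(
        ET.tostring(child, encoding="unicode") for child in el
    )
    # Outer HTML: the element itself, tags included.
    outer = ET.tostring(el, encoding="unicode")
    # Link: the address behind the text.
    link = el.get("href")
    return {"text": text, "inner_html": inner, "outer_html": outer, "link": link}


values = field_values(SNIPPET)
for name, value in values.items():
    print(f"{name}: {value}")
```

For this snippet, the text is what you would read on the page, while the link is the `mailto:` address hidden behind it.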

If you don’t need to extract more than 2000 pages at a time with the Wizard Mode, the free version is good enough. If you upgrade to one of the more advanced plans instead, you can use the speed of multiple servers and run many tasks at the same time, either on the local machine or in their cloud service.

(Note: Support said that there is no limit on the number of pages you can scrape with the Advanced Mode, but I am not very familiar with it.)

If List and Detail Extraction doesn’t return results, we switch to URL List Extraction, because it is difficult for Octoparse to identify the missing URLs on a result page (the new version 6.4.1 now provides an Extraction Failure Report for finding them). With this functionality, you simply provide the crawler with a list of similar URLs to crawl, and the rest is done automatically.
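As a rough illustration of the idea behind crawling a URL list and reporting the failures, here is a minimal sketch. It is only an assumption about the general approach, not Octoparse's implementation; `crawl_url_list`, `fake_fetch`, and the `example.com` pages are hypothetical, and the fetch step is faked so that the example runs without a network connection.

```python
def crawl_url_list(urls, fetch):
    """Fetch every URL in the list; keep results and failures separately."""
    results = {}
    failed = []
    for url in urls:
        try:
            results[url] = fetch(url)
        except Exception as exc:
            # Record the URL and the reason instead of stopping the whole run.
            failed.append((url, str(exc)))
    return results, failed


# Hypothetical pages standing in for the real web.
FAKE_PAGES = {
    "https://example.com/item/1": "<h1>First item</h1>",
    "https://example.com/item/2": "<h1>Second item</h1>",
}


def fake_fetch(url):
    """Stand-in for a real download; raises for pages that do not exist."""
    if url not in FAKE_PAGES:
        raise LookupError(f"no page for {url}")
    return FAKE_PAGES[url]


url_list = [
    "https://example.com/item/1",
    "https://example.com/item/2",
    "https://example.com/item/3",
]
results, failed = crawl_url_list(url_list, fake_fetch)
print(f"extracted {len(results)} pages, {len(failed)} failed")
for url, reason in failed:
    print(f"missing: {url} ({reason})")
```

The list of failed URLs plays the role of a failure report: it can be fed back into a second run once the cause is understood.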
