Octoparse vs. Content Grabber: which is better

With so many web scrapers available, a question arises: which one best fits our specific needs and can scrape everything we want? Off-the-shelf web scrapers are often quite generic and designed to perform common, simple tasks, so they may not be as flexible and universal as you’d expect. In this post, I will compare two web scrapers, Octoparse and Content Grabber, to give you some insights before you choose the web scraping tool that will serve your data extraction needs for a long time.

Feature Comparison

Here is a feature comparison table for Octoparse and Content Grabber:

Feature	Octoparse	Content Grabber
General Rule
Authoring environment	Windows-based desktop application (usable on a Mac through a virtual machine)	Windows-based desktop application (usable on a Mac through a virtual machine)
Smart Mode	Yes, getting extracted data just by entering the target URL	No
Cloud service	Yes	No
Scraper logic	Variables, loops, conditionals	Variables, loops, conditionals
Speed	Fast parallel execution	Fast parallel execution
Hosting	Hosted on a cloud of Octoparse servers if subscribed to Octoparse cloud or on the local machine	Local machine
Selecting elements	Point-and-click, XPath	Point-and-click, XPath
Transforming data	Regular expressions, string operations	Regular expressions
Knowledge of HTML and HTTP	Not required	Required
Knowledge of regular expressions and XPath	Not necessary, but helpful for advanced use	Not necessary, but helpful for advanced use
Features Extraction
Javascript, Ajax and dynamic content extraction	Yes	Yes
Pop-ups, infinite scroll, hover contents, tabs, logging in	Yes	Yes
Pagination	Yes	Yes
Entering into search boxes	Yes	Yes
Capture text, links, files, meta tags, HTML and much more	Yes	Yes
Copy and paste commands, drag and drop commands	Yes	Yes
Pre-configured crawlers for commonly scraped websites	Yes	No
PDF and Excel extraction	No	Yes, by using 3rd-party document converters
Image and video extraction	No, only the image or file URLs can be extracted	Yes
IP Rotation	Included in paid plans, or via a manually configured IP proxy	Yes, by using the 3rd-party proxy rotation service Nohodo
CAPTCHA	Yes, on the local machine	Yes, with a 3rd-party CAPTCHA recognition service account
Website crawler function	Yes	Yes
Run-time configuration	With a premium Octoparse account	With a premium import.io account
Remove duplicate data	Yes	Yes
Track changes on a website	Yes (Incremental extraction)	Yes
RegEx tool and XPath tool	Yes	No
Command-line	No	Yes
Data Export
Data export	CSV, Excel, TXT, Databases	CSV, Excel, JSON, PDF, Databases
API	Yes	Yes
Support
Debugging	Yes, with limited functionality	Yes
Support	Free professional support, tutorials, community support	Paid service

So what could Octoparse and Content Grabber both do for you?

Octoparse offers most of the web scraping power and scale of Content Grabber in a much easier-to-use package. Content Grabber is designed to work at a higher level where most of the features of Octoparse are bundled together.

Both Octoparse and Content Grabber represent the new generation of visual web scrapers on the market. They both have a simple point-and-click UI in which users browse a website and click on the data elements they want to collect.

Like a bot, they can follow links into deeper pages by clicking items and extracting the data found there. They both offer API options, IP rotation, and services to schedule extractors to run in real time. They can also export data in CSV format and transform data with regular expressions that you edit manually.
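To make the regular-expression transformation and CSV export concrete, here is a minimal Python sketch of the same idea outside either tool. It is not how Octoparse or Content Grabber implement it; the sample rows, the `clean_price` helper, and the price pattern are all assumptions made up for illustration.

```python
import csv
import io
import re

# Hypothetical raw values, as a scraper might capture them from a product page.
raw_rows = [
    {"name": "  Widget A ", "price": "Price: $1,299.00"},
    {"name": "Widget B", "price": "Now only $45.50!"},
]

# Assumed pattern: a dollar sign followed by digits, optional commas and decimals.
PRICE_RE = re.compile(r"\$([\d,]+(?:\.\d+)?)")


def clean_price(text):
    """Return the first dollar amount found in text as a float, or None."""
    match = PRICE_RE.search(text)
    if match is None:
        return None
    return float(match.group(1).replace(",", ""))


def to_csv(rows):
    """Trim names, normalize prices, and return the rows as CSV text."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=["name", "price"], lineterminator="\n")
    writer.writeheader()
    for row in rows:
        writer.writerow({"name": row["name"].strip(), "price": clean_price(row["price"])})
    return buffer.getvalue()


print(to_csv(raw_rows))
```

In a visual scraper you would typically enter only the pattern itself in the tool's regular expression field; the surrounding code is what the tool does for you.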

What’s more, they can be instructed to do more than just extract data. They have a variety of options to choose from, making it possible to get data from interactive websites. You can instruct them to scrape data from very complex and dynamic sites because they can:

Sign in to accounts
Select choices from dropdown menus, pop-ups, and hover content
Search using the search bar
Go to a new page simply by clicking on the “next” button
Get data from infinitely scrolling pages and other dynamic webpages
…

This means that these two web scrapers can be as flexible and universal as you’d expect. They can deal with:

Difficult tables, like merged tables, tables with an indefinite number of columns, missing values and so on.
Difficult block layouts, especially those in which there is no direct HTML association among the data presented on a screen, for example extracting all the products while skipping advertisements, or scraping discounted products only.
Text lists when the HTML DOM structure is plain.
Invalid HTML: unescaped characters, non-HTML tags, unclosed tags, unmatched quotes, missing spaces, invalid tag nesting.
Scraping behind a login. Both scrapers can submit a login form via POST, follow HTTP 302 redirects, and store cookies.
CAPTCHA solving.
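As a rough illustration of what coping with invalid HTML means, the sketch below uses Python's standard `html.parser` module, which does not require balanced tags. It only shows the general idea of tolerant parsing, not how either product works internally; the `TextCollector` class and the broken sample markup are made up for this example.

```python
from html.parser import HTMLParser


class TextCollector(HTMLParser):
    """Collect non-empty text chunks without requiring well-formed markup."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.chunks = []

    def handle_data(self, data):
        text = data.strip()
        if text:
            self.chunks.append(text)


def extract_text(html):
    """Return the text chunks found in html, in document order."""
    parser = TextCollector()
    parser.feed(html)
    parser.close()
    return parser.chunks


# Hypothetical markup with an unescaped "&", unclosed <li> and <b> tags,
# and a non-HTML <foo> tag.
broken = "<ul><li>Apples & pears<li>Plums <b>fresh</ul><foo>Cherries"
print(extract_text(broken))
```

The parser reports each tag and text run as it meets them, so missing closing tags or unknown tags do not stop the extraction.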

Both data extraction tools have enough functionality to extract data from all kinds of websites, provided you explore them fully. As a fan of Content Grabber, I would recommend Content Grabber for a few situations:

Tight integration with an existing Python codebase and infrastructure via API
Advanced debugging tool
Third-party Captcha solution

We are working on the second item to make Octoparse more user-friendly.

However, if you are just starting out, we encourage you to try Octoparse, which will get you up and running with a free version or at a much lower cost.

Cost Comparison

At first glance, the main difference between the two services appears to be their pricing. Octoparse packages its capabilities into conventional software-as-a-service (SaaS) plans: Free, Standard ($89 per month), and Professional ($189 per month).

Content Grabber is a paid service. There are two ways to purchase it: buying a license or paying a monthly subscription. Buying a license outright gives you a perpetual license in one of three editions, priced from $449 to $2495. The monthly subscription is charged upfront each month and also comes in three editions, priced from $69 to $299.

Brand	Octoparse			Content Grabber
Edition	Basic	Standard	Professional	Server	Professional	Premium
Monthly plan ($)	Free	89	189	69	149	299
Yearly plan (Octoparse) / License (Content Grabber) ($)	Free	900	1896	449	995	2495
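The figures in the table above invite two quick calculations: how much an Octoparse yearly plan saves over twelve monthly payments, and after how many subscription months a Content Grabber perpetual license would have cost less. The sketch below does that arithmetic using only the prices from the table. The function names are made up, and the calculation assumes no maintenance or support charges, which is a simplification.

```python
import math

# Prices in USD, copied from the table above.
# Octoparse: (monthly price, yearly price)
octoparse = {"Standard": (89, 900), "Professional": (189, 1896)}
# Content Grabber: (monthly subscription, perpetual license price)
content_grabber = {"Server": (69, 449), "Professional": (149, 995), "Premium": (299, 2495)}


def yearly_saving(monthly, yearly):
    """Amount saved by paying yearly instead of twelve monthly payments."""
    return monthly * 12 - yearly


def break_even_months(monthly, license_price):
    """First month in which total subscription fees reach the license price."""
    return math.ceil(license_price / monthly)


for plan, (monthly, yearly) in octoparse.items():
    print(f"Octoparse {plan}: yearly plan saves ${yearly_saving(monthly, yearly)}")

for edition, (monthly, license_price) in content_grabber.items():
    months = break_even_months(monthly, license_price)
    print(f"Content Grabber {edition}: license pays off after {months} months")
```

On these numbers, the Octoparse yearly plans save $168 (Standard) and $372 (Professional), and the Content Grabber license pays off after 7, 7, and 9 months for the Server, Professional, and Premium editions respectively.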

The big difference between the Octoparse and Content Grabber premium plans is that Octoparse does not limit the number of licenses or users. That is to say, more than one user can use Octoparse on different computers with the same premium account. Content Grabber is licensed per user per computer. This means you need a license for each computer where Content Grabber is installed, and if the computer is accessed by more than one user, you need a license for each user using the software on that computer. Also, one license does not cover both your desktop computer and your laptop, or both your office computer and your home computer.
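To see how the per-user, per-computer rule adds up, here is a small sketch that counts the licenses a team would need under that rule as described above. The team layout, the names, and the `licenses_needed` helper are hypothetical, and the $995 figure is the Content Grabber Professional license price from the pricing table.

```python
def licenses_needed(usage):
    """Count licenses under a per-user, per-computer rule.

    usage maps each computer to the set of users who run the software on it.
    A computer with the software installed needs at least one license.
    """
    return sum(max(1, len(users)) for users in usage.values())


# Hypothetical team: two people share the office desktop, one also has a laptop.
usage = {
    "office-desktop": {"ann", "bob"},
    "ann-laptop": {"ann"},
}

count = licenses_needed(usage)
print(count, "licenses, costing", count * 995, "USD at the Professional license price")
```

Here Ann alone accounts for two licenses, because her desktop and her laptop are counted separately.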

You can see that the Octoparse free plan offers powerful functionality without limiting how many web pages you can extract per task. The higher plans mainly add more tasks, faster speed, and IP rotation for more money. Also, only the premium plans let you schedule crawlers to run on a regular basis.

For Content Grabber, the editions differ in functionality: export functions, API, self-contained agents, etc. Charges for maintenance and support also differ.

If you don’t want to learn how to use a tool and just want your data on demand, both Octoparse and Content Grabber offer a data service that extracts the data for you. Just contact either company’s sales team and they will scrape data from the website you want.

Conclusion

Octoparse and Content Grabber

Like the earlier comparison, Octoparse vs. Content Grabber is somewhat of an apples-to-oranges comparison. Content Grabber is designed to work at a higher level where most of the features of Octoparse are bundled together. If you are just starting out, we encourage you to try Octoparse, which will easily get you up and running with a free version or at a much lower cost.