
Octoparse vs. Content Grabber comparison: which is better for web scraping?

6 min read

With so many web scrapers available, a question arises: which one best fits your specific needs and can scrape everything you want? Most off-the-shelf web scrapers are fairly generic and designed to perform common, simple tasks (refer to Top 5 Web Scraping Tools Review for more information), so they may not be as flexible and universal as you’d expect. In this post, I will compare the web scrapers Octoparse and Content Grabber to give you some insight before you choose a web scraping service that will serve your data extraction needs for a long time.

Feature Comparison

The table below compares the features of Octoparse and Content Grabber:

| Feature | Octoparse | Content Grabber |
|---|---|---|
| General Rule | | |
| Authoring environment | Windows-based software application (available for Mac with a virtual machine) | Windows-based software application (available for Mac with a virtual machine) |
| Smart Mode | Yes, getting extracted data just by entering the target URL | No |
| Cloud service | Yes | No |
| Scraper logic | Variables, loops, conditionals | Variables, loops, conditionals |
| Speed | Fast parallel execution | Fast parallel execution |
| Hosting | Hosted on a cloud of Octoparse servers if subscribed to Octoparse cloud, or on the local machine | Local machine |
| Selecting elements | Point-and-click, XPath | Point-and-click, XPath |
| Transforming data | Regular expressions, string operations | Regular expressions |
| Knowledge of HTML and HTTP | Not required | Required |
| Knowledge of regular expressions and XPath | Not necessary, but would be better for further exploration | Not necessary, but would be better for further exploration |
| Features Extraction | | |
| JavaScript, Ajax and dynamic content extraction | Yes | Yes |
| Pop-ups, infinite scroll, hover contents, tabs, logging in | Yes | Yes |
| Pagination | Yes | Yes |
| Entering into search boxes | Yes | Yes |
| Capture text, links, files, meta tags, HTML and much more | Yes | Yes |
| Copy and paste commands, drag and drop commands | Yes | Yes |
| Pre-configured crawlers for commonly scraped websites | Yes | No |
| PDF and Excel extraction | No | Yes, by using 3rd party document converters |
| Image and video extraction | No, only able to extract the image or file URLs | Yes |
| IP rotation | Included in paid plans, or manual IP proxy | Yes, by using the 3rd party proxy rotation service Nohodo |
| CAPTCHA | Yes, on the local machine | Yes, with a 3rd party CAPTCHA recognition service account |
| Website crawler function | Yes | Yes |
| Run-time configuration | With a premium Octoparse account | With a premium import.io account |
| Remove duplicate data | Yes | Yes |
| Track changes on a website | Yes (incremental extraction) | Yes |
| RegEx tool and XPath tool | Yes | No |
| Command-line | No | Yes |
| Data Export | | |
| Data export | CSV, Excel, TXT, Databases | CSV, Excel, JSON, PDF, Databases |
| API | Yes | Yes |
| Support | | |
| Debugging | Yes, with limited functionality | Yes |
| Support | Free professional support, tutorials, community support | Paid service |

So what can both Octoparse and Content Grabber do for you?

Octoparse offers most of the web scraping power and scale of Content Grabber in a much easier-to-use package. Content Grabber is designed to work at a higher level where most of the features of Octoparse are bundled together.

Both Octoparse and Content Grabber represent the new generation of visual web scrapers on the market. Both have a simple point-and-click UI in which users browse a website and click the data elements they want to collect.

Like a bot, they can follow links into deeper pages by clicking items and then extract data from those pages. Both offer API options, IP rotation, and services for scheduling extractors to run in real time. They can also export data in CSV format and transform data with regular expressions that you edit manually.
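
To make the regular-expression step more concrete, here is a minimal Python sketch of the kind of cleanup that happens between extraction and CSV export. It does not use the API of Octoparse or Content Grabber. The sample rows, the `clean_price` and `to_csv` helpers, and the column names are all hypothetical and chosen only for illustration.

```python
import csv
import io
import re

# Hypothetical raw values, as a scraper might capture them from a product page.
raw_rows = [
    {"name": "  Blue Widget ", "price": "Price: $1,299.00 (incl. tax)"},
    {"name": "Red Widget", "price": "USD 45.5"},
    {"name": "Green Widget", "price": "Call for price"},
]

# Matches a number with optional thousands separators and decimals.
PRICE_RE = re.compile(r"\d[\d,]*(?:\.\d+)?")


def clean_price(text):
    """Return the first number in text as a float, or None if there is none."""
    match = PRICE_RE.search(text)
    if match is None:
        return None
    return float(match.group().replace(",", ""))


def to_csv(rows):
    """Clean each row and serialize the result as CSV text."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=["name", "price"], lineterminator="\n")
    writer.writeheader()
    for row in rows:
        writer.writerow({"name": row["name"].strip(), "price": clean_price(row["price"])})
    return buffer.getvalue()


csv_text = to_csv(raw_rows)
print(csv_text)
```

Rows without a recognizable price end up with an empty price column instead of aborting the export, which is usually what you want when a site mixes priced and unpriced items.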

What’s more, they can be instructed to do more than just extract data. They have a variety of options to choose from, making it possible to get data from interactive websites. You can instruct them to scrape data from very complex and dynamic sites because they can:

  • Sign in to accounts
  • Select options from dropdown menus, pop-ups, and hover content
  • Search using the search bar
  • Go to a new page simply by clicking on the “next” button
  • Get data from infinitely scrolling pages and other dynamic webpages
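
The “next” button behavior in the list above can be sketched in a few lines of Python. This is only an illustration of the general idea, not how either tool is implemented. The three in-memory pages, the `ListPageParser` class, and the `scrape_all` function are hypothetical, and the pages are stored in a dictionary so that the example runs without network access.

```python
from html.parser import HTMLParser

# Hypothetical three-page site held in memory so the sketch runs offline.
PAGES = {
    "/list?page=1": '<ul><li>Item A</li><li>Item B</li></ul><a href="/list?page=2">next</a>',
    "/list?page=2": '<ul><li>Item C</li></ul><a href="/list?page=3">next</a>',
    "/list?page=3": "<ul><li>Item D</li></ul>",
}


class ListPageParser(HTMLParser):
    """Collects <li> text and the href of the link whose text is "next"."""

    def __init__(self):
        super().__init__()
        self.items = []
        self.next_url = None
        self._in_item = False
        self._link_href = None

    def handle_starttag(self, tag, attrs):
        if tag == "li":
            self._in_item = True
        elif tag == "a":
            self._link_href = dict(attrs).get("href")

    def handle_endtag(self, tag):
        if tag == "li":
            self._in_item = False
        elif tag == "a":
            self._link_href = None

    def handle_data(self, data):
        text = data.strip()
        if not text:
            return
        if self._in_item:
            self.items.append(text)
        elif self._link_href and text.lower() == "next":
            self.next_url = self._link_href


def scrape_all(start_url, fetch, max_pages=100):
    """Follow "next" links from start_url and return every item found."""
    items, url, visited = [], start_url, set()
    # Stop when there is no next link, a page repeats, or the page cap is hit.
    while url and url not in visited and len(visited) < max_pages:
        visited.add(url)
        parser = ListPageParser()
        parser.feed(fetch(url))
        parser.close()
        items.extend(parser.items)
        url = parser.next_url
    return items


all_items = scrape_all("/list?page=1", PAGES.__getitem__)
print(all_items)  # ['Item A', 'Item B', 'Item C', 'Item D']
```

The `visited` set and `max_pages` cap guard against sites whose last page links back to itself, a common cause of crawlers that never finish.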

This means that these two web scrapers can be as flexible and universal as you’d expect. They can deal with:

  • Difficult tables, such as tables with merged cells, an indefinite number of columns, or missing values.
  • Difficult block layouts, especially those in which there is no direct HTML association among the data presented on a screen, for example extracting all products while skipping advertisements, or scraping only discounted products.
  • Text lists where the HTML DOM structure is flat.
  • Invalid HTML: unescaped characters, non-HTML tags, unclosed tags, unmatched quotes, missing spaces, invalid tag nesting.
  • Scraping behind a login. Both scrapers can submit a login form via POST, follow HTTP 302 redirects, and store cookies.
  • CAPTCHA solving.
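
As a rough illustration of two items from this list, invalid HTML and tables with missing values, the following Python sketch reads a table whose tags are never closed and whose last row is short. It uses only the standard library and is not how Octoparse or Content Grabber work internally. The sample markup, the `LenientTableParser` class, and the `parse_table` function are hypothetical.

```python
from html.parser import HTMLParser

# Hypothetical malformed markup: no closing </td>, </tr>, or </table> tags,
# and the last row is missing its third cell.
BROKEN_HTML = """
<table>
<tr><th>Product<th>Price<th>Stock
<tr><td>Widget<td>9.99<td>12
<tr><td>Gadget<td>24.50
"""


class LenientTableParser(HTMLParser):
    """Builds rows from <tr>/<td>/<th> start tags only, ignoring end tags."""

    def __init__(self):
        super().__init__()
        self.rows = []

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self.rows.append([])        # a new row starts, closed or not
        elif tag in ("td", "th") and self.rows:
            self.rows[-1].append("")    # a new cell starts in the current row

    def handle_data(self, data):
        # Attach text to the most recent cell of the current row.
        if self.rows and self.rows[-1]:
            self.rows[-1][-1] += data.strip()


def parse_table(html):
    """Return a list of dicts keyed by the header row; short rows get None."""
    parser = LenientTableParser()
    parser.feed(html)
    parser.close()
    if not parser.rows:
        return []
    header, *body = parser.rows
    records = []
    for row in body:
        padded = row + [None] * (len(header) - len(row))
        records.append(dict(zip(header, padded)))
    return records


records = parse_table(BROKEN_HTML)
print(records)
```

Relying on start tags alone is what makes the parser tolerant of unclosed tags, and padding short rows with `None` keeps a missing value distinguishable from an empty string.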

Both data extraction tools have enough functionality to extract data from all kinds of websites if you explore them fully. As a fan of Content Grabber, I would recommend it in a few situations:

  • Tight integration with an existing Python codebase and infrastructure via API
  • An advanced debugging tool
  • A third-party CAPTCHA solution

We are working on the second point to make Octoparse more user-friendly.

However, if you are just starting out, we encourage you to try Octoparse, which will get you up and running with a free version or at a much lower cost.

Cost Comparison

At first glance, the main difference between the two services appears to be their pricing. Octoparse packages its capabilities into conventional software-as-a-service (SaaS) plans: Free, Standard ($89 per month), and Professional ($189 per month).

Content Grabber is a paid service with two purchasing options: buying a license or paying a monthly subscription. Buying a license outright gives you a perpetual license, and the three editions are priced from $449 to $2495. The monthly subscription is charged upfront each month and also comes in three editions, priced from $69 to $299.

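| Plan | Octoparse Basic | Octoparse Standard | Octoparse Professional | Content Grabber Server | Content Grabber Professional | Content Grabber Premium |
|---|---|---|---|---|---|---|
| Monthly plan ($) | Free | 89 | 189 | 69 | 149 | 299 |
| Yearly plan/License ($) | Free | 900 | 1896 | 449 | 995 | 2495 |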

The big difference between the Octoparse and Content Grabber premium plans is that Octoparse does not limit the number of licenses or users. That is to say, more than one user can use Octoparse on different computers with the same premium account. Content Grabber is licensed per user per computer. This means you need a license for each computer where Content Grabber is installed, and if the computer is accessed by more than one user, you need a license for each user using the software on that computer. Also, one license does not cover both your desktop computer and your laptop, or both your office computer and your home computer.
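
The arithmetic behind these prices can be worked through with a short Python sketch. The figures come from the pricing table in this article. The function names are hypothetical, and the calculation ignores maintenance and support charges as well as any discounts, so treat the results as rough estimates only.

```python
import math

# Prices in USD, taken from the pricing table in this article.
OCTOPARSE = {
    "Standard": {"monthly": 89, "yearly": 900},
    "Professional": {"monthly": 189, "yearly": 1896},
}
CONTENT_GRABBER = {
    "Server": {"monthly": 69, "license": 449},
    "Professional": {"monthly": 149, "license": 995},
    "Premium": {"monthly": 299, "license": 2495},
}


def yearly_saving(plan):
    """Money saved per year by paying yearly instead of twelve monthly payments."""
    return plan["monthly"] * 12 - plan["yearly"]


def break_even_months(plan):
    """First month in which total subscription fees reach the perpetual license price."""
    return math.ceil(plan["license"] / plan["monthly"])


def license_cost(plan, seats):
    """Content Grabber is licensed per user per computer, so cost scales with seats."""
    return plan["license"] * seats


for name, plan in OCTOPARSE.items():
    print(f"Octoparse {name}: yearly plan saves ${yearly_saving(plan)} per year")
for name, plan in CONTENT_GRABBER.items():
    print(f"Content Grabber {name}: license pays off after {break_even_months(plan)} months")
print(f"Three Professional seats: ${license_cost(CONTENT_GRABBER['Professional'], 3)}")
```

Under these assumptions, the Octoparse yearly plans save $168 (Standard) and $372 (Professional) over monthly billing, and a Content Grabber perpetual license costs less than the subscription after 7 to 9 months, depending on the edition.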

You can see that the Octoparse free plan offers powerful functionality without limiting how many web pages you can extract per task. The higher plans mainly add more tasks, faster speed, and IP rotation for more money. Also, only the premium plans let you schedule crawlers to run on a regular basis.

For Content Grabber, the editions differ in functionality: export functions, API, self-contained agents, etc. Charges for maintenance and support also differ.

If you don’t want to learn how to use a tool and just want your data on demand, both Octoparse and Content Grabber provide a data service that extracts the data for you. Just contact the sales team of either company and they will scrape data from the website you want.

Conclusion


Like the earlier comparison, Octoparse vs. Content Grabber is something of an apples-to-oranges comparison. Content Grabber is designed to work at a higher level where most of the features of Octoparse are bundled together. If you are just starting out, we encourage you to try Octoparse, which will easily get you up and running with a free version or at a much lower cost.
