
What Is a Configuration Rule in Octoparse?

Friday, September 4, 2020

Generally, web crawlers such as Google's try to retrieve every webpage they can reach. A crawler follows the links on each page and reads the content (usually text) to work out what the page is about, and in this way the search engine can index pages for search.

Crawlers run in Octoparse, however, are determined by the rules you configure, and the data they extract is structured. Octoparse does not try to understand web content with advanced algorithms; it grabs exactly the web content you specify.

Today we’ll talk about what an Octoparse RULE is.


The extraction rule is one of the most important features of Octoparse. The rule you configure tells Octoparse which website to open, where the data you plan to crawl is located, what kind of data you want, and so on.

You can configure a rule to paginate, to scrape a website behind a login, to collect data from webpages loaded with AJAX, or to scrape a website with infinite scrolling. To make any of these happen, though, you have to create a rule.
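Octoparse itself needs no code, but it can help to picture a rule as a small piece of structured data. The sketch below is only an illustration of that idea in Python: the `Rule` and `Field` classes, the selector strings, and the example URL are all hypothetical and are not Octoparse's real rule format.

```python
from dataclasses import dataclass, field


@dataclass
class Field:
    """One piece of data to extract (hypothetical, for illustration only)."""
    name: str      # column name in the structured output
    selector: str  # where the data sits on the page (made-up selector string)


@dataclass
class Rule:
    """A toy model of a rule: which site to open and what to extract."""
    start_url: str
    fields: list[Field] = field(default_factory=list)

    def describe(self) -> str:
        # Summarize the rule in plain words.
        columns = ", ".join(f.name for f in self.fields)
        return f"open {self.start_url}; extract {columns}"


# Example URL and selectors are assumptions, not taken from a real site.
rule = Rule(
    start_url="https://example.com/products",
    fields=[Field("title", "h2.title"), Field("price", "span.price")],
)
print(rule.describe())
```

The point of the sketch is that a rule answers the same three questions as the text above: which site to open, where the data is, and what data you want.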

Trust me, it's very easy. If you can use a web browser, you can use Octoparse. Moreover, Octoparse has a visual workflow designer that shows how the rule is built.

You do not need to write any code in Octoparse. Just tell Octoparse what you want it to do by dragging actions into the workflow designer and selecting options to optimize the process.


Let’s take a simple example: extracting data from a web page with pagination.
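To make the idea concrete, here is a rough sketch in Python of the logic a pagination rule expresses: load a page, extract each item on it, move to the next page, and repeat until there are no more pages. It is not how Octoparse is implemented, and it does not fetch anything from the web. The `PAGES` dictionary is a made-up in-memory stand-in for a website, and `run_rule` is a hypothetical helper.

```python
# Hypothetical in-memory "website": each page lists its items and names the
# page that its "next" button leads to (None on the last page).
PAGES = {
    "page1": {"items": ["Item A", "Item B"], "next": "page2"},
    "page2": {"items": ["Item C"], "next": None},
}


def run_rule(start, pages, max_pages=10):
    """Walk the pages from `start`, collecting one structured row per item."""
    rows = []
    current = start
    visited = 0
    # max_pages is a safety cap so a looping "next" link cannot run forever.
    while current is not None and visited < max_pages:
        page = pages[current]        # step 1: open the page
        for item in page["items"]:   # step 2: loop over the items on it
            rows.append({"page": current, "title": item})  # step 3: extract
        current = page["next"]       # step 4: follow the "next page" link
        visited += 1
    return rows


rows = run_rule("page1", PAGES)
for row in rows:
    print(row)
```

In Octoparse you would express the same steps by arranging actions in the workflow designer rather than writing a loop by hand.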



Happy Data Hunting!









Author: The Octoparse Team












