
Extract Reviews - Dealing with "Show More" Buttons

Wednesday, December 20, 2017 7:53 AM

Many websites use a "Load More" or "Show More" button to load additional content on demand instead of splitting it across pages. The technique is common because visitors can keep browsing without leaving the page.


Unlike pagination with a "Next" button, the "Load More" button keeps adding more content onto one single web page, which makes it trickier to scrape. In this article, I will show you how to deal with the "Load More" button in Octoparse.

 

1. Use Auto-detect to deal with the "Load More" button

2. Create a pagination action manually

 

To follow along, you can use this example link:

https://www.capterra.com/search/category?search=CRM%20Software

 

1. Use Auto-detect to deal with the "Load More" button

  • Start the Auto-detect process. The Tips Panel will then offer the option to click on a "Load More" button.


  • Click "Check" to see whether the "Load More" button has been located correctly. If not, click "Edit" to choose the right button.
  • Click "Edit" to set the number of clicks, that is, how many times Octoparse should click the "Load More" button.


  • Click "Create workflow" to generate the workflow with these settings.

The workflow should look like the following image:

[Image: the workflow generated by Auto-detect]

 

With this workflow, Octoparse clicks the "Load More" button and extracts data as it goes. For example, if "Number of clicks" is set to 20 and each click loads 20 new items, Octoparse extracts the 20 newly loaded items after each click.
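For readers who think in code, the click-and-extract behaviour described above can be sketched roughly as follows. This is only an illustration, not how Octoparse is implemented: `FakePage` and `click_and_extract` are hypothetical names, the page is simulated in memory, and the sketch assumes the items visible on first load are extracted before the first click.

```python
class FakePage:
    """Hypothetical stand-in for a page with a "Load More" button (not an Octoparse API)."""

    def __init__(self, total_items, batch_size):
        self._all = [f"item-{i}" for i in range(1, total_items + 1)]
        self._batch = batch_size
        self.visible = self._all[:batch_size]  # items shown on first load

    def has_load_more(self):
        # The button is assumed to disappear once everything is loaded.
        return len(self.visible) < len(self._all)

    def click_load_more(self):
        # Each click appends one more batch to the same page.
        end = len(self.visible) + self._batch
        self.visible = self._all[:end]


def click_and_extract(page, number_of_clicks):
    """Extract the initial items, then only the newly loaded items after each click."""
    extracted = list(page.visible)
    for _ in range(number_of_clicks):
        if not page.has_load_more():
            break
        already_seen = len(extracted)
        page.click_load_more()
        # Keep only what the click added, so nothing is extracted twice.
        extracted.extend(page.visible[already_seen:])
    return extracted
```

With a simulated page of 100 items, batches of 20, and 3 clicks, `click_and_extract(FakePage(100, 20), 3)` returns 80 rows: the 20 items shown initially plus 20 per click.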

 

2. Create a pagination action manually

  • Select the "Load More" button on the web page and choose "Loop click single element"
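  • Set an AJAX timeout long enough for the new items to finish loading after each click. (AJAX lets a page fetch new content without a full reload.)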
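To see why the timeout matters, here is a minimal sketch of one common way to wait for AJAX-loaded content: poll the item count until it grows or a deadline passes. It is an assumption for illustration only, not a description of Octoparse internals; `wait_for_new_items`, `get_count`, `timeout_s`, and `poll_s` are hypothetical names.

```python
import time


def wait_for_new_items(get_count, previous_count, timeout_s=5.0, poll_s=0.1):
    """Return True if the item count grows before the timeout, else False.

    get_count is any callable that reports how many items are on the page now.
    """
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        if get_count() > previous_count:
            return True  # new content arrived in time
        time.sleep(poll_s)
    # One last check in case the content arrived during the final sleep.
    return get_count() > previous_count
```

If the timeout is too short for the site, the function returns False even though the items would have appeared a moment later, which is the situation a longer AJAX timeout is meant to avoid.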

 

Tips!

1. If you only want to click the "Load More" button X times, click the Pagination box, tick "Repeats", and set Repeats to X.


 

2. If the task produces many duplicates during scraping, drag the Loop Item out of the Pagination so that Octoparse starts scraping only after all the items have been loaded.
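The difference between the two arrangements can be illustrated with a small simulation. It rests on an assumption that the source does not state: duplicates arise because the extraction step re-reads every item currently on the page after each click. The function `simulate` and its parameters are hypothetical.

```python
def simulate(total_items, batch_size, extract_inside_loop):
    """Simulate scraping a "Load More" page and return the extracted rows.

    extract_inside_loop=True  : extract everything visible after every click.
    extract_inside_loop=False : click until all items are loaded, then extract once.
    """
    all_items = [f"item-{i}" for i in range(1, total_items + 1)]
    visible = min(batch_size, total_items)  # items shown on first load
    rows = []
    while True:
        if extract_inside_loop:
            # Assumed cause of duplicates: the whole visible list is re-read.
            rows.extend(all_items[:visible])
        if visible >= total_items:
            break  # no more "Load More" button
        visible = min(visible + batch_size, total_items)  # one click
    if not extract_inside_loop:
        rows.extend(all_items[:visible])  # single pass after loading everything
    return rows
```

For 60 items in batches of 20, extracting inside the loop yields 120 rows of which only 60 are unique, while extracting once after loading yields exactly 60 rows.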

 

 

If you have any questions, you are welcome to submit a request here. Our support team will get back to you within 24 hours. 

 

 

Author: The Octoparse Team
