Pagination Scraping: Configure “Loop click next page” When It Can’t Be Detected

Monday, May 30, 2016 8:16 AM

Brief Intro

In this tutorial, I’ll use realtor.com as an example to show you how to configure a pagination scraping rule when the “Next” button can't be detected. The URL used in this example is

http://www.realtor.com/apartments/San-Francisco_CA

Usually, the “Loop click next page” option appears when you click the “Next page” link while configuring pagination.

But sometimes the “Loop click next page” option does not appear.

 

Features covered

Some features that we will touch upon include:

  • Pagination
  • Modify XPath
  • Building a list

 

Now, let’s get started!

When we open the target webpage, we can see that there is no "Next" button to click, so the "Loop click next page" option is not available. In this case, we need to configure the pagination scraping rule in another way.

 

Step 1. Create a list of items

  • Click the right arrow at the far right of the sequence of pagination numbers
  • Select "Create a list of items"

 

 

 

Step 2. Switch Loop Mode to Single Element

      To click through to the next page repeatedly, we still use a Loop action for pagination.

  • To do this, drop a “Loop” item into the Workflow designer.
  • Find "Loop Mode" under "Advanced Options".
  • Select the "Single Element" option.

 

 

Step 3. Modify the XPath for pagination

  • To locate the pagination link, we need to inspect its XPath in Firepath.
  • Make sure you select the rightmost pagination link (the right arrow).
  • Then copy the XPath of the right arrow button as shown in Firepath:

      .//*[@id='ResultsPerPageBottom']/nav/span[6]/a/i

  • Paste it into the text box.

  • Click “Save”.
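
To see what the XPath from Step 3 selects, here is a minimal Python sketch using the standard library's xml.etree.ElementTree, which supports the subset of XPath used here. The markup is a hypothetical, simplified stand-in for the pagination bar (the real page structure may differ), and the class name arrow-right and the helper find_elements are assumptions made for illustration only.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified pagination markup: five page numbers followed by
# a right-arrow link. The real realtor.com markup may differ.
SNIPPET = """
<div>
  <div id="ResultsPerPageBottom">
    <nav>
      <span><a href="?pg=1">1</a></span>
      <span><a href="?pg=2">2</a></span>
      <span><a href="?pg=3">3</a></span>
      <span><a href="?pg=4">4</a></span>
      <span><a href="?pg=5">5</a></span>
      <span><a href="?pg=2"><i class="arrow-right"/></a></span>
    </nav>
  </div>
</div>
"""

# The XPath from Step 3: the <i> inside the link of the sixth <span>.
PAGINATION_XPATH = ".//*[@id='ResultsPerPageBottom']/nav/span[6]/a/i"

# An alternative (an assumption, not from the tutorial): take the last
# <span>, whatever its position.
LAST_SPAN_XPATH = ".//*[@id='ResultsPerPageBottom']/nav/span[last()]/a/i"


def find_elements(markup: str, xpath: str) -> list:
    """Parse the markup and return every element matching the XPath."""
    root = ET.fromstring(markup.strip())
    return root.findall(xpath)


if __name__ == "__main__":
    for xpath in (PAGINATION_XPATH, LAST_SPAN_XPATH):
        matches = find_elements(SNIPPET, xpath)
        print(xpath, "matched", len(matches), "element(s)")
```

Because span[6] is positional, it may stop matching if the number of page links changes. Assuming the arrow is always the last span, span[last()] is one possible alternative.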

 

 

Step 4. Click to Paginate

  • Drag a “Click element” action into the “Loop item”.
  • Choose “Click Loop items” under "Advanced Options".

      Octoparse will then click the pagination link on each pass through the loop, moving from one page to the next.
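
Conceptually, the loop built in Steps 2 to 4 behaves like the sketch below: process the current page, click the pagination link, and repeat until the link is no longer there. This is only an illustration in Python, not how Octoparse is implemented. The PAGES dictionary, the scrape_all function, and the max_pages safety limit are all hypothetical.

```python
from typing import Dict, List, Optional

# Hypothetical stand-in for a paginated site: each page number maps to its
# listings and to the page the right-arrow link leads to (None on the last page).
PAGES: Dict[int, dict] = {
    1: {"items": ["listing A", "listing B"], "next": 2},
    2: {"items": ["listing C", "listing D"], "next": 3},
    3: {"items": ["listing E"], "next": None},
}


def scrape_all(start: int = 1, max_pages: int = 100) -> List[str]:
    """Collect items page by page, following the 'next' link each time."""
    collected: List[str] = []
    current: Optional[int] = start
    visited = 0
    # Stop when there is no pagination link left, or at the safety limit.
    while current is not None and visited < max_pages:
        page = PAGES[current]
        collected.extend(page["items"])  # extract data from the current page
        current = page["next"]           # "click" the pagination link
        visited += 1
    return collected


if __name__ == "__main__":
    print(scrape_all())
```

The max_pages limit guards against a pagination link that never disappears, which would otherwise keep the loop running indefinitely.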

 

 

Good job on completing this tutorial!

 

Now, learn more about pagination and list building (including tips and troubleshooting):

 

 Scrape Data from Websites with Pagination (Query Strings) (1)

 Scrape Data from Websites with Pagination (Query Strings) (2) - No "Next Button" Found

 Pagination Loop issue - The extraction stops after 3 pages

 Why I couldn’t create a loop for pagination even if I expand the selection area of the element(the Next Page link)?

   

 

Now you’ve configured pagination crawling. Feel free to contact us at support@octoparse.com

 

Download Octoparse and try it now!


 

 

 
