How to handle pagination with page numbers?
Thursday, August 16, 2018
A “Next” button is not always available; some websites paginate with page numbers instead.
In this case, to extract multiple pages of data, we need to modify the XPath of the “Click to paginate” step so that it always locates the next page number.
(For example, when you are on page #1, the XPath has to locate page #2, so that every click moves on to the following page.)
After clicking page #1 and creating a pagination loop, write a new XPath for the “Click to paginate” action. The XPath axis “following-sibling” is the one used most often here: it selects all the siblings that come after the current node. (Learn more about locating elements with XPath.)
Here’s an example XPath:
Next, replace the auto-generated XPath of the pagination loop with the new XPath.
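To see what a “following-sibling” expression selects, here is a minimal sketch in Python. It is not Octoparse's own code, and the pagination markup, the `active` class name and the `next_page_link` helper are all assumptions made for illustration; real sites mark the current page in different ways. Python's standard library does not support the following-sibling axis, so the sketch emulates it by walking the sibling list, and the XPath you might write for this markup is kept in a constant for reference.

```python
import xml.etree.ElementTree as ET

# Hypothetical pagination markup; the "active" class marks the current page.
PAGINATION_HTML = """
<ul class="pagination">
  <li><a href="?page=1">1</a></li>
  <li class="active"><a href="?page=2">2</a></li>
  <li><a href="?page=3">3</a></li>
</ul>
"""

# Illustrative XPath for this markup: the link inside the first <li>
# that follows the currently active <li>.
NEXT_PAGE_XPATH = '//li[@class="active"]/following-sibling::li[1]/a'


def next_page_link(markup):
    """Return the <a> in the first <li> after the active one, or None."""
    root = ET.fromstring(markup)
    items = list(root)  # the <li> children, in document order
    for index, item in enumerate(items):
        if item.get("class") == "active":
            following = items[index + 1:]  # emulates following-sibling::li
            if following:
                return following[0].find("a")  # emulates li[1]/a
            return None  # active page is the last one
    return None  # no active page found


link = next_page_link(PAGINATION_HTML)
print(link.text if link is not None else "no next page")
```

With the sample markup the active page is 2, so the sketch prints `3`. Because the expression is anchored on the current page rather than on a fixed number, it keeps pointing at the next page after every click, and it matches nothing on the last page, which is what ends the loop.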