Web Crawling Case Study | Scraping ASTA with Pagination (2) - No "Next Button" Found
Wednesday, April 05, 2017 2:44 AM
In the tutorial Scraping from multi-pages: pagination without "Next" button, we have learnt how to flip pages without a "Next" button.
In this tutorial, I will use the ASTA website as an example to show you, step by step, how to scrape data from a website whose pagination has no "Next" button.
List features covered
Now, let's get started!
Step 1. Set up basic information and navigate to the target website
Step 2. Find the pages to scrape
On this website, the search results are not displayed until you click an item to trigger the search yourself.
Step 3. Set up Pagination
Step 4. Modify XPath to locate next page
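The general idea behind this step can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the markup below is made up (it is not ASTA's real HTML), the `current` class name is hypothetical, and `next_page_href` is a helper invented for this example. When there is no "Next" button, the next page is simply the numbered link that follows the one marked as the current page.

```python
import xml.etree.ElementTree as ET

# Hypothetical pagination bar (not ASTA's real markup): numbered links only,
# with the current page marked by class="current" and no "Next" link.
PAGINATION_HTML = """
<ul class="pagination">
  <li><a href="?page=1">1</a></li>
  <li class="current"><a href="?page=2">2</a></li>
  <li><a href="?page=3">3</a></li>
</ul>
"""


def next_page_href(pagination_markup):
    """Return the href of the link right after the current page, or None."""
    items = list(ET.fromstring(pagination_markup))  # the <li> children, in order
    for position, item in enumerate(items):
        if item.get("class") == "current":
            if position + 1 < len(items):
                link = items[position + 1].find("a")
                return link.get("href") if link is not None else None
            return None  # the current page is the last one
    return None  # no current-page marker found


# The same idea as an XPath 1.0 expression, for tools that support the
# following-sibling axis: //li[@class="current"]/following-sibling::li[1]/a
print(next_page_href(PAGINATION_HTML))
```

Because the expression is anchored on the current-page marker rather than on a fixed page number, it keeps pointing at the right link after every page turn.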
Step 5. Click Items in the loop to scrape data with pagination
Now you’ve configured pagination scraping.
Step 6. Create a list of items
Move your cursor over an article that shares its layout with the others; this is the section whose content you want to extract.
If the selection has not been identified properly in the first place.
Now that the first item has been added to the list, we need to finish adding the remaining items.
Now all the sections with a similar layout have been added to the list.
Step 7. Select the data to be extracted and rename the data fields
Step 8. Re-order workflow
Notice that the loop action for data extraction is positioned outside the pagination loop. This doesn’t make sense, since we want to extract the data from each page before turning to the next one. So we need to manually drag the data extraction loop inside the pagination loop and position it right before the “Click to paginate” action in the workflow designer.
Now look at the workflow we created: it extracts the data and turns the page, then loops back to extract and turn the page again, which is exactly what we want.
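The ordering described above can be illustrated with a minimal Python sketch. It is not how Octoparse works internally: the `PAGES` dictionary is a made-up stand-in for the site, and `scrape_all` is a hypothetical helper. It only shows why the extraction loop belongs inside the pagination loop, with the page turn as the last action of each pass.

```python
# Hypothetical in-memory "site": each page number maps to the items on it.
PAGES = {
    1: ["item A", "item B"],
    2: ["item C", "item D"],
    3: ["item E"],
}


def scrape_all(pages):
    """Extract every item from a page before moving on to the next page."""
    rows = []
    page_number = 1
    while page_number in pages:              # pagination loop (outer)
        for item in pages[page_number]:      # extraction loop (inner)
            rows.append((page_number, item))
        page_number += 1                     # "click to paginate" comes last
    return rows


for row in scrape_all(PAGES):
    print(row)
```

If the extraction loop sat outside the pagination loop instead, the pages would all be turned first and only the last page would be extracted.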
Step 9. Start running your task
Octoparse will automatically extract all the data selected. Check the "Data Extracted" pane for the extraction progress.
Step 10. Check the data and export
Now you've learned how to flip through pages to scrape data without a "Next" button. Let’s look into how pagination works with this example.
Author: The Octoparse Team