Extract Reviews - Dealing with "Show More" Buttons
Wednesday, December 20, 2017 7:53 AM
Product reviews are important resources for both sellers and buyers. Sellers find out how their products are rated by users, while buyers often spend a lot of time wading through pages of reviews to decide whether a product is worth buying.
Many Octoparse users extract reviews on a daily basis. One of the most frequently asked questions is how to deal with a "load more" button when it must be clicked to reveal the full review content instead of just the first few lines.
This is actually easy to solve in Octoparse: create a loop that clicks those "load more" buttons one by one before extracting the reviews.
Let’s look at an example from Walmart (example URL):
Looking through the reviews on Walmart.com, you can easily spot the “Read More” button that appears right below some of the reviews.
What we need is for the program to click open all the "Read More" buttons first, so that we have the complete version of every review. Then we proceed with an extract action for all the reviews. Follow the steps below:
*Note that the XPath used here applies only to this particular example. Users should find a suitable XPath for each webpage they work with. The selected XPath must be capable of locating all the "Read More" buttons on the page (click here to learn more about XPath).
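To illustrate why the XPath has to match every button rather than a single one, here is a minimal Python sketch. It assumes a simplified, hypothetical review markup (not Walmart's real page structure) and uses the standard library's ElementTree, which supports only a subset of XPath; the class name `read-more` is made up for the example.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified review markup (not Walmart's real page structure).
PAGE = """
<div id="reviews">
  <div class="review"><p>Great value...</p><button class="read-more">Read More</button></div>
  <div class="review"><p>Short review.</p></div>
  <div class="review"><p>Stopped working...</p><button class="read-more">Read More</button></div>
  <div class="review"><p>Love it...</p><button class="read-more">Read More</button></div>
</div>
"""

root = ET.fromstring(PAGE)

# Too narrow: a positional path reaches only the button inside the first review.
narrow = root.findall("./div[1]/button")

# General: matches every "Read More" button anywhere under the root.
general = root.findall(".//button[@class='read-more']")

print(len(narrow), len(general))  # prints "1 3"
```

A path tied to one position would expand only one review; an attribute-based path like the second one is the kind of expression a click loop needs.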
In this way, Octoparse clicks all the "Read More" buttons before extracting the reviews, which ensures that the full content of every review is captured.
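The same click-first, extract-second order can be sketched in Python outside Octoparse. This is only an illustration under stated assumptions: `expand_then_extract` is a hypothetical helper that expects a driver object with a Selenium-style `find_elements(by, value)` method, the two XPath strings are placeholders, and the `Fake*` classes are stand-ins so the sketch runs without a browser.

```python
def expand_then_extract(driver, button_xpath, review_xpath):
    """Click every matched button first, then read the review text."""
    # Step 1: loop over all "Read More" buttons and click them one by one.
    for button in driver.find_elements("xpath", button_xpath):
        button.click()
    # Step 2: only now extract, so each review is in its expanded state.
    return [el.text for el in driver.find_elements("xpath", review_xpath)]


# --- Hypothetical stand-ins so the sketch runs without a browser ---
BUTTON_XPATH = "//button[@class='read-more']"  # placeholder XPath
REVIEW_XPATH = "//div[@class='review']/p"      # placeholder XPath


class FakeReview:
    def __init__(self, preview, full):
        self.preview, self.full, self.expanded = preview, full, False

    @property
    def text(self):
        # Return the full text only after the review has been expanded.
        return self.full if self.expanded else self.preview


class FakeButton:
    def __init__(self, review):
        self.review = review

    def click(self):
        self.review.expanded = True


class FakePage:
    def __init__(self, reviews):
        self.reviews = reviews

    def find_elements(self, by, value):
        # Only truncated reviews have a "Read More" button.
        if value == BUTTON_XPATH:
            return [FakeButton(r) for r in self.reviews if r.preview != r.full]
        return list(self.reviews)


page = FakePage([
    FakeReview("Great value...", "Great value for the price, arrived early."),
    FakeReview("Short review.", "Short review."),
])
reviews = expand_then_extract(page, BUTTON_XPATH, REVIEW_XPATH)
print(reviews)
```

If the extraction ran before the loop, the first review would come back as the truncated "Great value..." preview. On a real page, clicking may also trigger loading delays or re-rendering that this sketch does not handle.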
Author: The Octoparse Team