Web Scraping | Scrape Booking Reviews

12/28/2016 9:10:38 PM

 

(picture from www.luxurybackpacker.com)

 

Collecting online customer reviews, including star ratings, comments, likes, dislikes, images, videos, and share channels, helps an online retailer understand whether a product is a good purchase and popular among customers, and adjust its marketing strategy accordingly. Many web scraping tools are available online for collecting this kind of data from websites.

In this article we cover the key points of scraping customer reviews of hotels in Tbilisi City from booking.com with Octoparse. We won’t walk through every step of building the scraping task. If you want to learn how to build such a task, or want to collect other types of customer reviews from booking.com, we offer extraction services to suit your needs. Please contact us via support@octoparse.com.

 

We’ve already built the scraping task, so you can download the .otd file (What is an .otd file?) directly and begin collecting hotel reviews from booking.com. (Download my extraction task of this article HERE just in case you need it.)

An .otd file can only be opened in Octoparse, so download Octoparse before downloading the scraping task.

 

Please click HERE to open the website URL we used.

The data fields include hotel name, hotel address, star rating, customer name and comments posted by the customer.
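As a rough illustration, one exported row with these fields could be modelled like this in Python. The class name, attribute names, and sample values are assumptions made for this sketch; they are not taken from the Octoparse task or from booking.com.

```python
from dataclasses import dataclass, asdict


@dataclass(frozen=True)
class HotelReview:
    """One extracted row. Attribute names are illustrative, not Octoparse's."""
    hotel_name: str
    hotel_address: str
    star_rating: str   # kept as text, since scraped values arrive as strings
    customer_name: str
    comment: str


# Made-up placeholder values, only to show the shape of a record.
example = HotelReview(
    hotel_name="Example Hotel",
    hotel_address="1 Example Street, Tbilisi",
    star_rating="4",
    customer_name="Anonymous",
    comment="Placeholder comment text.",
)

# A plain dict is convenient for writing rows to CSV or JSON later.
row = asdict(example)
```

Keeping every field as text mirrors what a scraper returns; any conversion to numbers would be a separate cleaning step.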

The scraping task we built in Octoparse looks like this.

We will go to the detail page of each hotel and get the reviews under the “Read all trusted reviews” tab.

 

 

Sometimes the actual number of reviews is larger than what is shown on the detail page, so we need to get the reviews from every country listed. To do that, we clicked the plus button to display all the countries in which the reviewers were located.

 

In Octoparse, we create a list of items to extract all the countries. The XPath that Octoparse generates for the loop also matches extra elements on the page, so we need to modify the XPath expression until it selects only the country elements.
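The idea of narrowing an XPath can be sketched outside Octoparse with Python's standard library. The markup below is hypothetical and heavily simplified; it is not the real booking.com page structure, and the class names are invented for the example.

```python
import xml.etree.ElementTree as ET

# Hypothetical markup: a site menu and a country filter both use <li> items.
SAMPLE = """
<div>
  <ul class="site-menu">
    <li>Home</li>
    <li>Help</li>
  </ul>
  <ul class="country-filter">
    <li>Georgia</li>
    <li>Germany</li>
    <li>Turkey</li>
  </ul>
</div>
""".strip()

root = ET.fromstring(SAMPLE)

# Too broad: matches every <li> on the page, including the menu entries.
broad = [li.text for li in root.findall(".//li")]

# Narrowed: only <li> elements directly inside the country filter list.
narrow = [li.text for li in root.findall(".//ul[@class='country-filter']/li")]
```

Here `broad` picks up five items while `narrow` keeps only the three countries. In Octoparse the same kind of change is made by editing the loop's XPath by hand.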

 

When you create a list of items in Octoparse, each element is added by clicking it. However, booking.com selects the first country in the pop-up window by default, so clicking it while building the loop unselects it.

To fix this, we select the first country again by clicking its checkbox, and Octoparse generates a “Click Item” action in the rule.
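The toggle behaviour behind this step can be modelled in a few lines of Python. This is only a simulation under the assumption that each country checkbox flips its state on every click; the country names are placeholders and no real page is involved.

```python
def click(selected, country):
    """Return the new selection after clicking one checkbox (a toggle)."""
    updated = set(selected)
    if country in updated:
        updated.remove(country)   # clicking a checked box unchecks it
    else:
        updated.add(country)      # clicking an unchecked box checks it
    return updated


countries = ["Georgia", "Germany", "Turkey"]  # placeholder names

# Assumption: the site pre-selects the first country in the pop-up.
selected = {countries[0]}

# Clicking the first country while building the loop unselects it.
after_loop_click = click(selected, countries[0])

# The extra "Click Item" action clicks it once more, selecting it again.
fixed = click(after_loop_click, countries[0])
```

After the first click the selection is empty, and the second click restores the first country, which is the effect the extra “Click Item” action is meant to have.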

 

All customer reviews for the hotel are then extracted country by country.

Because some reviews come from anonymous customer accounts, the extraction output will contain duplicate data records. When exporting, you can choose to export only the valid data.
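If you prefer to clean the exported data yourself, duplicates can also be dropped after export. The sketch below assumes each record is a dict with `hotel_name`, `customer_name`, and `comment` keys; those names and the sample rows are invented for illustration and do not come from the Octoparse export.

```python
def drop_duplicates(records):
    """Keep the first occurrence of each (hotel, customer, comment) triple."""
    seen = set()
    unique = []
    for record in records:
        key = (record["hotel_name"], record["customer_name"], record["comment"])
        if key in seen:
            continue  # same review already kept, e.g. seen under another country
        seen.add(key)
        unique.append(record)
    return unique


# Made-up sample rows: the anonymous review appears twice.
records = [
    {"hotel_name": "Example Hotel", "customer_name": "Anonymous", "comment": "Nice stay."},
    {"hotel_name": "Example Hotel", "customer_name": "Nino", "comment": "Great view."},
    {"hotel_name": "Example Hotel", "customer_name": "Anonymous", "comment": "Nice stay."},
]

unique = drop_duplicates(records)
```

Matching on the full triple is a deliberate choice: two different anonymous customers may share a name, so the comment text is part of the key.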

 

 

 

 

 

Author: The Octoparse Team

 

 

 
