
Scraping Hotel Reviews from Tripadvisor.com

Friday, December 30, 2016 6:20 AM

 

In this tutorial, we are going to show you how to scrape hotel data from Tripadvisor.

 

 

To scrape Tripadvisor, you can either use our ready-to-use Task Template available on the home page or follow this tutorial to build the task from scratch.

[Screenshot: Tripadvisor task template]

 

 

To demonstrate, we will use this URL as an example: https://www.tripadvisor.com/Hotels-g186338-London_England-Hotels.html

 

Here are the main steps in this tutorial: [Download demo task file here]

1. Open Target Webpage

2. Create Pagination

3. Extract Data from Listing

4. Check the Workflow

5. Run Task and Export Data

 

 

1. Open Target Webpage

  • Paste the URL and click Start
 

 

2. Create Pagination

  • Scroll down to the paging button (Next) and click it, then select Loop click next page. Also adjust the Set AJAX timeout option to 10s

[Screenshot: Set AJAX timeout]
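For readers who want to see the logic behind this step, here is a minimal Python sketch of a "loop click next page" action: keep following the Next link until there is none, allowing each page load a fixed timeout. It is only an illustration and not how Octoparse works internally. The names `load_page`, `FAKE_SITE` and `max_pages`, and the page dictionary layout, are hypothetical; only the 10-second value comes from the step above.

```python
from typing import Callable, Iterator, Optional

AJAX_TIMEOUT_SECONDS = 10  # mirrors the AJAX timeout chosen in this step


def loop_next_page(
    load_page: Callable[[str, int], dict],
    start_url: str,
    max_pages: int = 50,
) -> Iterator[dict]:
    """Yield pages one by one, following each page's "next" link.

    `load_page(url, timeout)` is a hypothetical callable returning a dict
    with an "items" list and an optional "next" URL.
    """
    url: Optional[str] = start_url
    visited = set()
    # Stop when there is no Next link, when a page repeats, or at max_pages.
    while url and url not in visited and len(visited) < max_pages:
        visited.add(url)
        page = load_page(url, AJAX_TIMEOUT_SECONDS)
        yield page
        url = page.get("next")  # None when the page has no Next button


# Tiny in-memory stand-in for a paginated site (made-up data).
FAKE_SITE = {
    "page-1": {"items": ["Hotel A", "Hotel B"], "next": "page-2"},
    "page-2": {"items": ["Hotel C"], "next": "page-3"},
    "page-3": {"items": ["Hotel D"], "next": None},
}


def fake_load_page(url: str, timeout: int) -> dict:
    # A real loader would wait up to `timeout` seconds for the content.
    return FAKE_SITE[url]


all_items = [
    item
    for page in loop_next_page(fake_load_page, "page-1")
    for item in page["items"]
]
print(all_items)
```

The `visited` set and the `max_pages` cap are safety checks we added for the sketch, so the loop cannot run forever if a site links back to an earlier page.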

 

 

 

3. Extract Data from Listing

  • Click any two hotel titles, then select Loop click each URL

[Screenshot: Loop click each URL]

 

  • Click each piece of data you want to extract, then select Extract the text of the element. Repeat until all the data you need has been added

[Screenshot: Extract the text of the element]

 

  • Go to Data Preview and double-click a field name to rename it
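The two ideas in this step, extracting the text of selected elements and then renaming the resulting fields, can be sketched in Python with the standard library's `html.parser`. This is a rough illustration under stated assumptions: the HTML snippet, the class names (`title`, `price`, `rating`) and the new field names are made up for the example and do not reflect Tripadvisor's real markup or Octoparse's internals.

```python
from html.parser import HTMLParser

# Hypothetical markup; a real page uses different class names.
SAMPLE_HTML = """
<div class="hotel">
  <h1 class="title"> Example Hotel </h1>
  <span class="price">$120</span>
  <span class="rating">4.5</span>
</div>
"""


class TextByClass(HTMLParser):
    """Collect the text of every element whose class is in `wanted`.

    Assumes well-formed markup with no unclosed tags (such as <br>)
    inside the captured elements.
    """

    def __init__(self, wanted):
        super().__init__()
        self.wanted = set(wanted)
        self.current = None  # class name of the element being captured
        self.depth = 0       # nesting depth inside the captured element
        self.fields = {}

    def handle_starttag(self, tag, attrs):
        if self.current is not None:
            self.depth += 1
            return
        classes = (dict(attrs).get("class") or "").split()
        for name in classes:
            if name in self.wanted:
                self.current, self.depth = name, 1
                self.fields.setdefault(name, "")
                break

    def handle_endtag(self, tag):
        if self.current is not None:
            self.depth -= 1
            if self.depth == 0:
                self.current = None

    def handle_data(self, data):
        if self.current is not None:
            self.fields[self.current] += data


parser = TextByClass(["title", "price", "rating"])
parser.feed(SAMPLE_HTML)
raw = {name: text.strip() for name, text in parser.fields.items()}

# Rename the extracted fields, similar to renaming them in Data Preview.
RENAMES = {"title": "hotel_name", "price": "hotel_price", "rating": "hotel_rating"}
record = {RENAMES[name]: text for name, text in raw.items()}
print(record)
```

Keeping the renaming in a separate mapping means the extraction code does not change when you decide on different column names.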

 

 

 

4. Check the Workflow

  • Below is what the final workflow looks like. If everything is in place, you can go on to run the task

[Screenshot: final workflow]
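Conceptually, the workflow built in this tutorial is a nested loop: an outer loop over listing pages, an inner loop over the hotel links on each page, and an extraction action inside. The Python sketch below shows that shape with made-up in-memory data. `LISTING_PAGES`, `DETAIL_PAGES` and `run_workflow` are hypothetical names for illustration, not Octoparse features.

```python
# Hypothetical data: each listing page is a list of hotel detail URLs.
LISTING_PAGES = [
    ["/hotel/1", "/hotel/2"],  # listing page 1
    ["/hotel/3"],              # listing page 2
]

# Hypothetical detail pages, keyed by URL.
DETAIL_PAGES = {
    "/hotel/1": {"name": "Hotel A", "rating": "4.5"},
    "/hotel/2": {"name": "Hotel B", "rating": "4.0"},
    "/hotel/3": {"name": "Hotel C", "rating": "3.5"},
}


def run_workflow(listing_pages, detail_pages):
    """Walk every listing page, open every hotel link, collect one row each."""
    rows = []
    for page_number, hotel_urls in enumerate(listing_pages, start=1):  # pagination loop
        for url in hotel_urls:                                         # loop over each URL
            detail = detail_pages[url]                                 # open the detail page
            rows.append({"page": page_number, "url": url, **detail})   # extract data
    return rows


rows = run_workflow(LISTING_PAGES, DETAIL_PAGES)
for row in rows:
    print(row)
```

If the extraction action sat outside the inner loop, only one row per listing page would be collected, which is a common thing to check when a workflow returns fewer rows than expected.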

 

 

5. Run Task and Export Data

  • Start the task from the top right corner: select Run task on your device to run it on your local device, or select Run task in the cloud to run it on the Cloud (for premium users only)

 

  • Here is the sample output -

 

Happy Data Hunting!

Author: The Octoparse Team
