Scrape hotel data from Booking

Thursday, September 20, 2018

In this tutorial, we will show you how to scrape hotel information from Booking. Details such as price, address, rating, and images can be collected by creating a scraping task in Octoparse. Property owners can capture customer reviews and ratings to work out how to serve customers better, and they can learn more about competing hotels in a given price range.

Specifically, we search for "New York, New York State, USA", "Sat, Dec 1, 2018 - Sun, Dec 2, 2018", and "1 adult 0 children", then use the resulting URL for scraping.
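
To make the idea of a "result URL" concrete, here is a minimal Python sketch of how a search URL can carry the destination, dates, and guest count as query parameters. The base URL and the parameter names (`ss`, `checkin`, `checkout`, `group_adults`, `group_children`) are assumptions for illustration only and are not taken from this tutorial. In practice, run the search in your browser and copy the URL from the address bar.

```python
from urllib.parse import urlencode

# Assumed base URL and parameter names, for illustration only.
# The real search URL may use different names; copy it from your browser.
BASE_URL = "https://www.booking.com/searchresults.html"


def build_search_url(destination, checkin, checkout, adults, children):
    """Encode the search settings as a query string appended to BASE_URL."""
    params = {
        "ss": destination,           # assumed name for the destination text
        "checkin": checkin,          # assumed ISO date format
        "checkout": checkout,
        "group_adults": adults,
        "group_children": children,
    }
    return BASE_URL + "?" + urlencode(params)


search_url = build_search_url(
    "New York, New York State, USA", "2018-12-01", "2018-12-02", 1, 0
)
print(search_url)
```

Changing the dates or destination only changes the query string, which is why the same task can often be reused with a different URL.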

Tips!

You should adjust these settings to your own needs, since dates and locations vary. In addition, the structure and layout of Booking.com may differ depending on your IP address and preferred language.

 

Here are the main steps in this tutorial: [Download demo task file here]

1) Go To Web Page - to open the targeted web page

2) Create a pagination loop - to scrape all the results from multiple pages

3) Create a "Loop Item" - to click into each item on each page

4) Extract data - to select the data for extraction

5) Start extraction - to run the task and get data

1) Go To Web Page - to open the targeted web page

   · Create a task with "Advanced Mode"

     Advanced Mode supports flexible configuration and complex websites.

   · Paste the URL into the "Extraction URL" box and click "Save URL" to move on.

2) Create a pagination loop - to scrape all the results from multiple pages

   · Open "Workflow" in the top-right corner of Octoparse

     We strongly suggest turning on "Workflow" mode to get a better view of what you are doing in your task, in case you make a mistake in the steps.

   · Scroll down to the bottom and click ">" (">" is the pagination button on this website)

   · Click "Loop click the selected link" on the "Action Tips" panel
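
Conceptually, a pagination loop keeps following the "next page" button until there is none left. The Python sketch below illustrates that idea only; it is not how Octoparse is implemented. The `FAKE_SITE` dictionary and the `paginate` helper are hypothetical stand-ins for real result pages.

```python
# Hypothetical in-memory stand-in for three result pages.
# "next" plays the role of the ">" pagination button.
FAKE_SITE = {
    1: {"hotels": ["Hotel A", "Hotel B"], "next": 2},
    2: {"hotels": ["Hotel C"], "next": 3},
    3: {"hotels": ["Hotel D"], "next": None},  # last page: no ">" button
}


def paginate(first_page_id, fetch_page):
    """Yield each results page, following "next" until it runs out."""
    page_id = first_page_id
    seen = set()  # guard against a site that links back to an earlier page
    while page_id is not None and page_id not in seen:
        seen.add(page_id)
        page = fetch_page(page_id)
        yield page
        page_id = page.get("next")


pages = list(paginate(1, FAKE_SITE.__getitem__))
all_hotels = [name for page in pages for name in page["hotels"]]
print(all_hotels)
```

Every later step in the task runs once per page inside this loop, which is why the pagination loop is created first.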

3) Create a "Loop Item" - to click into each item on each page

   · Click the name of the first hotel in the list

   · Click "Select all" on the "Action Tips" panel
     Octoparse will automatically select all similar elements on the current page (the hotel names, which link to the detail pages).

   · Click "Loop click each element" to create a loop item
     When the task runs, Octoparse will click through each selected hotel link on the current page.
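
The "select all similar elements" step can be pictured as collecting every link that shares the same markup as the first hotel name. Below is a rough Python sketch of that idea using only the standard library. The sample HTML and the `hotel-name` class are made up for illustration; the real page markup will differ, and Octoparse's own matching logic is not shown here.

```python
from html.parser import HTMLParser


class HotelLinkCollector(HTMLParser):
    """Collect the href of every <a> tag that carries a given CSS class."""

    def __init__(self, css_class):
        super().__init__()
        self.css_class = css_class
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attr = dict(attrs)
        classes = (attr.get("class") or "").split()
        if self.css_class in classes and attr.get("href"):
            self.links.append(attr["href"])


# Made-up list page: two hotel links and one unrelated link.
SAMPLE_LIST_PAGE = """
<ul>
  <li><a class="hotel-name" href="/hotel/a.html">Hotel A</a></li>
  <li><a class="hotel-name" href="/hotel/b.html">Hotel B</a></li>
  <li><a class="ad" href="/promo.html">Special offer</a></li>
</ul>
"""

collector = HotelLinkCollector("hotel-name")
collector.feed(SAMPLE_LIST_PAGE)
hotel_links = collector.links
print(hotel_links)
```

The loop item then visits each collected link in turn, just as the tool clicks through each hotel on the page.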

4) Extract data - to select the data for extraction

After you click "Loop click each element", Octoparse will click the link and open the detail page.

   · Click the data you need on the page, such as the hotel name, rating, and address, then click "Extract text of the selected element"

   · Rename the field if needed by typing a new field name
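
As an illustration of what "extract text of the selected element" and renaming a field amount to, here is a small Python sketch. The sample detail page, the class names (`name`, `address`, `score`), and the output field names are all hypothetical. The sketch also assumes each selected element contains only text, with no nested tags.

```python
from html.parser import HTMLParser

# Hypothetical class names on the detail page, mapped to the field names we want.
CLASS_TO_FIELD = {"name": "hotel_name", "address": "address", "score": "rating"}


class FieldExtractor(HTMLParser):
    """Capture the text of elements whose class appears in class_to_field."""

    def __init__(self, class_to_field):
        super().__init__()
        self.class_to_field = class_to_field
        self.record = {}
        self._current_field = None  # field being captured, if any

    def handle_starttag(self, tag, attrs):
        classes = (dict(attrs).get("class") or "").split()
        for css_class in classes:
            if css_class in self.class_to_field:
                self._current_field = self.class_to_field[css_class]
                self.record[self._current_field] = ""
                return

    def handle_endtag(self, tag):
        # Assumes flat elements: any closing tag ends the current capture.
        self._current_field = None

    def handle_data(self, data):
        if self._current_field is not None:
            self.record[self._current_field] += data


def extract_fields(html, class_to_field):
    parser = FieldExtractor(class_to_field)
    parser.feed(html)
    parser.close()
    return {field: text.strip() for field, text in parser.record.items()}


SAMPLE_DETAIL_PAGE = """
<h2 class="name">Hotel A</h2>
<p class="address"> 1 Main St, New York </p>
<span class="score">8.7</span>
"""

record = extract_fields(SAMPLE_DETAIL_PAGE, CLASS_TO_FIELD)
print(record)
```

The mapping from class name to field name plays the same role as renaming a field in the tool: the extracted text stays the same, only its column label changes.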

5) Start extraction - to run the task and get data

   · Click "Start Extraction" and select "Local Extraction" to start execution

     Data will be automatically extracted by Octoparse.

   · When the task is completed, you can export the extracted data for further analysis.
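
To show what an exported result might look like, the Python sketch below writes a few records to CSV text. The records and field names are invented sample data, not real output from this task, and the `records_to_csv` helper is a hypothetical name.

```python
import csv
import io


def records_to_csv(records, fieldnames):
    """Serialize a list of dicts to CSV text with a header row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    writer.writerows(records)
    return buffer.getvalue()


# Invented sample records standing in for extracted hotel data.
sample_records = [
    {"hotel_name": "Hotel A", "address": "1 Main St, New York", "rating": "8.7"},
    {"hotel_name": "Hotel B", "address": "2 Side Ave, New York", "rating": "9.1"},
]

csv_text = records_to_csv(sample_records, ["hotel_name", "address", "rating"])
print(csv_text)
```

Note that the addresses contain commas, so the CSV writer wraps them in quotes to keep each value in a single column.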

 

