Scrape hotel data from Booking
Thursday, September 20, 2018
In this tutorial, we are going to show you how to scrape hotel information from Booking. Details such as price, address, rating, and images can be collected by creating a scraping task in Octoparse. Landlords can capture customer reviews and ratings to analyze how to provide better service, and they can learn more about competing hotels in a certain price range.
In this example, we search for "New York, New York State, USA", "Sat, Dec 1, 2018 - Sun, Dec 2, 2018", "1 adult 0 children" and use the resulting URL for scraping.
You should adjust the search to your own needs, since dates and locations vary. Furthermore, the structure and display of Booking.com may differ depending on your IP address and preferred language.
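If you want to reuse one result URL for other dates, you can edit its query string instead of repeating the search by hand. The sketch below shows the idea with Python's standard library. The URL and the parameter names (`city`, `checkin`, `checkout`, `adults`, `children`) are hypothetical placeholders, not Booking.com's real parameters; copy the actual URL from your browser and check which parameters it uses.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def with_params(url: str, **changes: str) -> str:
    """Return `url` with the given query parameters added or replaced."""
    parts = urlsplit(url)
    # Keep the existing parameters in their original order.
    query = dict(parse_qsl(parts.query, keep_blank_values=True))
    query.update(changes)
    return urlunsplit(parts._replace(query=urlencode(query)))


# Hypothetical result URL; the host and parameter names are assumptions.
base = (
    "https://example.com/searchresults"
    "?city=New+York&checkin=2018-12-01&checkout=2018-12-02&adults=1&children=0"
)

# Same search, one week later.
print(with_params(base, checkin="2018-12-08", checkout="2018-12-09"))
```

The edited URL can then be pasted into Octoparse in the same way as the original one.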
Here are the main steps in this tutorial: [Download demo task file here]
1) Go To Web Page - to open the targeted web page
· Create a task with "Advanced Mode"
Advanced Mode supports flexible configuration and complex websites.
· Paste the URL into the "Extraction URL" box and click "Save URL" to move on.
2) Create a pagination loop - to scrape all the results from multiple pages
· Open "Workflow" in the top-right corner in Octoparse
We strongly suggest turning on "Workflow" mode so that you get a better view of what your task is doing, in case you mix up the steps.
· Scroll down to the bottom and click ">" (">" is the pagination button on this website)
· Click "Loop click the selected link" on the "Action Tips" panel
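Conceptually, a pagination loop keeps clicking "next" until there is no next page. The sketch below illustrates that logic in Python; it is not how Octoparse is implemented. The `PAGES` dictionary is a made-up stand-in for the website, and `max_pages` is a safety limit we added for illustration.

```python
from typing import Iterator, Optional

# Hypothetical stand-in for a site: each page lists items and names the next page.
PAGES = {
    "page1": {"items": ["Hotel A", "Hotel B"], "next": "page2"},
    "page2": {"items": ["Hotel C"], "next": "page3"},
    "page3": {"items": ["Hotel D"], "next": None},
}


def paginate(start: str, max_pages: int = 100) -> Iterator[dict]:
    """Yield pages one by one, following "next" until it is missing."""
    current: Optional[str] = start
    visited = 0
    while current is not None and visited < max_pages:
        page = PAGES[current]
        yield page
        visited += 1
        current = page["next"]  # None on the last page ends the loop


all_items = [item for page in paginate("page1") for item in page["items"]]
print(all_items)
```

The `max_pages` limit guards against a site whose "next" button never disappears.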
3) Build a loop item - to click into each item on each page
· Click the name of the first hotel in the list
· Click "Select all" on the "Action Tips" panel
Octoparse will automatically select all similar elements on the current page (the hotel names, which link to the detail pages).
· Click "Loop click each element" to create a loop item
When the task runs, Octoparse will click through each selected hotel link on the current page.
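The idea behind "Select all" can be pictured as collecting every link that shares the same marker as the first one you clicked. Here is a minimal sketch using Python's built-in `html.parser`. The sample HTML and the class name `hotel-name` are assumptions for illustration; the real page markup will differ.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collect (href, text) for every <a> tag that carries a given class."""

    def __init__(self, css_class):
        super().__init__()
        self.css_class = css_class
        self.links = []      # list of (href, link text) tuples
        self._href = None    # href of the link currently being read, if any
        self._text = []

    def handle_starttag(self, tag, attrs):
        attr = dict(attrs)
        classes = (attr.get("class") or "").split()
        if tag == "a" and self.css_class in classes:
            self._href = attr.get("href") or ""
            self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.links.append((self._href, "".join(self._text).strip()))
            self._href = None


# Hypothetical list page; only the links with class "hotel-name" are similar items.
SAMPLE_LIST_PAGE = """
<div><a class="hotel-name" href="/hotel/a.html">Hotel A</a></div>
<div><a class="other" href="/about.html">About</a></div>
<div><a class="hotel-name" href="/hotel/b.html">Hotel B</a></div>
"""

collector = LinkCollector("hotel-name")
collector.feed(SAMPLE_LIST_PAGE)
for href, name in collector.links:
    print(name, "->", href)
```

Each collected link corresponds to one pass of the loop item: open the link, extract the data, then go back to the list.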
4) Extract data - to select the data for extraction
After you click "Loop click each element", Octoparse will click the first link and open the detail page.
· Click the data you need on the page, such as the hotel name, rating, and address, then click "Extract text of the selected element"
· Rename a field if needed by typing a new field name
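Extracting the text of selected elements and giving each one a field name amounts to mapping page elements to named columns. The following sketch shows that mapping in Python. The class names (`hp-name`, `hp-score`, `hp-address`), the field names, and the sample HTML are all hypothetical, and the parser assumes each target element contains plain text only.

```python
from html.parser import HTMLParser

# Hypothetical mapping: CSS class on the detail page -> output field name.
FIELDS = {"hp-name": "hotel_name", "hp-score": "rating", "hp-address": "address"}


class FieldExtractor(HTMLParser):
    """Extract the text of elements whose class appears in `fields`."""

    def __init__(self, fields):
        super().__init__()
        self.fields = fields
        self.record = {}
        self._current = None  # field name being captured, if any

    def handle_starttag(self, tag, attrs):
        for css_class in (dict(attrs).get("class") or "").split():
            if css_class in self.fields:
                self._current = self.fields[css_class]
                self.record[self._current] = ""

    def handle_data(self, data):
        if self._current is not None:
            self.record[self._current] += data

    def handle_endtag(self, tag):
        # Assumes target elements hold text only, so the first end tag closes them.
        if self._current is not None:
            self.record[self._current] = self.record[self._current].strip()
            self._current = None


SAMPLE_DETAIL_PAGE = """
<h2 class="hp-name">Hotel A</h2>
<span class="hp-score">8.7</span>
<p class="hp-address">123 Example St, New York, NY</p>
"""

extractor = FieldExtractor(FIELDS)
extractor.feed(SAMPLE_DETAIL_PAGE)
print(extractor.record)
```

Changing a value in `FIELDS` plays the same role as renaming a field in Octoparse: the extracted data stays the same, only the column name changes.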
5) Start extraction - to run the task and get data
· Click "Start Extraction" and select "Local Extraction" to start execution
Data will be automatically extracted by Octoparse.
· When the task is completed, you can export the extracted data for further analysis.
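As one example of further analysis, you could look at competing hotels within a price range. The sketch below assumes the data was exported as CSV with columns named `hotel_name`, `price`, and `rating`; these names and the sample rows are made up, and your own export will use whatever field names you chose.

```python
import csv
import io
from statistics import mean

# Hypothetical export; real column names depend on the fields in your task.
EXPORTED_CSV = """hotel_name,price,rating
Hotel A,120,8.7
Hotel B,95,7.9
Hotel C,310,9.2
"""


def hotels_in_range(rows, low, high):
    """Return the rows whose price lies between low and high, inclusive."""
    return [row for row in rows if low <= float(row["price"]) <= high]


rows = list(csv.DictReader(io.StringIO(EXPORTED_CSV)))
competitors = hotels_in_range(rows, 90, 150)
average_rating = mean(float(row["rating"]) for row in competitors)

print([row["hotel_name"] for row in competitors])
print(round(average_rating, 2))
```

To run this on a real export, replace `io.StringIO(EXPORTED_CSV)` with an opened file, for example `open("hotels.csv", newline="", encoding="utf-8")`, where `hotels.csv` is a placeholder file name.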