
Scrape real estate data on Realtor.com

Wednesday, September 26, 2018

A realtor's basic duties include maintaining good relationships with customers and following up on cases promptly. To do this well, a realtor needs not only professional knowledge of the real estate industry but also up-to-date market information.

One way to keep up with market changes is to use Octoparse to scrape key data from real estate websites such as realtor.com. With the data collected in one place, realtors can review listings at a glance and follow current market conditions.

In this tutorial, we will show you how to collect data such as location, price, decoration style, and other related information from Realtor. The URL we use in this case is https://www.realtor.com/

 

Here are the main steps in this tutorial: [Download demo task file here]

1) "Go To Web Page" - to open the targeted web page

2) Create a pagination loop - to scrape all the results from multiple pages

3) Create a "Loop Item" - to click into each item on each list in turn

4) Extract data - to select the data for extraction

5) Customize data field by modifying XPath - to improve the accuracy of a certain data field (Optional)

6) Save and start extraction - to run the task and get data

  

 

 

 

 

 

1)  "Go To Web Page" - to open the targeted web page

 · Create a task in "Advanced Mode".

 · Paste the URL into the "Extraction URL" box and click "Save URL" to move on.

  

 

 

 

 

2)  Create a pagination loop - to scrape all the results from multiple pages

 · Click the search bar on the page and enter your search text on the "Action Tips" panel.

 · Click the "Search" button on the page, then choose "Click Button".

 · Scroll down and click ">" (the "next page" button on this page).

 · Click "Loop click single element" on the operation panel; the pagination step will be created automatically in the workflow.
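For readers who want to see the idea behind the pagination step, here is a minimal Python sketch of the same pattern: keep processing the current page, then follow the "next page" link until there is none left. It does not use Octoparse or Realtor at all; the `PAGES` dictionary, the `paginate` function, and the `max_pages` safety limit are hypothetical names invented for illustration.

```python
# Hypothetical in-memory "site": each page holds some listing titles and the
# key of the next page (None when there is no ">" button left to click).
PAGES = {
    "page-1": {"items": ["Listing A", "Listing B"], "next": "page-2"},
    "page-2": {"items": ["Listing C"], "next": "page-3"},
    "page-3": {"items": ["Listing D", "Listing E"], "next": None},
}


def paginate(start, pages, max_pages=100):
    """Yield each page in turn, following "next" until it runs out."""
    current = start
    visited = 0
    # max_pages is a safety limit so a circular "next" link cannot loop forever.
    while current is not None and visited < max_pages:
        page = pages[current]
        yield current, page
        current = page["next"]  # the equivalent of clicking ">"
        visited += 1


visited_pages = [name for name, _ in paginate("page-1", PAGES)]
all_items = [item for _, page in paginate("page-1", PAGES) for item in page["items"]]
print(visited_pages)
print(all_items)
```

The loop ends on its own when the last page has no "next" link, which is the same stopping condition a pagination loop relies on.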

 

 

 

3)  Create a "Loop Item" - to click into each item on each list in turn

 · Click each step in the workflow one by one.

 · Click one item in the list and choose "select all" on the "Action Tips" panel.

 · Click "Loop click each element"; a "Loop Item" will appear in the workflow.

 

 

 

 

4) Extract data - to select the data for extraction

 · Click the information you need on the page

 · Select "Extract data" on the "Action Tips" panel
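Conceptually, the "Loop Item" and "Extract data" steps together produce one row per listing, with one column per selected field. The sketch below illustrates that shape in plain Python. The sample records, the `FIELDS` list, and the `extract_rows` function are hypothetical and only stand in for the detail pages and the fields you pick in Octoparse.

```python
# Hypothetical records standing in for the detail pages opened by "Loop Item".
DETAIL_PAGES = [
    {"location": "Example City A", "price": "$100,000", "style": "Ranch"},
    {"location": "Example City B", "price": "$200,000"},  # no style on this page
]

# The fields chosen with "Extract data" (names are illustrative only).
FIELDS = ["location", "price", "style"]


def extract_rows(detail_pages, fields):
    """Build one row per item; a field missing on a page is left empty."""
    rows = []
    for page in detail_pages:  # visit each listing in turn
        row = {name: page.get(name, "") for name in fields}
        rows.append(row)
    return rows


rows = extract_rows(DETAIL_PAGES, FIELDS)
for row in rows:
    print(row)
```

Leaving a missing field empty, rather than skipping the row, keeps every listing in the output and makes gaps easy to spot later.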

 

 

 

 

 

 

5) Customize data field by modifying XPath - to improve the accuracy of a certain data field (Optional)

 · Rename a field by typing directly into the "Field name" box.

 · If the data is not extracted properly, customize the XPath to ensure you get the precise data you want.

 

For example, to locate the value displayed next to the "Price per sqft" label, you can use this XPath: //SPAN[contains(text(),"Price per sqft")]/following-sibling::SPAN[1]
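To make the logic of this XPath easier to follow, here is a small Python sketch that performs the same two steps by hand: find a span whose text contains the label, then take the first span sibling that follows it. Python's standard `xml.etree.ElementTree` does not support `contains()` or the `following-sibling` axis, so the sketch walks the tree manually instead of evaluating the XPath itself. The markup in `SNIPPET` is hypothetical (the real page structure may differ), tag names are lowercase here, and `value_after_label` is a name made up for this example. Unlike the XPath, which can match several nodes, the function returns only the first match.

```python
import xml.etree.ElementTree as ET

# Hypothetical, well-formed markup; the real page structure may differ.
SNIPPET = """
<ul>
  <li><span>Price per sqft</span><span>$150</span><span>ignored</span></li>
  <li><span>Year built</span><span>1990</span></li>
</ul>
"""


def value_after_label(root, label):
    """Return the text of the first <span> sibling that follows a <span>
    whose own text contains `label`, or None if nothing matches."""
    for parent in root.iter():
        children = list(parent)
        for index, child in enumerate(children):
            if child.tag == "span" and label in (child.text or ""):
                # Equivalent of following-sibling::SPAN[1]: nearest later span.
                for sibling in children[index + 1:]:
                    if sibling.tag == "span":
                        return (sibling.text or "").strip()
    return None


root = ET.fromstring(SNIPPET)
print(value_after_label(root, "Price per sqft"))
```

Anchoring on the label text rather than on the position of the value is what makes this approach robust when listings show their details in a different order.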

 

Tips!

Related tutorials you might need are listed below:

Associate Data with Nearby Text

Locate Element with XPath

Data Fetched to the Incorrect Data Fields

 

 

 

 

6) Save and start extraction - to run the task and get data

 · Save the task, click "Start Extraction", and select "Local Extraction".

 · Export the results you need and start analyzing them for your business.
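As one possible starting point for the analysis, the sketch below reads extracted data and computes an average price with Python's standard library. It assumes the results were saved as CSV, and the column names `location` and `price`, the sample rows, and the `parse_price` helper are all made up for illustration; adjust them to match the field names in your own task.

```python
import csv
import io
import statistics

# Made-up sample standing in for an exported file; column names are hypothetical.
EXPORTED = """location,price
Example City A,"$250,000"
Example City B,"$310,500"
Example City C,
"""


def parse_price(text):
    """Turn a string such as "$250,000" into an integer, or None if empty."""
    digits = text.replace("$", "").replace(",", "").strip()
    return int(digits) if digits else None


reader = csv.DictReader(io.StringIO(EXPORTED))
# Skip rows where no price was extracted instead of counting them as zero.
prices = [p for p in (parse_price(row["price"]) for row in reader) if p is not None]
average_price = statistics.mean(prices)
print(len(prices), average_price)
```

To run this against a real export, replace `io.StringIO(EXPORTED)` with an opened file, for example `open("your_export.csv", newline="", encoding="utf-8")`, where the file name is a placeholder.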

 


 

 

 

 

 
