Scrape Real Estate Data (Example: www.realtor.com)
Monday, May 30, 2016 7:19 AM
Welcome to this Octoparse tutorial. In this video, I’m going to show you how to scrape real estate data from www.realtor.com, an online real estate listings site in the US. As a real estate agent, you might need to gather this kind of information for yourself. So let’s get started!
After setting the basic information for your task, click “Next”.
Then open the page you need to scrape data from in the browser.
For instance, the Apartment for Rent category in San Francisco.
Scroll down the page to the bottom.
To scrape across multiple pages, you need to create a “Loop to keep clicking next page”. When you click on the “Next page” button here, you can’t find the “Loop click next page” option as usual.
In this case, you need to configure the pagination scraping rule in another way.
To do this, drop a “Loop” item into the Workflow designer. Choose a “Loop Mode” under “Advanced Options”. Select “Single Element Option”.
Make sure you locate the pagination link correctly.
Then copy the XPath of that link and paste it into the text box.
Next, drop a “Click element” action into the “Loop item”.
Choose “Click Loop items” under “Advanced Options”.
Now you’ve configured pagination crawling.
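For readers who want to see the idea behind this rule outside the tool, here is a minimal Python sketch of a "keep clicking next page" loop. It is not how Octoparse works internally. The in-memory pages, the `class="next"` attribute, and the names `PAGES`, `NEXT_XPATH`, and `crawl` are all assumptions made up for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical in-memory "site": each page is a small well-formed document.
PAGES = {
    "/page1": '<html><body><a class="listing">A</a><a class="next" href="/page2">Next</a></body></html>',
    "/page2": '<html><body><a class="listing">B</a><a class="next" href="/page3">Next</a></body></html>',
    "/page3": '<html><body><a class="listing">C</a></body></html>',
}

# Assumed XPath of the pagination link (the single element the loop keeps clicking).
NEXT_XPATH = ".//a[@class='next']"

def crawl(start_url, pages=PAGES, max_pages=100):
    """Follow the next-page link until it no longer appears on the page."""
    visited = []
    url = start_url
    while url is not None and len(visited) < max_pages:
        visited.append(url)
        root = ET.fromstring(pages[url])
        next_link = root.find(NEXT_XPATH)
        # The loop ends when the pagination link cannot be located.
        url = next_link.get("href") if next_link is not None else None
    return visited

print(crawl("/page1"))
```

The `max_pages` guard is a design choice in this sketch so that a page that always links to itself cannot loop forever.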
Then create a list of items as usual.
Click on the first title > Select “Create a list of items” > Add current item to the list > Continue to edit the list.
Click on the second title > Add current item to the list > Finish creating list > Click Loop to process
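One way to picture why clicking two titles is enough to define a list: the two example elements usually differ only by a position index, and dropping that index gives a pattern that matches every item. The Python sketch below illustrates that idea only. It is an assumption about the general technique, not Octoparse's actual algorithm, and the function name `generalize_xpath` and the example paths are hypothetical.

```python
import re

def generalize_xpath(first, second):
    """Drop the positional index from the steps where two example XPaths differ."""
    steps_a = first.strip("/").split("/")
    steps_b = second.strip("/").split("/")
    if len(steps_a) != len(steps_b):
        raise ValueError("example paths must have the same depth")
    pattern = []
    for a, b in zip(steps_a, steps_b):
        if a == b:
            pattern.append(a)
            continue
        # Steps differ: they should be the same tag with different indexes.
        name_a = re.sub(r"\[\d+\]$", "", a)
        name_b = re.sub(r"\[\d+\]$", "", b)
        if name_a != name_b:
            raise ValueError("example paths differ in more than a position index")
        pattern.append(name_a)
    return "/" + "/".join(pattern)

# Hypothetical paths of the first and second title that were clicked.
print(generalize_xpath("/html/body/ul/li[1]/a", "/html/body/ul/li[2]/a"))
```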
Now you’re on the detail page.
Then start scraping the data you need. Click on the name and choose “Extract text”.
You can also extract other information in this way.
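As a rough picture of what “Extract text” does for each field, here is a small Python sketch that pulls the text of a few elements from a detail page. The markup, the class names, and the names `DETAIL_PAGE`, `FIELDS`, and `extract_text` are invented for illustration and do not reflect the real structure of www.realtor.com.

```python
import xml.etree.ElementTree as ET

# Hypothetical detail page; real pages are far larger and may not be well-formed XML.
DETAIL_PAGE = """<div>
  <h1 class="name">  Sample Apartment </h1>
  <span class="price">$2,500</span>
  <span class="address">123 Example St, San Francisco, CA</span>
</div>"""

# One assumed XPath per output field.
FIELDS = {
    "name": ".//h1[@class='name']",
    "price": ".//span[@class='price']",
    "address": ".//span[@class='address']",
}

def extract_text(markup, fields=FIELDS):
    """Return the trimmed text of each configured element, or "" if it is missing."""
    root = ET.fromstring(markup)
    record = {}
    for field, xpath in fields.items():
        node = root.find(xpath)
        record[field] = "".join(node.itertext()).strip() if node is not None else ""
    return record

print(extract_text(DETAIL_PAGE))
```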
To get rid of the dollar sign, select the field you want to reformat. Click the “Customize Field” button. Choose “Reformat extracted data”. Click “Add step”. Click “Replace strings”. Copy the dollar sign and paste it into the “Replace” box.
Don’t type anything in the “With” box. Click “Calculate”, and the dollar sign will be removed. Then click “OK” and click “done”.
Now there’s no dollar sign in the data you captured.
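The “Replace strings” step amounts to a plain string replacement with an empty “With” value. A minimal Python sketch of the same idea follows. The function names `replace_strings` and `reformat` are hypothetical, and the list of steps only imitates the idea of adding reformat steps one after another.

```python
def replace_strings(value, replace, with_=""):
    """Replace every occurrence of `replace`; an empty `with_` simply removes it."""
    return value.replace(replace, with_)

def reformat(value, steps):
    """Apply each reformat step in order, like a list of added steps."""
    for step in steps:
        value = step(value)
    return value

# One step: remove the dollar sign, leaving the "With" value empty.
steps = [lambda v: replace_strings(v, "$")]
cleaned = reformat("$2,500", steps)
print(cleaned)
```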
Once you have finished configuring the scraping rule, click “Next”.
Now choose “Local extraction” to run the task on your computer.
If you need to scrape a large amount of data, choose “Cloud Extraction” to run your task in the cloud.
The extracted data will be shown in this pane, and we can also see the configured rule of the task. You can also check the built-in browser to see if the task runs as expected.
Export the results to an Excel file or another format, such as a database, and save the file to your computer.
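If you ever need to produce a similar file yourself, the sketch below writes scraped records as CSV text, which Excel can open. It uses only Python's standard `csv` module. The function name `to_csv` and the sample records are assumptions for illustration and have nothing to do with Octoparse's own export feature.

```python
import csv
import io

def to_csv(records):
    """Serialize a list of dicts to CSV text, using the first record's keys as the header."""
    if not records:
        return ""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=list(records[0].keys()))
    writer.writeheader()
    writer.writerows(records)
    return buffer.getvalue()

# Hypothetical records, shaped like the fields extracted earlier.
rows = [
    {"name": "Sample Apartment", "price": "2,500"},
    {"name": "Another Apartment", "price": "3,100"},
]
print(to_csv(rows))
```

To save the result to disk, write the returned text to a file opened with `newline=""` so that the line endings produced by the `csv` module are kept as they are.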
You now know how to collect real estate data. Download Octoparse and try it now!
If you like this video, please give it a thumbs up and subscribe to our channel.
If this video tutorial is not available for you, you can click here to see the corresponding graphic tutorial.