Scrape a Website that Requires Login and Load Web Content with Ajax

Monday, April 25, 2016 4:08 AM

Welcome to the Octoparse tutorial. Octoparse is a web scraping tool designed for gathering many types of data in bulk. If you don’t have an account yet, please sign up at octoparse.com.

In this tutorial, I’m going to show you how to scrape a website that requires you to log in first.

Some websites require you to log in with a user account and password before you can scrape their data. Octoparse supports scraping data from websites that require authentication. To scrape data from such a website with Octoparse, follow the steps below.


Step 1 Set Basic Information

First of all, enter the task name and save your task to a category. Then click “Next” to go to the second step.



Step 2 Design Workflow

Open the login page URL in the built-in browser.


Click the first textbox. Choose "Enter text value". 


Type in your account in the input box under Customize Current Action. Click "save".


Click the second textbox and select "Enter text value" again.


Type in your password and click "save".


Then click the sign-in button and select "click an item".
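For readers who want to see what these login steps amount to outside the built-in browser, here is a minimal Python sketch. It is not how Octoparse works internally. It assumes the site uses a plain HTML form, and the URL and the field names `username` and `password` are hypothetical placeholders that you would replace with the real names from the login form.

```python
from http.cookiejar import CookieJar
from urllib.parse import urlencode
from urllib.request import HTTPCookieProcessor, Request, build_opener


def build_login_request(login_url, account, password,
                        account_field="username", password_field="password"):
    """Build the POST request that filling the two textboxes and clicking
    the sign-in button would send. The field names are assumptions."""
    form = urlencode({account_field: account, password_field: password})
    return Request(login_url, data=form.encode("utf-8"), method="POST")


def make_session():
    """Return an opener that keeps cookies, so pages requested after the
    login are fetched as the signed-in user."""
    return build_opener(HTTPCookieProcessor(CookieJar()))


# Hypothetical usage (example.com is a placeholder, nothing is sent here):
request = build_login_request("https://example.com/login",
                              "me@example.com", "s3cret")
session = make_session()
# session.open(request) would submit the form; later session.open(url)
# calls would reuse the login cookies.
```

Keeping the cookies in one session object plays the same role as doing the login and the extraction inside one Octoparse workflow.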


Sometimes the elements on a web page are not all loaded at the same time, and some elements load later through Ajax. If the item you want to click has not loaded yet when the page stops loading, you need to set an Ajax timeout.

Choose "load page with AJAX", then set the Ajax timeout. Click "save".
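The idea behind an Ajax timeout can be illustrated with a small polling helper: keep checking whether the element is there, and give up once the time limit passes. This is only a sketch of the general technique, not Octoparse's implementation. The name `wait_for` and its parameters are made up for this example.

```python
import time


def wait_for(condition, timeout=10.0, interval=0.1):
    """Call `condition` repeatedly until it returns a truthy value.

    Returns that value, or raises TimeoutError if `timeout` seconds pass
    first. The condition is always checked at least once.
    """
    deadline = time.monotonic() + timeout
    while True:
        result = condition()
        if result:
            return result
        if time.monotonic() >= deadline:
            raise TimeoutError(f"condition not met within {timeout} seconds")
        time.sleep(interval)


# Stand-in for "is the element loaded yet?": becomes true on the third check.
checks = {"count": 0}


def element_loaded():
    checks["count"] += 1
    return checks["count"] >= 3


wait_for(element_loaded, timeout=2.0, interval=0.05)
```

A longer timeout makes the task more tolerant of slow pages but also slower when an element never appears, which is the same trade-off you make when choosing the Ajax timeout value.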


Then you can start to scrape data. Once you have finished configuring the extraction rule, click "Next".


Step 3 

You can choose not to load images to speed up the extraction, but this may cause problems on certain websites. Then click “Next”.


Step 4 

Now the task is complete! Choose Local Extraction to run the task on your computer.


The extracted data will be shown in this pane, and we can also see the configured rule of the task. You can also check the built-in browser to see whether the task runs as expected.

Export the results to an Excel file or another format, and save the file to your computer.
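If you later want to post-process extracted rows yourself, the export step can be approximated in a few lines of Python. This is a rough sketch using the standard `csv` module, not Octoparse's exporter, and the column names `title` and `price` are invented sample data.

```python
import csv
import io


def rows_to_csv(rows, fieldnames):
    """Serialize a list of dict rows to CSV text with a header line."""
    buffer = io.StringIO(newline="")
    writer = csv.DictWriter(buffer, fieldnames=fieldnames)
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


# Invented sample rows standing in for extracted data.
sample = [
    {"title": "Room A, city center", "price": "80"},
    {"title": "Room B", "price": "65"},
]
csv_text = rows_to_csv(sample, ["title", "price"])

# To save to disk, open the file with newline="" so the csv module
# controls the line endings:
# with open("results.csv", "w", newline="", encoding="utf-8") as f:
#     f.write(csv_text)
```

Values that contain commas are quoted automatically, so the file opens cleanly in Excel and other spreadsheet tools.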


You’ve seen how Octoparse extracts data from a website quickly and effectively.

You’re ready to extract data from Airbnb. Try it now!