Web Scraping Tutorial: Scrape Websites That Require Login

Wednesday, September 28, 2016 6:13 AM

Often you need to log in before you can access the data you want. In this tutorial, I will use eBay as an example to show you how to scrape websites that require login.

 

Step 1: Navigate to the target URL

  • Enter the URL into the built-in browser (URL for the example: https://signin.ebay.com/ws/eBayISAPI.dll?SignIn&ru=http%3A%2F%2Fwww.ebay.com%2F)
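
If you are curious what the example sign-in URL contains, the short Python sketch below splits it with the standard library. Reading `ru` as the address eBay returns to after sign-in is an assumption based on the parameter's value; the tutorial itself does not say so.

```python
from urllib.parse import urlsplit, parse_qs

SIGN_IN_URL = (
    "https://signin.ebay.com/ws/eBayISAPI.dll"
    "?SignIn&ru=http%3A%2F%2Fwww.ebay.com%2F"
)

def describe_sign_in_url(url):
    """Return the host, path and decoded query parameters of a URL."""
    parts = urlsplit(url)
    # parse_qs percent-decodes the values; the bare "SignIn" flag has no
    # value, so it is dropped by default.
    params = {key: values[0] for key, values in parse_qs(parts.query).items()}
    return {"host": parts.netloc, "path": parts.path, "params": params}

info = describe_sign_in_url(SIGN_IN_URL)
print(info["host"])
print(info["params"]["ru"])  # assumed to be the page shown after sign-in
```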

 

Step 2: Enter username and password

  • Click anywhere in the text box for email/username and, when prompted, select “Enter text value”
  • Type your account information into the “Enter text” box
  • Click “Save” (your account information now appears in the text box on the webpage)
  • Enter the password by following the same steps

 

 

 

Step 3: Sign in

  • Click “Sign in” and, when prompted, select “Click an item”.

(You are now logged in and can proceed to scrape the data you need.)
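
For comparison, here is roughly what Steps 2 and 3 look like when written by hand in Python instead of clicked through in Octoparse. This is a minimal sketch and not how Octoparse works internally: the login URL, the form field names `userid` and `pass`, and the credentials are all hypothetical placeholders, and many real sites add hidden form fields or other checks that this sketch ignores.

```python
from http.cookiejar import CookieJar
from urllib.parse import urlencode
from urllib.request import HTTPCookieProcessor, Request, build_opener

def make_session():
    """Create an opener that stores cookies, so a login persists across requests."""
    jar = CookieJar()
    opener = build_opener(HTTPCookieProcessor(jar))
    return opener, jar

def build_login_request(login_url, username, password):
    """Build (but do not send) a form POST carrying the credentials.

    The field names "userid" and "pass" are hypothetical; inspect the real
    login form to find the names the site expects.
    """
    form = urlencode({"userid": username, "pass": password}).encode("utf-8")
    return Request(login_url, data=form, method="POST")

opener, jar = make_session()
request = build_login_request("https://example.com/login", "me@example.com", "secret")

# Sending is left commented out so the sketch runs without network access:
# response = opener.open(request)
# After a successful login, later opener.open(...) calls reuse the cookies in `jar`.
```

The cookie jar plays the role of the logged-in browser session: once the site has set its session cookies, every later request made through the same opener is treated as coming from the signed-in user.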

 

 

Step 4: Create a list for the items to be extracted

  • Click on the first item of the list and, when prompted, select "Create a list of items"
  • Select "Add current item to the list"

(The first item has now been added to the list. Next, we need to add the remaining items.)

  • Click "Continue to edit the list"
  • Click on the second item with a similar layout
  • Select "Add current item to the list"

(Now you should have all items added to the list)

  • Click "Finish Creating List"
  • Select "loop" to have Octoparse click on each item of the list one by one

(The detail page for the first item in the list is now open, so we can proceed to extract the detailed information about that item)

  • Click on the desired text and, when prompted, select "Extract Text"
  • Repeat for every piece of data you need
  • Rename the fields if necessary
  • Click "Save"

To recap: click the first item ➜ click "Create a list of items" (sections with similar layout) ➜ click "Add current item to the list".

The first item has now been added to the list. ➜ Click "Continue to edit the list".

Click the second item ➜ click "Add current item to the list" again. Now the list holds only 4 of the items on the page. ➜ Click "Continue to edit the list" ➜ click the last item ➜ click "Add current item to the list" again. Now the list holds all the items on the page.

Then click "Finish Creating List" ➜ click "loop" to process the list and extract the elements on each page.
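
The idea behind "Create a list of items" and "loop" (collect every element that shares a layout, then visit them one by one) can be sketched in plain Python. The sample HTML, the `item-link` class name and the `ItemLinkParser` helper below are all made up for illustration; Octoparse's own matching logic is not documented in this tutorial.

```python
from html.parser import HTMLParser

class ItemLinkParser(HTMLParser):
    """Collect the href and text of every <a> tag that has a given class."""

    def __init__(self, css_class):
        super().__init__()
        self.css_class = css_class
        self.items = []        # finished items: {"url": ..., "title": ...}
        self._current = None   # item being read, or None outside a match

    def handle_starttag(self, tag, attrs):
        attr_map = dict(attrs)
        classes = (attr_map.get("class") or "").split()
        if tag == "a" and self.css_class in classes:
            self._current = {"url": attr_map.get("href"), "title": ""}

    def handle_data(self, data):
        if self._current is not None:
            self._current["title"] += data

    def handle_endtag(self, tag):
        if tag == "a" and self._current is not None:
            self._current["title"] = self._current["title"].strip()
            self.items.append(self._current)
            self._current = None

# Hypothetical listing page: two items with the same layout plus one other link.
SAMPLE_PAGE = """
<ul>
  <li><a class="item-link" href="/itm/1">Red lamp</a></li>
  <li><a class="item-link" href="/itm/2">Blue chair</a></li>
  <li><a class="nav" href="/help">Help</a></li>
</ul>
"""

parser = ItemLinkParser("item-link")
parser.feed(SAMPLE_PAGE)
for item in parser.items:   # the "loop" step: handle each item in turn
    print(item["url"], item["title"])
```

In a real scraper, the body of the loop would fetch each item's detail page and pull out the fields you selected with "Extract Text".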

 

Step 5: Start running your task

  • Click “Next”
  • Select “Local Extraction”
  • Click “OK” to run the task on your computer.

(Octoparse will automatically extract all the data selected. Check the "Data Extracted" pane for the extraction progress)

  • Click “Export” to export the extracted data to a format of your choice, or to a database
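
To give a feel for what an export produces, the sketch below writes a few extracted rows to CSV with Python's standard library. CSV is used here only as a common example format; the field names and sample rows are invented, and `export_rows` is a hypothetical helper, not part of Octoparse.

```python
import csv
import io

def export_rows(rows, fieldnames, stream):
    """Write a list of dicts to `stream` as CSV with a header row."""
    writer = csv.DictWriter(stream, fieldnames=fieldnames)
    writer.writeheader()
    writer.writerows(rows)

# Made-up rows standing in for the data extracted in Step 4.
rows = [
    {"title": "Red lamp", "price": "12.50"},
    {"title": "Blue chair", "price": "40.00"},
]

# StringIO stands in for a real file such as open("items.csv", "w", newline="").
buffer = io.StringIO()
export_rows(rows, ["title", "price"], buffer)
csv_text = buffer.getvalue()
print(csv_text)
```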

 

 

 

Author: The Octoparse Team


 
