
Extract Information from LinkedIn Public Data 2

Monday, June 27, 2016 6:35 AM


 

 

In this tutorial, we will show you how to scrape posts from LinkedIn.com. To follow along, you can use this example URL: https://www.linkedin.com/search/results/content/?keywords=google&origin=GLOBAL_SEARCH_HEADER&sid=DIi
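The example URL is an ordinary search URL with a query string, so it can help to see which part carries the search term. The sketch below splits the URL with Python's standard library and rebuilds it with a different keyword. The helper name `with_keyword` is hypothetical, and whether LinkedIn accepts a URL edited this way (in particular with the same `sid` value) is an assumption, not something this tutorial confirms.

```python
from urllib.parse import parse_qs, urlencode, urlsplit, urlunsplit

# The example URL used in this tutorial.
TUTORIAL_URL = (
    "https://www.linkedin.com/search/results/content/"
    "?keywords=google&origin=GLOBAL_SEARCH_HEADER&sid=DIi"
)


def with_keyword(url: str, keyword: str) -> str:
    """Return the same URL with its `keywords` parameter replaced.

    Hypothetical helper for illustration only.
    """
    parts = urlsplit(url)
    params = parse_qs(parts.query)   # each value is a list of strings
    params["keywords"] = [keyword]   # swap in the new search term
    new_query = urlencode(params, doseq=True)
    return urlunsplit(parts._replace(query=new_query))


# Inspect the query parameters of the tutorial URL.
params = parse_qs(urlsplit(TUTORIAL_URL).query)
print(params)

# Build a variant of the URL for a different search term.
print(with_keyword(TUTORIAL_URL, "data science"))
```

The resulting URL could be pasted into the Go to Web Page step in place of the original one.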

 

Here are the main steps in this tutorial:  [Download task file here]

1. Go to Web Page - to open the target webpage

2. Login to the website - to access the data

3. Auto-detect webpage - to create workflow

4. Modify the XPath of Loop Item - to locate more posts

5. Run the task - to get the data

 

1. Go to Web Page - to open the target webpage

  • Paste the URL and click Start

 

2. Login to the website - to access the data

  • Click the Sign in button and choose Click URL to go to the login page

 

Sign_in.jpg

 

 

  • After the login page has loaded, click the Email input box and choose Enter text

Enter_email.jpg

 

 

  • Enter your LinkedIn email address and confirm

 

Confirm_email.jpg

 

 

  • Click the password input box, choose Enter text, enter the password and confirm
  • Click the Sign in button and choose Click button

 

Click_sign_in.jpg

 

 

  • Set the AJAX timeout to 10 seconds

 

AJAX_timeout.jpg

 

3. Auto-detect webpage - to create workflow

  • Select Auto-detect web page data

 

autodetect.jpg

 

 

  • Wait for the detection to complete, then click Edit

 

Edit_scroll.jpg

 

 

  • Click Create workflow

 

mceclip2.png

set_up_scroll_repeats.jpg

  • Go to Data preview. Double-click a field header to rename it, or click ... to delete a field

4. Modify the XPath of Loop Item - to locate more posts

LinkedIn pages have a complicated structure, so the auto-generated XPath does not match every post. We need to update the XPath of the Loop Item manually.

  • Click Loop Item and enter the revised XPath
  • Click Apply to confirm

Modify_loop_Xpath.jpg
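To illustrate why the Loop Item XPath matters, here is a minimal sketch showing how a narrow XPath matches a single item while a broader one matches every item. The markup, class names, and both XPath expressions are hypothetical and do not reflect LinkedIn's actual page structure or the XPath used in this task. Python's `xml.etree.ElementTree` supports only a subset of XPath, which is enough for this illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified markup standing in for a list of posts.
SAMPLE = """
<ul>
  <li class="feed-item"><div class="post"><span>First post</span></div></li>
  <li class="feed-item"><div class="post"><span>Second post</span></div></li>
  <li class="feed-item"><div class="post"><span>Third post</span></div></li>
</ul>
"""

root = ET.fromstring(SAMPLE)

# A narrow XPath tied to one position matches only the first post.
narrow = root.findall("./li[1]/div[@class='post']")

# A broader XPath keyed on the shared class matches every post.
broad = root.findall(".//div[@class='post']")

# Collect the text of each post matched by the broad XPath.
posts = [div.findtext("span") for div in broad]

print(len(narrow), len(broad))
print(posts)
```

A loop built on the narrow expression would visit one post, while the broad expression gives the loop one item per post. This is the kind of difference that updating the Loop Item XPath addresses.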

5. Run the task - to get the data

Note: We don't recommend running LinkedIn tasks in the Cloud, because the website is likely to flag the login as coming from a suspicious IP address.

 

Here is the sample output.

mceclip1.png

 

Happy Data Hunting!

Author: The Octoparse Team
