How to Extract Information from LinkedIn?

Tuesday, April 19, 2016 6:28 AM

Brief

In this tutorial, I will use LinkedIn as an example to show you how to extract information and export data using Octoparse.

 

Features covered

  • Log in to scrape
  • Export data

 

Now, let's get started!

Step 1. Start a new task  

  • Choose “Advanced Mode” and click “Start”.
  • Complete the basic information.
  • Click “Next” to proceed to extraction setup.

 

Step 2. Design Workflow

  • Sign in to your LinkedIn account in the built-in browser.
  • Click the first textbox
  • Select “Enter text value”

 

  • Type your account name in the input box under “Customize Current Action”
  • Click the “save” button

 

  • Then click the second textbox
  • Select “Enter text value” 

 

  • Type your password in the input box
  • Click “save”

 

  • Click on the sign-in button
  • Select “Click an item”

 

Now you have signed in to LinkedIn. Next, drag an “open page” action into the workflow designer and paste the search result page URL into the textbox. (I searched for Microsoft.)

 

Step 3. Set up pagination

Now that you’ve opened the result page in Octoparse, configure the pagination action.

  • Scroll down to the bottom of the page
  • Click on “Next”
  • Select “Loop click the element”

 

Step 4. Build a list of items 

Information such as email and address is on the detail pages, so we need to open each detail page by creating a list of items.

  • Click on the first name in the results
  • Select “Create a list of items”
  • Select “Add current item to the list”
  • Select “Continue to edit the list”

 

  • Click on the second name
  • Select “Add current item to the list”
  • Select “Finish creating list”
  • Select “Loop”

 

Step 5. Extract data

  • Next, extract the data
  • Click on the name
  • Select “extract text”

You can scrape any other information on this page the same way: click on the data you want, then select “extract text”.

 

Step 6. Rename data fields

All the selected content will appear in Data Fields. Click a "Field Name" to rename it.
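To show what renaming data fields amounts to, here is a minimal Python sketch. It is not how Octoparse works internally; the default names ("Field1", "Field2") and the new names are assumptions made up for this example.

```python
# Hypothetical default field names; the names Octoparse generates may differ.
RENAMES = {"Field1": "name", "Field2": "headline"}


def rename_fields(row, renames):
    """Return a copy of row with keys replaced according to renames.

    Keys that are not listed in renames are kept unchanged.
    """
    return {renames.get(key, key): value for key, value in row.items()}


# Made-up sample row, for illustration only.
row = {"Field1": "Example Person", "Field2": "Example headline"}
print(rename_fields(row, RENAMES))
```

Descriptive field names make the exported columns easier to read later.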

 

Next, drag the second "Loop Item" so that it sits before the “Click to paginate” action.
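The order matters because every item on the current page should be processed before the task moves to the next page. The rough Python sketch below shows that nesting. It uses a made-up in-memory list in place of real result pages, and the function name is hypothetical, not part of Octoparse.

```python
# Made-up stand-in for the search result pages (not real LinkedIn data).
PAGES = [
    ["Profile A", "Profile B"],
    ["Profile C"],
]


def run_workflow(pages):
    """Mimic the workflow nesting: pagination loop outside, item loop inside."""
    rows = []
    if not pages:
        return rows
    page_index = 0
    while True:
        # "Loop Item": handle every listed item on the current page first.
        for name in pages[page_index]:
            rows.append({"name": name, "page": page_index + 1})
        # "Click to paginate": move on only after the item loop has finished.
        if page_index + 1 >= len(pages):
            break  # no further page, so the task ends
        page_index += 1
    return rows


print(run_workflow(PAGES))
```

If the item loop were placed after the pagination click instead, the items on the first page would be skipped.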

 

Once you have finished configuring the extraction rule, click “Next”.

 

Step 7. Extraction Options

You can choose not to load images to speed up the extraction, but this may cause problems on certain websites. Then click “Next”.

 

Step 8. Done

Now the task setup is complete! Choose Local Extraction to run the task on your computer.

The extracted data will be shown in the "Data Extracted" pane. Click the export button to export the results to an Excel file, a database or another format, and save the file to your computer. You can check the built-in browser to see whether the task runs as expected.
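If you want to see roughly what an export produces, the Python sketch below turns a few rows into CSV text using only the standard library. The field names and sample row are assumptions for illustration, and this is not Octoparse's own export code.

```python
import csv
import io


def rows_to_csv(rows, fieldnames):
    """Serialize a list of dict rows to CSV text with a header line."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


# Made-up sample data, standing in for the "Data Extracted" pane.
sample_rows = [{"name": "Example Person", "company": "Microsoft"}]
print(rows_to_csv(sample_rows, ["name", "company"]))
```

A CSV file written this way can be opened directly in Excel.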

 

 

 

 

Happy Data Hunting!

 

 

 

 

 

 

Author: The Octoparse Team

 

 

 
