How to Extract Information from LinkedIn?
Tuesday, April 19, 2016 6:28 AM
In this tutorial, I will use LinkedIn as an example to show you how to extract information and export data using Octoparse.
List features covered
Now, let's get started!
Step 1. Start a new task
Step 2. Design Workflow
Now you are signed in to LinkedIn. Next, drop an “open page” action into the workflow designer and paste the search result page URL into the textbox. (I searched for Microsoft.)
Step 3. Set up pagination
Now that the result page is open in Octoparse, configure the pagination action.
Step 4. Build a list of items
Information such as email and address is on the detail pages, so we need to get into the detail pages by creating a list of items.
Step 5. Extract data
To scrape any piece of information on this page, click it and then select extract text.
Step 6. Rename data fields
All the selected content is listed in Data Fields. Click a "Field Name" to modify it.
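If you would rather rename fields after exporting, the same idea can be sketched in a few lines of Python. This is only an illustration of the renaming step, not Octoparse code: the default names "Field1" and "Field2" and the replacement names are hypothetical.

```python
def rename_fields(rows, mapping):
    """Return new rows with keys renamed per mapping; unmapped keys are kept."""
    return [
        {mapping.get(key, key): value for key, value in row.items()}
        for row in rows
    ]


# Hypothetical default field names and replacements, for illustration only.
rows = [{"Field1": "Example Corp", "Field2": "info@example.com"}]
renamed = rename_fields(rows, {"Field1": "company", "Field2": "email"})
print(renamed)
```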
Next, drag the second "Loop Item" so that it sits before the “Click to paginate” action.
Once you have finished configuring the extraction rule, click “Next”.
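The order of the actions matters: the item loop has to finish on each page before the workflow clicks through to the next one. A rough way to picture it is the nested loop below. This is a conceptual model only, not Octoparse code or its API, and the pages, items and field names are hypothetical in-memory data.

```python
# Conceptual model only: this is not Octoparse code or its API.
# The pages and items below are hypothetical data standing in for a website.

def run_workflow(pages):
    """Visit every page, open every listed item, and collect its fields."""
    rows = []
    for page in pages:                # outer loop: pagination
        for item in page["items"]:    # inner loop: "Loop Item" runs first on each page
            rows.append({             # "Extract data" on the detail page
                "name": item["name"],
                "address": item["address"],
            })
        # Only after every item on this page is done does the loop
        # move on to the next page ("Click to paginate").
    return rows


sample_pages = [
    {"items": [{"name": "A", "address": "1 Main St"},
               {"name": "B", "address": "2 Main St"}]},
    {"items": [{"name": "C", "address": "3 Main St"}]},
]

rows = run_workflow(sample_pages)
print(len(rows), "rows extracted")
```

If the item loop were placed after the pagination click instead, the workflow would leave the page before its items were visited.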
Step 3 Extraction Options
You can choose not to load images to speed up the extraction, but this may cause problems on certain websites. ➜ Click “Next”.
Step 4 Done
Now the task is complete! Choose Local Extraction to run the task on your computer.
The extracted data will be shown in the "Data Extracted" pane. Click the export button to export the results to an Excel file, a database, or another format, and save the file to your computer. You can check the built-in browser to see whether the task runs as expected.
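Once the file is saved, a quick check of the export can catch empty fields early. The sketch below assumes you exported to CSV with a header row; the file name and the field names "company" and "email" are hypothetical, so adjust them to match your own task.

```python
import csv


def load_export(path):
    """Read an exported CSV file into a list of dicts keyed by field name."""
    # "utf-8-sig" reads UTF-8 files with or without a byte order mark.
    with open(path, newline="", encoding="utf-8-sig") as f:
        return list(csv.DictReader(f))


def count_missing(rows, field):
    """Count rows where the given field is absent or blank."""
    return sum(1 for row in rows if not (row.get(field) or "").strip())
```

For example, `count_missing(load_export("linkedin_results.csv"), "email")` would report how many rows came back without an email, which is a hint that the extraction rule for that field needs another look.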
Happy Data Hunting!
Author: The Octoparse Team