
Scrape Emails from Facebook Pages

Wednesday, September 28, 2016 9:05 AM


 

Facebook is one of the largest data "treasure troves" on the internet. However, the sheer amount of information on a single page can be overwhelming, which makes it hard to find exactly what you want. Suppose we want to scrape email addresses from Facebook account pages. How can we do that effectively with Octoparse? In this tutorial, we will show you how to quickly extract email addresses using the built-in RegEx tool.

 

Step 1. Create a task with a sample URL

Every workflow in Octoparse starts by giving Octoparse the URL of the web page to begin with.

 

Step 2. Extract email addresses

Once you are taken to the target webpage, you will notice the email address on the page. Of course, we can extract the text directly by clicking on it and selecting Extract text of the selected link. But what if there is too much information and you cannot quickly find the detail you want? In that case, you can use the RegEx tool instead. Follow the next few steps.

  • Click the Go to Web Page action in the workflow
  • Hover over the Data Preview section and click the Add a custom field icon to add a custom field
  • Select Page-level data and then HTML source code

 

Extract HTML source code

 

  • Click on the three dots of the source code data field and select Clean data
  • Click +Add step and select Match with Regular Expression

 

Clean data

 

Tip!

If you know how to write a regular expression, you can write one to match the email address directly. Check out this article to learn more.
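As a rough illustration of what a hand-written expression might look like, here is a minimal Python sketch. The pattern is a deliberately simple assumption and not the expression Octoparse generates; the helper name find_emails and the sample HTML are hypothetical. Real-world email syntax is broader than this pattern covers.

```python
import re

# A simple email-like pattern (an assumption for illustration only):
# local part, "@", domain, a dot, and a top-level domain of 2+ letters.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def find_emails(html):
    """Return unique email-like strings in order of first appearance."""
    seen = []
    for match in EMAIL_RE.findall(html):
        if match not in seen:
            seen.append(match)
    return seen


# Hypothetical snippet of page source, not taken from a real Facebook page.
sample = '<a href="mailto:info@example.com" role="link">info@example.com</a>'
print(find_emails(sample))
```

The address appears twice in the sample (once in the link target and once as link text), so the helper removes duplicates before returning.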

 

  • If you are not sure how to write a regular expression, you can try the built-in RegEx tool
  • The email address we need starts with mailto: and ends with " role

 

Try the RegEx tool

 

  • Click Generate > Match > Apply to save the settings

You can also copy the source code, paste it into a text editor, and search for “@” to locate the email address.
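The "starts with mailto: and ends with " role" rule described above can be approximated outside Octoparse with a short Python sketch. This is only an illustration under assumptions: the exact expression the RegEx tool generates may differ, and the function name extract_mailto and the sample HTML are hypothetical.

```python
import re

# Capture whatever sits between the start marker 'mailto:' and the
# end marker '" role'. The non-greedy (.*?) stops at the first end marker.
MAILTO_RE = re.compile(r'mailto:(.*?)" role')


def extract_mailto(html):
    """Return every string found between 'mailto:' and '" role'."""
    return MAILTO_RE.findall(html)


# Hypothetical snippet of page source, not taken from a real Facebook page.
sample_html = '<a href="mailto:contact@example.com" role="link">Email us</a>'
print(extract_mailto(sample_html))
```

Matching on the surrounding markers rather than on the email syntax itself keeps the expression short, but it depends on the page source keeping the same attribute order.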

 

Step 3. Run the task to get the data

Run the task either on your local machine or in the cloud.

 

Happy Data Hunting!

Author: The Octoparse Team
