How to Extract Data from Facebook
Thursday, March 31, 2016 10:23 PM
Data extracted from Facebook can help you better understand your audience for business or political purposes. You can also collect users' posts, group posts, and comments to carry out sentiment analysis.
With Octoparse, you can easily get post info from Facebook by using Octoparse templates. There is no need to configure scraping tasks.
Just input the keywords/URLs and wait for the data to be scraped. For further details, you may check it out here: Task Templates
You may want to use this URL as an example:
https://www.facebook.com/cnn/
Here are the 5 main steps in this tutorial:
1. Go to Web Page - to open the target website
2. Log into Facebook
3. Auto-detect the web page - to create a workflow
4. Modify the XPath of the "Loop Item"
5. Run your task - to get the data you want
1. Go to Web Page - to open the target website
- Enter the URL on the home page and click "Start"
Octoparse will automatically load the page in the built-in browser and you will find a login page.
2. Log into Facebook
- Toggle on Octoparse's Browse mode
- Fill out the log-in page with your user name and password and click "Log In"
- Toggle off the Browse mode
Tip: If the login steps need to be included in the workflow for the task to run successfully, please follow the tutorial on how to log in to a website in Octoparse.
3. Auto-detect the web page - create a workflow
- Click "Edit" under "Add a page scroll"
- Set the page to scroll to the bottom, repeat 20 times, with a wait time of 5s
- Rename or delete fields in the Data preview if needed
- Click on "Create workflow"
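The scroll settings above amount to a simple loop: scroll to the bottom, wait for new posts to load, and repeat. The sketch below only illustrates that logic; it is not Octoparse's implementation. The names scroll_page, scroll_to_bottom, and sleep are hypothetical, and the scrolling itself is passed in as a callback because it depends on the browser being driven.

```python
import time
from typing import Callable


def scroll_page(
    scroll_to_bottom: Callable[[], None],
    repeats: int = 20,
    wait_seconds: float = 5.0,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Scroll to the bottom `repeats` times, pausing after each scroll.

    `scroll_to_bottom` is a hypothetical callback supplied by the caller.
    Returns the number of scrolls performed.
    """
    performed = 0
    for _ in range(repeats):
        scroll_to_bottom()   # trigger loading of the next batch of posts
        sleep(wait_seconds)  # give the page time to load new content
        performed += 1
    return performed
```

With the values used in this tutorial (20 repeats, 5s wait), the waits alone add up to 100 seconds per page, which is worth keeping in mind when estimating how long a run will take.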
4. Modify the XPath of the "Loop Item"
- Click on the "Loop Item" action
- Make sure the loop mode is set to "Variable List"
- Enter the XPath: //div[@role="article"][not(@aria-label="Comment")]/../..
- Click "Apply" to save the settings.
Tip: XPath plays an important role in locating the correct elements in Octoparse. You can check the XPath tutorial to learn more about it.
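To see what the XPath in step 4 selects, it can help to try the same logic on a small sample. The XPath matches every div with role="article" whose aria-label is not "Comment", then steps up two levels (/../..) to the container wrapping each post. The sketch below emulates that selection in Python with the standard library, whose XPath support does not include not(), so the filter is written out by hand. The sample markup and the select_loop_items helper are assumptions made for illustration; real Facebook pages are far more complex and change often.

```python
import xml.etree.ElementTree as ET

# Hypothetical, heavily simplified markup (not real Facebook HTML).
SAMPLE = """
<div id="feed">
  <div id="post-1"><div><div role="article" aria-label="Post">First post</div></div></div>
  <div id="post-2"><div><div role="article">Second post
      <div id="c-outer"><div><div role="article" aria-label="Comment">A comment</div></div></div>
  </div></div></div>
</div>
"""


def select_loop_items(root):
    """Emulate //div[@role="article"][not(@aria-label="Comment")]/../.."""
    # ElementTree elements have no parent pointer, so build a lookup first.
    parent_of = {child: parent for parent in root.iter() for child in parent}
    items = []
    for div in root.iter("div"):
        if div.get("role") != "article":
            continue  # not a post or comment block
        if div.get("aria-label") == "Comment":
            continue  # skip comments, keep posts
        parent = parent_of.get(div)
        grandparent = parent_of.get(parent) if parent is not None else None
        if grandparent is not None and grandparent not in items:
            items.append(grandparent)  # the "/../.." step: two levels up
    return items


root = ET.fromstring(SAMPLE)
loop_item_ids = [el.get("id") for el in select_loop_items(root)]
print(loop_item_ids)  # ['post-1', 'post-2']
```

Only the two post containers are returned; the nested comment is filtered out by its aria-label, which is the reason the XPath carries the not(...) condition.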
5. Run your task - get the data you want
- Click "Save" to save the task first
- Then, click "Run" on the upper left side
- Select "Run task on your device" to run the task on your computer, or select "Run task in the Cloud" to run the task in the Cloud (for premium users only)
Here is the sample output.
Happy Data Hunting!
Author: The Octoparse Team