Scrape Websites with Infinite Scrolling (Quora, Facebook, Twitter)

Thursday, April 14, 2016 4:54 AM

The scroll-down-to-refresh feature (infinite scrolling) can be found on many websites, such as Quora, Facebook, and Twitter.

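Pagination is usually needed when configuring an extraction, because one page of data is rarely enough. On these sites, new content loads as you scroll instead of through a "next page" link, so the page navigation action you need to add is a scroll. I’ll use Quora as an example.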

 

1. Navigate to the target URL by entering it in the built-in browser.

2. We’re now on the search results page. (I’m searching for the topic of web scraping.)

Wait until the page has loaded, then select “Advanced Options”.

 

3. Choose “scroll down to page bottom when finished loading”. ➜ Then enter how many times you want to scroll.

Select the interval time and the scroll way. ➜ I chose “scroll down for one screen”. You can also choose “scroll to the end of the page”.

 

Now we’re done configuring pagination.
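As a rough illustration of what these scroll settings mean, the sketch below scrolls a fixed number of times, pauses between scrolls, and then reads whatever has loaded. It is not how Octoparse works internally: `FakeFeed`, `scroll_and_collect`, and their parameters are hypothetical names, and the page is simulated in memory instead of being loaded in a browser.

```python
import time


class FakeFeed:
    """In-memory stand-in for an infinite-scroll page (hypothetical)."""

    def __init__(self, total, per_scroll):
        self._all = [f"answer {i}" for i in range(1, total + 1)]
        self._per_scroll = per_scroll
        # The first screen of items is already there when the page loads.
        self._shown = min(per_scroll, total)

    def scroll_one_screen(self):
        # Each scroll reveals one more batch, until nothing is left.
        self._shown = min(self._shown + self._per_scroll, len(self._all))

    def loaded_items(self):
        return self._all[: self._shown]


def scroll_and_collect(page, scroll_times, interval_seconds, sleep=time.sleep):
    """Scroll a fixed number of times, waiting between scrolls, then return the items."""
    for _ in range(scroll_times):
        page.scroll_one_screen()
        sleep(interval_seconds)  # give the page time to load the next batch
    return page.loaded_items()


feed = FakeFeed(total=50, per_scroll=10)
# The sleep is replaced with a no-op so the demo finishes instantly.
items = scroll_and_collect(feed, scroll_times=2, interval_seconds=0, sleep=lambda s: None)
print(len(items))  # prints 30: the first screen plus two scrolls of 10
```

The number of scrolls caps how much data you get, and the interval has to be long enough for each new batch to finish loading before the next scroll.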

4. Next, select the first answer. ➜ Create a list of items ➜ Add current item to the list ➜ Continue to edit the list

 

 

5. Then select the second answer. ➜ Add current item to the list ➜ Finish creating list ➜ Loop to process the list.

 

 

The loop will now automatically repeat the selection for every answer in the list.

6. Now you can scrape whatever you want from each answer. Click on the title to extract it. ➜ Choose “Extract text”. (Extract the views, answer, and time in the same way.)

 

7. All the selected content will appear in Data Fields. ➜ Click the "Field Name" to modify it.
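The loop-and-extract idea from the steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `SAMPLE` markup, its class names, and the `FIELDS` mapping are invented for the example and do not reflect Quora's real page structure or Octoparse's internals.

```python
import xml.etree.ElementTree as ET

# Made-up, well-formed sample markup standing in for a list of answers.
SAMPLE = """
<div>
  <div class="answer">
    <span class="title">What is web scraping?</span>
    <span class="views">1.2k views</span>
    <p class="text">It is automated data collection from web pages.</p>
    <span class="time">Apr 2016</span>
  </div>
  <div class="answer">
    <span class="title">Is infinite scrolling hard to scrape?</span>
    <span class="views">800 views</span>
    <p class="text">You need to scroll before extracting.</p>
    <span class="time">Mar 2016</span>
  </div>
</div>
"""

# Field name (like the editable "Field Name") -> class of the element to read.
FIELDS = {"title": "title", "views": "views", "answer": "text", "time": "time"}


def extract_items(markup):
    """Loop over every answer block and extract the text of each field."""
    root = ET.fromstring(markup)
    rows = []
    for item in root.findall(".//div[@class='answer']"):
        row = {}
        for field_name, css_class in FIELDS.items():
            node = item.find(f".//*[@class='{css_class}']")
            # A missing element becomes an empty string instead of an error.
            row[field_name] = node.text.strip() if node is not None and node.text else ""
        rows.append(row)
    return rows


for row in extract_items(SAMPLE):
    print(row["title"], "|", row["views"], "|", row["time"])
```

Renaming a field here only means changing a key in `FIELDS`, much as editing the "Field Name" changes the column name without touching the selection.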

 

8. Once you’re done configuring the extraction rule, click “next”.

 

9. You can choose not to load images to speed up the extraction, but this may sometimes cause problems on certain websites.

 

10. Now the task is complete! Choose “Local extraction” to run the task on your computer.

 

11. The extracted data will be shown in the "Data Extracted" pane. Click the button to export the results to an Excel file, a database, or another format, and save the file to your computer. You can watch the built-in browser to check whether the task runs as expected.

 

The result looks pretty good.
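For readers who want to see what an exported file boils down to, here is a minimal sketch that writes extracted rows as CSV using Python's standard `csv` module. The rows, the field names, and the `write_csv` helper are illustrative assumptions, not real extracted data or Octoparse's own export code.

```python
import csv
import io


def write_csv(rows, fieldnames, stream):
    """Write a header line followed by one CSV line per extracted row."""
    writer = csv.DictWriter(stream, fieldnames=fieldnames)
    writer.writeheader()
    writer.writerows(rows)


# Illustrative rows only; real rows would come from the extraction.
rows = [
    {"title": "What is web scraping?", "views": "1.2k views", "time": "Apr 2016"},
    {"title": "Is infinite scrolling hard to scrape?", "views": "800 views", "time": "Mar 2016"},
]

buffer = io.StringIO()  # in-memory file, so the demo leaves nothing on disk
write_csv(rows, ["title", "views", "time"], buffer)
print(buffer.getvalue())
```

To write an actual file, pass `open("answers.csv", "w", newline="", encoding="utf-8")` as the stream; the file name is just a placeholder. A CSV file like this opens directly in Excel.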

 

 

 

 

Author: The Octoparse Team

 

 

 


  

 


 

 

 

 

 
