
Web Scraping: How to Get Coronavirus Data (COVID-19)

Thursday, February 13, 2020

Since the outbreak of the new contagious, airborne coronavirus, the lives of millions have been affected, and related news has flooded every platform.

Given this, we think it is important to collect real-time data from both official and unofficial sources, so that the public can form a balanced understanding of the outbreak based on transparent data sources.

 

 

To pull data from these sources, you can use a web scraping tool like Octoparse. We’ve built web scraping templates that extract data from China’s government reports, so you can stay up to date with the latest information. Now let’s take a look at how to use a template to extract live data.

Step 1: Launch Octoparse on your computer and create a scraping task by clicking “Task Template”.

 

[Image: task template]

 

Note: There are a number of scraping “recipes”, covering everything from eCommerce websites to social media channels. These are preformatted crawlers that can extract data from target websites directly. You may check out this article to get a better idea of what a web scraping template is.

 

Step 2: Under the “Live” category, choose “national healthcare commission”. 

 

[Image: national health commission templates]

 

 

You will see two templates. One extracts government news and announcements. The other extracts data from the Tencent news website, which is directly connected with China’s central and local Health Commissions. This is so far the quickest way to get live data, including confirmed cases, recoveries, the death toll, and the fatality rate for each city in China.

 

[Image: Tencent news coronavirus real-time data]

 

Step 3: Click “real-time data 2019-nCov”, as we want to collect live data.

No configuration is needed. Simply start the extraction and Octoparse will scrape the data automatically. You can export the data in many formats, such as Excel, JSON, and CSV, or send it to your own database via API. Here's what the data output looks like in Excel.


[Image: sample coronavirus data]
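
Once the data is exported, it can be processed with a few lines of code. The following is a minimal sketch, assuming a CSV export; the column names (`city`, `confirmed`, `recovered`, `deaths`) and the figures are hypothetical placeholders, not the template's actual headers or real data. It recomputes the fatality rate for each city as deaths divided by confirmed cases.

```python
import csv
import io

# Hypothetical export: the column names are assumptions for illustration,
# not the template's actual headers, and the figures are made up.
SAMPLE_EXPORT = """city,confirmed,recovered,deaths
City A,1000,150,25
City B,200,40,2
City C,0,0,0
"""


def fatality_rates(csv_file):
    """Return {city: deaths / confirmed} for every row with confirmed cases."""
    rates = {}
    for row in csv.DictReader(csv_file):
        confirmed = int(row["confirmed"])
        deaths = int(row["deaths"])
        # Skip cities with no confirmed cases to avoid dividing by zero.
        if confirmed > 0:
            rates[row["city"]] = deaths / confirmed
    return rates


rates = fatality_rates(io.StringIO(SAMPLE_EXPORT))
for city, rate in rates.items():
    print(f"{city}: {rate:.1%}")
```

To run this against a real export, you could pass an opened file instead of the in-memory sample, for example `open("export.csv", newline="", encoding="utf-8")`, where `export.csv` is a placeholder file name, and adjust the column names to match your file.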

 

You can also extract real-time information on social media channels. There are templates covering popular platforms such as Facebook, Twitter, Instagram, and YouTube.

For example, if you want to extract the latest tweets about the virus and see how people are reacting to it, you can use the “latest tweets” template. It’s designed to collect the latest tweets containing the search keyword you enter. It extracts the web page URL, tweet URL, handles, posts, and so on.


[Image: Twitter template]

 

Now let’s run this template. 

Step 1: Open Twitter, type in “coronavirus” and click on the “latest” tab. Copy the URL and paste it into the first parameter. 

 

[Image: Twitter coronavirus live page]

 

Step 2: Enter a number into the second parameter.

Twitter uses infinite scrolling, which means we have to set the number of scrolls needed to load the desired number of posts. You can set any number from 1 to 10,000. The idea is to get the page fully loaded. For example, if you enter the number 10, the bot will scroll 10 times.

 

Step 3: Execute the scraper by clicking “save and run” and you'll get the results instantly. 

 

[Image: latest tweets coronavirus data]
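
To make the two template parameters more concrete, here is a rough sketch of the idea behind them. It is a simulation only, not how Octoparse works internally and not a call to any Twitter API. The URL pattern (`f=live` for the “latest” tab) is an assumption, and `fake_load_batch` is a hypothetical stand-in for a page that reveals a few more posts on every scroll.

```python
from urllib.parse import urlencode


def build_latest_search_url(keyword):
    """Build a Twitter search URL for the "latest" tab (assumed URL pattern)."""
    # Assumption: "f=live" selects the latest tab. If the pattern differs,
    # copy the URL from your browser as described in Step 1.
    return "https://twitter.com/search?" + urlencode({"q": keyword, "f": "live"})


def scroll_and_collect(load_batch, scroll_count):
    """Call load_batch once per scroll and keep each post the first time it appears."""
    seen = set()
    posts = []
    for scroll_index in range(scroll_count):
        for post in load_batch(scroll_index):
            if post not in seen:
                seen.add(post)
                posts.append(post)
    return posts


def fake_load_batch(scroll_index):
    """Hypothetical page: each scroll shows 3 posts, 1 of them already seen."""
    start = scroll_index * 2
    return [f"tweet-{n}" for n in range(start, start + 3)]


url = build_latest_search_url("coronavirus")
posts = scroll_and_collect(fake_load_batch, scroll_count=10)
print(url)
print(len(posts))
```

The point of the sketch is that the number of posts collected grows with the scroll count, so a larger second parameter loads more of the page, while posts that appear in more than one scroll are only counted once.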

 

We’ve covered how to use web scraping templates to collect real-time data about the coronavirus in this video. If you also want to build your own scraper to extract articles from news portals like the Wall Street Journal, the New York Times, and Reuters, you may check out this video.

 

 

This blog post originated from our article How Data Analysis Helps Unveil the Truth of Coronavirus.

 

Article in Spanish: Cómo obtener Coronavirus (COVID-19) Datos
You can also read web scraping articles on El Website Oficial



More Resources

 

Top 20 Web Scraping Tools to Scrape the Websites Quickly

Top 30 Big Data Tools for Data Analysis

Web Scraping Templates Take Away

How to Build a Web Crawler - A Guide for Beginners

Video: Create Your First Scraper with Octoparse 7.X

 

 

 




Download Octoparse to start web scraping, or contact us with any questions about web scraping!
