Since Korea confirmed its first case of coronavirus on January 20th, 2020, the total number of infections had reached 7,869 as of March 12th. Although the outbreak shows signs of being contained in the country, it’s still uncertain how long it will take before we completely beat the coronavirus.
At this point, we think it’s necessary to present the outbreak with a more interactive visual map. The goal of this article is to show you how to use a web scraping tool and a visualization tool to carry out a few fundamental steps of data analytics work.
The article consists of two parts: we first use Octoparse to extract web data, then we use FineReport to visualize the data. If you are new to web scraping tools, feel free to watch this video to learn how to extract data from scratch. If you don’t know anything about coding, don’t worry, we’ve kept the article super easy to follow.
Part One: Data Extraction with Octoparse
First, we need to scrape updated data about Korea from the CDC. Feel free to practice by yourself with the crawler file.
Data extraction consists of 3 steps:
- Step 1: Build a scraper task by entering the URL
- Step 2: Click to extract the web data
- Step 3: Execute the scraper task
Sounds easy, right? Well, it is very easy!
First, click “Advanced Mode” and paste the web URL into the box. Then click “Save URL” to proceed. The web page will be loaded in the built-in browser for you to click and extract.
Next, click on any table cell and follow the guide that appears on the “Action tips” panel. Choose “Select all sub-elements”, then click “Select all”. Now confirm the step by clicking “Extract data in the loop”. Congratulations! We’ve just created a scraper.
Last but not least, click to execute the scraper.
Once we finish fetching the data, we can export it to Excel format and use it to create a map visualization. Data like this is time-sensitive and quickly becomes outdated, so you can take advantage of Octoparse’s scheduler to put your task on autopilot.
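If you'd like to sanity-check the export before moving on, here is a minimal Python sketch. It assumes the data was saved as CSV rather than Excel, and that the columns are named `region` and `cases`; both names are made up for illustration, so adjust them to match your own export. The sample rows are placeholders, not real figures.

```python
import csv
import io


def parse_count(text: str) -> int:
    """Turn a scraped count such as '7,869' into an integer."""
    cleaned = text.strip().replace(",", "")
    return int(cleaned) if cleaned else 0


def load_cases(csv_text: str) -> dict[str, int]:
    """Map each region to its case count.

    'region' and 'cases' are assumed column names, not ones
    guaranteed by the scraper.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    return {row["region"].strip(): parse_count(row["cases"]) for row in reader}


# Made-up sample rows for illustration only.
sample = 'region,cases\nRegion A,"1,234"\nRegion B,56\n'
cases = load_cases(sample)
print(cases)
```

Scraped counts often arrive as text with thousands separators, so converting them to integers early helps avoid sorting or charting surprises later.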
Part Two: Data Visualization with FineReport
First, click the “plus” button on the menu bar to select and import the file we just collected. You can then check that the data is accurate in the “preview” window. This is a necessary step that many people are likely to skip. We’re working with geolocations and their corresponding data. If FineReport can’t read the dimension as geographical information, it will fail to create the corresponding map. Our data looks fine, so let’s build our map.
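The same check can be done outside the preview window. The sketch below, in Python, flags region names that don't match a reference list. The list holds common English short names for Korea's 17 first-level regions; the spellings the map layer actually expects are an assumption here and may differ, so treat `KNOWN_REGIONS` as a starting point.

```python
# Assumed English short names for Korea's 17 first-level regions.
# The spellings the map layer expects may differ.
KNOWN_REGIONS = {
    "Seoul", "Busan", "Daegu", "Incheon", "Gwangju", "Daejeon",
    "Ulsan", "Sejong", "Gyeonggi", "Gangwon", "Chungbuk", "Chungnam",
    "Jeonbuk", "Jeonnam", "Gyeongbuk", "Gyeongnam", "Jeju",
}


def unmatched_regions(names: list[str]) -> list[str]:
    """Return the names that match no known region.

    Matching ignores case and surrounding whitespace.
    """
    known = {region.lower() for region in KNOWN_REGIONS}
    return [name for name in names if name.strip().lower() not in known]


# 'Gyeongi' is a deliberate misspelling to show what gets flagged.
print(unmatched_regions(["Seoul", " daegu ", "Gyeongi"]))
```

Any name this returns is worth fixing in the source file before you import it, since an unrecognized name is the kind of thing that keeps a location from showing up on the map.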
To add a map layer, click “edit” and select “Korea”. Boom! Your map appears! And it looks great! Now we need to add points to the map to show the level of severity at each location we collected. To do this, click “data” to connect the case numbers with each geolocation.
We still need to make some final tweaks to make it pretty. Change the color and edit the format. Then refresh the screen.
Now we’ve successfully created a map visualization that displays the outbreak. Besides that, I also created data tables, bubble charts, line charts and so on, then assembled them into one dashboard.
The best part is that we can make this dashboard live by importing the data via an API, and this is achievable with Octoparse.
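For readers curious what an API-driven refresh might look like in code, here is a rough Python sketch. It does not reproduce Octoparse's actual API: the endpoint URL, the JSON shape, and the field names `region` and `cases` are all placeholders, so check the official API documentation for the real details.

```python
import json
import urllib.request

# Placeholder endpoint for illustration; not a real Octoparse URL.
API_URL = "https://example.com/api/korea-cases"


def rows_from_payload(payload: str) -> list[tuple[str, int]]:
    """Convert a JSON array of records into (region, cases) rows.

    The 'region' and 'cases' field names are assumptions.
    """
    records = json.loads(payload)
    return [(record["region"], int(record["cases"])) for record in records]


def fetch_rows(url: str = API_URL, timeout: float = 10.0) -> list[tuple[str, int]]:
    """Download the payload from the (hypothetical) endpoint and parse it."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return rows_from_payload(response.read().decode("utf-8"))


# Offline demo with made-up records, so no network call is needed.
demo = '[{"region": "Region A", "cases": 12}, {"region": "Region B", "cases": "3"}]'
print(rows_from_payload(demo))
```

Keeping the parsing separate from the download means you can test the conversion with sample data first, then point `fetch_rows` at a real endpoint once you have one.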