Get to Know Your Users Through Data Analysis
Monday, May 8, 2017
Users on a social media platform behave differently and show preferences for different types of videos. For a media platform, this user data is essential for making a comprehensive assessment of its content. It is therefore important for the platform to find out what its users are looking for and to draw more insight from their behavior, so that it can offer a better service and generate more sales leads.
Popularity Assessment based on Public Opinions
The assessment of videos is inherently subjective, since the effect a video has on its viewers can range from mild to repulsive, from pleasant to heart-rending. Theory alone cannot accurately determine whether a video is popular; factors such as public mood and current trends also influence how the public perceives it. We can’t build an accurate picture of popularity trends without gathering a wealth of information about the videos from the public.
While the entertainment industry relies heavily on popular opinion, it can be difficult to assess a video’s popularity without depending on media sharing platforms and charts such as YouTube, Pandora, and the Billboard Top 100. For example, we could identify the most popular YouTube users through a comprehensive ranking analysis that combines subscribers, views, and other metrics.
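As a rough illustration of such a combined ranking, here is a minimal Python sketch. The channel names, the numbers, and the equal weights are all made up for the example, and the min-max scaling is just one simple way to put subscribers and views on a comparable scale. It is not how any platform actually ranks its users.

```python
from dataclasses import dataclass


@dataclass
class Channel:
    name: str
    subscribers: int
    views: int


def min_max(values):
    """Scale a list of numbers to the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def rank_channels(channels, w_subs=0.5, w_views=0.5):
    """Return channel names ordered by a weighted sum of scaled metrics."""
    subs = min_max([c.subscribers for c in channels])
    views = min_max([c.views for c in channels])
    scored = [
        (w_subs * s + w_views * v, c.name)
        for c, s, v in zip(channels, subs, views)
    ]
    return [name for _, name in sorted(scored, reverse=True)]


# Hypothetical sample data, not real channel statistics.
channels = [
    Channel("channel_a", 1_000_000, 50_000_000),
    Channel("channel_b", 2_000_000, 10_000_000),
    Channel("channel_c", 500_000, 5_000_000),
]
ranking = rank_channels(channels)
print(ranking)
```

Changing the weights shifts the ranking toward subscribers or toward views, which is a quick way to see how sensitive a "top users" list is to the chosen metric.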
Getting to Know Users' Likes and Dislikes
These media platforms provide ample information about popularity, featuring top tracks and interactions, since most, if not all, social media platforms let users voice their likes or dislikes about videos, with the aim of mining that user data for predictive analysis. Take YouTube: users can express their taste by giving a video a thumbs up or a thumbs down, and they can subscribe to what they like and receive push notifications periodically. Regular users may notice that YouTube recommends similar or related videos while they are watching a given video. In fact, what you are interested in, what you have subscribed to, and all of your other user behaviors are already being explored by YouTube for deep learning. It can estimate a relatively objective score from the user data and recommend the most relevant videos accordingly, using a trained model backed by data analysis.
To get the most out of the user data, YouTube is concerned with three main parts:
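To make the idea of a score derived from likes and dislikes concrete, here is a small Python sketch. It uses the Wilson score lower bound, a standard statistical way to rank items by their share of positive votes while accounting for how many votes there are. This technique is swapped in purely for illustration; the post does not say how YouTube computes its scores, and a trained recommendation model is far more involved.

```python
import math


def wilson_lower_bound(likes, dislikes, z=1.96):
    """Lower bound of the Wilson score interval for the like ratio.

    z=1.96 corresponds to roughly 95% confidence.
    """
    n = likes + dislikes
    if n == 0:
        return 0.0
    p = likes / n
    denom = 1 + z * z / n
    centre = p + z * z / (2 * n)
    margin = z * math.sqrt((p * (1 - p) + z * z / (4 * n)) / n)
    return (centre - margin) / denom


# Same 90% like ratio, but very different amounts of evidence.
few_votes = wilson_lower_bound(9, 1)
many_votes = wilson_lower_bound(900, 100)
print(few_votes, many_votes)
```

Both example videos have a 90% like ratio, but the one with more votes gets the higher score, because a larger sample gives more confidence in that ratio.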
Finding an Easy Way to Scrape Data
With that said, we can see how important the user data gleaned from media sources is for mining and analysis. We can extract data with varied traits and attempt to gauge the popularity of videos through different features, such as thumbnails, subscriptions, view history, and others.
In this post, I’d like to share an easy way to scrape data from a website such as YouTube. With this method, users do not need to write any code or deal with complex, messy configuration, because what I propose in this post is an automated web scraper, Octoparse, which makes data scraping available to anyone.
Octoparse is a free and powerful website crawler used for extracting almost any kind of data you need from a website. You can use Octoparse to rip a website with its extensive functionalities and capabilities. Its point-and-click UI lets you grab all the text from a website, as shown in the figure below.
Octoparse Cloud Extraction refers to retrieving data on a large scale through many cloud servers, based on distributed computing that runs 24/7. After downloading the app, you can open a new task, configure a workflow/rule for the task, and run the task with Cloud Extraction by sending it to the cloud. Then you can turn off your machine and let Octoparse do the work.
Beyond that, Octoparse provides Scheduled Cloud Extraction, which lets users revisit the website and get its latest information. As shown below, users can schedule a specific time to run their tasks and speed them up using cloud services.
Sometimes users need a cleaner and more succinct data format. To normalize data fields and strip redundant content with regular expressions, users can use the RegEx Tool provided by Octoparse. This built-in tool helps users reformat the extracted data by generating a regular expression automatically based on their formatting requirements.
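For readers curious about what this kind of regular-expression cleanup looks like outside the tool, here is a minimal Python sketch using the standard `re` module. The function names and the sample strings are hypothetical, and this is not the code Octoparse generates. It only illustrates the general idea of normalizing a scraped field.

```python
import re


def parse_view_count(raw):
    """Extract an integer from text such as '1,234,567 views'.

    Returns None when the text contains no digits.
    """
    match = re.search(r"\d[\d,]*", raw)
    if match is None:
        return None
    return int(match.group().replace(",", ""))


def collapse_whitespace(raw):
    """Replace runs of whitespace with a single space and trim the ends."""
    return re.sub(r"\s+", " ", raw).strip()


print(parse_view_count("1,234,567 views"))
print(collapse_whitespace("  Some   video\n title  "))
```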
Users can extract data from many tough websites with difficult data block layouts using the built-in RegEx and XPath tools, since the XPath configuration tool provided by Octoparse lets them locate web elements precisely.
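To show what locating elements with XPath means in practice, here is a small Python sketch using the standard library's `xml.etree.ElementTree`, which supports a limited subset of XPath. The markup, class names, and links are invented for the example and do not reflect YouTube's real page structure. Real-world HTML is often not well-formed XML, so a dedicated HTML parser would usually be needed.

```python
import xml.etree.ElementTree as ET

# Hypothetical, well-formed snippet standing in for a results page.
SAMPLE = """
<div id="results">
  <div class="video">
    <a class="title" href="/watch?v=abc">First video</a>
    <span class="views">1,200 views</span>
  </div>
  <div class="video">
    <a class="title" href="/watch?v=def">Second video</a>
    <span class="views">350 views</span>
  </div>
</div>
"""

root = ET.fromstring(SAMPLE)

rows = []
# Select every <div class="video"> anywhere below the root.
for item in root.findall(".//div[@class='video']"):
    link = item.find("a[@class='title']")
    views = item.find("span[@class='views']")
    rows.append(
        {"title": link.text, "url": link.get("href"), "views": views.text}
    )

titles = [row["title"] for row in rows]
print(rows)
```

Each XPath expression describes where an element sits in the page tree, which is why it can pinpoint data even when the visual layout is complicated.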
Users no longer need to worry about IP blocking, since Octoparse offers IP proxy servers that rotate IP addresses automatically, making it less likely that aggressive websites will detect the scraper.
After the data extraction completes, users can download almost all of the website content and save it in a structured format such as Excel, TXT, or HTML, or export it to a database.
After completing the data scrape, you can use the extracted data for various analyses. For example, after cleaning the data we could work out the percentage of each type of artwork uploaded to YouTube, then visualize the final result as a pie chart, as shown below.
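The percentage calculation behind such a chart can be sketched in a few lines of Python. The category labels below are made-up sample values, and the cleaning step (trimming, lowercasing, dropping empty entries) is only an assumption about what a data purge might involve.

```python
from collections import Counter


def category_shares(categories):
    """Return each category's share of the total, in percent."""
    # Basic cleaning: drop empty entries, trim spaces, unify case.
    cleaned = [c.strip().lower() for c in categories if c and c.strip()]
    counts = Counter(cleaned)
    total = sum(counts.values())
    return {cat: round(100 * n / total, 1) for cat, n in counts.most_common()}


# Hypothetical scraped labels, including messy and missing values.
scraped = ["Music", "music ", "Film", "", None, "Painting", "Film", "music"]
shares = category_shares(scraped)
print(shares)
```

The resulting dictionary maps each category to its percentage and can be passed to any charting tool to draw the pie chart.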
For more detailed steps to scrape data from YouTube, please check out the case tutorial: