Booking a vacation has never been easier – or more difficult. We have the tools to seamlessly book flights, accommodation, car hire and tours at the click of a button. But we also have paralyzing amounts of information to sort through and assess when choosing when to go, where to stay and what to do.
Fortunately, technology is on our side. With sentiment analysis, we can instruct computers to “read” and analyze all of those hotel reviews, top ten lists and forum posts for us. They’ll identify key terms, categorize them and determine whether the author feels positively or negatively about them – saving us from reading thousands of comments. This process is also called “opinion mining”, and it’s a way to gauge how someone feels about a particular topic, person, thing or experience.
Clearly, the applications for sentiment analysis go well beyond giving us a snapshot of how people feel about a particular hotel in a far-flung foreign destination. Businesses use it to determine how the market feels about their products or services – and revise their strategies accordingly. Other groups and organizations can use it to identify trends and moods in the social and political spheres. Sentiment analysis allows us to “dig deep” into a customer’s mind and find out what they actually want.
We know that sentiment analysis can make short work of reviews, so let’s see how it can help in selecting a hotel for my forthcoming vacation. The hotel that has caught my eye is the Bellagio in Las Vegas, but I want to make sure that it lives up to expectations before making a booking.
Step 1: Scraping the reviews
First I need to scrape the reviews found on the hotel’s TripAdvisor page. To do this I’m going to use a simple point-and-click web scraping tool called Octoparse. Octoparse features clear as-you-go instructions and requires no coding knowledge.
After some simple configuration, I started the extraction.
(Octoparse data extraction processing)
Step 2: Exporting the data
Now that I’ve scraped or “extracted” the data, I’m going to export it to Excel. Once there, I’m going to open the file and run my analysis using a plugin for Excel 2013 called Semantria. Created by Lexalytics, Semantria simplifies sentiment analysis, making it accessible even for non-programmers.
Here are the results:
The column headed “Document Sentiment +/-” gives me the overall sentiment of each review, telling me whether it’s positive or negative. The two columns to the left of it give me the numerical values for that sentiment: basically, how positive or negative each review is. I can also explore the sentiment levels associated with various aspects of the hotel, such as staff attitude or the price of a room.
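For readers who would like to see the idea behind those two kinds of columns in code, here is a minimal Python sketch. It is not how Semantria works internally (I don't know the details of its engine); it is just a toy word-list approach that produces a numerical score and a positive/negative label per review. The word list, the weights, the 0.1 threshold and the three sample reviews are all made up for illustration.

```python
import re

# Hypothetical toy lexicon: word -> sentiment weight (invented values).
# Real tools rely on far larger vocabularies and handle negation, phrases, etc.
LEXICON = {
    "beautiful": 1.0, "friendly": 0.8, "helpful": 0.8, "clean": 0.6,
    "dirty": -0.8, "rude": -0.9, "noisy": -0.6, "overpriced": -0.7,
}

def score_review(text):
    """Return the average weight of the lexicon words found in the text."""
    words = re.findall(r"[a-z']+", text.lower())
    hits = [LEXICON[w] for w in words if w in LEXICON]
    return sum(hits) / len(hits) if hits else 0.0

def polarity(score, threshold=0.1):
    """Turn a numerical score into a positive / negative / neutral label."""
    if score > threshold:
        return "positive"
    if score < -threshold:
        return "negative"
    return "neutral"

# Invented sample reviews, not taken from the scraped data.
reviews = [
    "Beautiful hotel with friendly and helpful staff.",
    "The room was dirty and the front desk was rude.",
    "We stayed for three nights in March.",
]

for review in reviews:
    s = score_review(review)
    print(f"{s:+.2f}  {polarity(s):8}  {review}")
```

The numerical score plays the role of the "how positive or negative" columns, and the label plays the role of the “Document Sentiment +/-” column.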
Step 3: Visualizing the data
I have my analysis, but I want to create a more user-friendly way of displaying it. To do this I’m going to use a tool called Tableau, which integrates with Excel and just about any on-premises or cloud-based software. When I plug the values from the numerical Document Sentiment column into Tableau, I get the following:
I can also create visualizations of the particular phrases that influenced those scores, as well as how those phrases are categorized:
From this I can see that the consensus is that the Bellagio is a “beautiful” venue looked after by “friendly” and “helpful” staff. Considering this alongside the positive perceptions around value, I think it’s time to book my room at the Bellagio. If the scores were a little lower, perhaps I’d go through the same process using reviews from several different hotels and then pick the hotel with the highest overall sentiment.
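If I did end up comparing several hotels, the last step could be as simple as averaging each hotel's per-review scores and picking the top one. Here is a rough Python sketch of that comparison. The hotel names are placeholders and the scores are invented, and using a plain average as the "overall sentiment" is my own assumption rather than anything the tools above prescribe.

```python
from statistics import mean

# Invented per-review sentiment scores for placeholder hotels.
# In practice these would come from the exported analysis results.
scores_by_hotel = {
    "Hotel A": [0.62, 0.48, -0.10, 0.71],
    "Hotel B": [0.35, 0.40, 0.22],
    "Hotel C": [0.80, -0.45, 0.15, 0.05],
}

def rank_hotels(scores):
    """Return (hotel, average score) pairs, highest average first."""
    averages = {hotel: mean(vals) for hotel, vals in scores.items() if vals}
    return sorted(averages.items(), key=lambda item: item[1], reverse=True)

ranking = rank_hotels(scores_by_hotel)
for hotel, avg in ranking:
    print(f"{hotel}: {avg:+.2f}")

best_hotel = ranking[0][0]
print("Highest overall sentiment:", best_hotel)
```

A plain average treats a hotel with three reviews the same as one with three thousand, so it would also be worth looking at how many reviews sit behind each score.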
The above approach obviously isn’t limited to hotel reviews. We can take the same process and apply it across a range of scenarios. And we can also create much more in-depth, granular analyses using complex and wide-ranging texts. So if you come across a situation where you need to consider the sentiment of a large number of text-based data points, consider applying sentiment analysis. It may help you book your next trip, help you pivot your product, or even predict where the stock market is heading.