Scrape Amazon Product Reviews and Ratings for Sentiment Analysis

Amazon is one of the leading e-commerce companies, and it holds a large amount of customer data. Analyzing that data can help sellers shape a smarter strategy to improve their service and revenue. It is important to know what users say about your product, and collecting product reviews on Amazon is one way to find out. That is why sentiment analysis of reviews is becoming more and more popular among Amazon shop owners. In this post, I will show you how to scrape an Amazon product review dataset for sentiment analysis easily and quickly.

Scrape Amazon Product Reviews Without Coding

Nowadays, almost every kind of data on the web can be scraped. By selecting certain elements on a web page and then parsing the information, you can get the data. Octoparse is an easy-to-use web scraper that allows you to extract Amazon product reviews into Excel files without any coding skills. It also supports scheduled scraping, IP rotation, preset templates, CAPTCHA solving, and other functions to help you get data easily. Download and install it on your device, then follow the simple steps below to try it for free.

Steps to Scrape Amazon Product Reviews Using Octoparse

Step 1. Enter Amazon page URL to create a task

Enter the target URL into the search box. Octoparse will then open the web page in its built-in browser and start auto-detecting. After that, click “Create a workflow” to build a new task.

Step 2. Customize scraping task to get data

You can customize the task to get your desired data by following the tips panel. For example, click the “Next page” button and then choose “Loop click the element” in the pop-up window to set up pagination and loop scraping.

Step 3. Export scraped Amazon product review data

Now you can extract all the reviews of your Amazon product. Click the “Run” button after you have previewed all the data fields. You can then download the scraped review data in Excel or another format.
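If you export the data as CSV rather than Excel, the file can be loaded for analysis with a few lines of Python. The sketch below is only an illustration: the column names `rating` and `review_text` are assumptions, not the names Octoparse actually uses, so adjust them to match your export. Reviews with no text are skipped because there is nothing to score.

```python
import csv
import io


def load_reviews(handle):
    """Read review rows from a CSV file handle, skipping reviews with no text.

    The column names "rating" and "review_text" are hypothetical; rename them
    to match the header row of your own export.
    """
    reviews = []
    for row in csv.DictReader(handle):
        text = (row.get("review_text") or "").strip()
        if not text:
            continue  # nothing to analyze in an empty review
        reviews.append({"rating": float(row["rating"]), "text": text})
    return reviews


# A tiny in-memory sample standing in for a real exported file.
sample_csv = io.StringIO(
    "rating,review_text\n"
    "5.0,Loved every minute of it\n"
    "2.0,\n"
    "4.0,Good but a little slow\n"
)
reviews = load_reviews(sample_csv)
print(len(reviews), "reviews loaded")
```

For a real file, pass `open("reviews.csv", newline="", encoding="utf-8")` (a hypothetical file name) instead of the in-memory sample.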

Octoparse also has preset scraping templates for Amazon product reviews. You can search for “Amazon” to find the templates and preview the data samples. If you still have questions about the steps above, see the Octoparse Amazon Review Scraping Guide for more details.

Sentiment Analysis for Amazon Product Reviews

Now that we have the data, what can we do with it? We could read through all these reviews to see how others feel about the product, but that would take a long time. That’s why we need sentiment analysis.

Sentiment analysis gives us the general feeling expressed in a piece of text. We could just look at the star ratings, but they are not always consistent with the sentiment of the review text. Sentiment falls into three classes: a negative value represents negative sentiment, a neutral value represents neutral sentiment, and a positive value represents positive sentiment.
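To make the three classes concrete, here is a minimal sketch that maps a numerical sentiment score to a label. The width of the neutral band (0.05) is an arbitrary assumption chosen for illustration; it is not the threshold used by any particular tool.

```python
def label_sentiment(score: float, neutral_band: float = 0.05) -> str:
    """Map a numerical sentiment score to negative, neutral, or positive.

    Scores within +/- neutral_band of zero count as neutral. The default
    band is a hypothetical value; tune it for your own data.
    """
    if score > neutral_band:
        return "positive"
    if score < -neutral_band:
        return "negative"
    return "neutral"


for score in (0.6, -0.56, 0.0):
    print(score, label_sentiment(score))
```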

Here I used Semantria, a sentiment analysis plugin for Excel. Semantria simplifies sentiment analysis and makes it accessible to non-programmers. I exported the extracted data to Excel (see the results below).

[Image: sentiment analysis]

To show how a simple sentiment analysis works, I analyzed only the first 100 reviews. Here are the results:

[Image: Amazon data analysis]

The column “Document Sentiment +/-” gives the overall sentiment of each review, telling me whether it’s positive, negative or mixed. The column “Document Sentiment” gives a numerical value that tells me how positive or negative each review is.

The information could be displayed in a more user-friendly way by creating a column chart.

[Image: sentiment analysis for Amazon data]

Aggregating the Document Sentiment values by polarity shows that the positive reviews total 26.89, far higher than the other groups: neutral 0.54, mixed 0.70, and negative -1.79. Given the overall star rating of 4.4 for the movie Me Before You, the sentiment values and the star rating are highly consistent, despite small differences.
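The same kind of per-polarity figure can be computed outside Excel. The sketch below assumes the figures are simple sums of the “Document Sentiment” scores within each “Document Sentiment +/-” group, which the post does not state explicitly. The sample rows are made up for illustration and are not the review data analyzed here.

```python
from collections import defaultdict


def sum_by_polarity(rows):
    """Add up document sentiment scores for each polarity label.

    rows is an iterable of (polarity, score) pairs.
    """
    totals = defaultdict(float)
    for polarity, score in rows:
        totals[polarity] += score
    return dict(totals)


# Hypothetical (polarity, score) pairs, not the real results.
sample_rows = [
    ("positive", 0.5),
    ("positive", 0.25),
    ("negative", -0.5),
    ("neutral", 0.0),
    ("mixed", 0.125),
]
totals = sum_by_polarity(sample_rows)
print(totals)
```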

[Image: Amazon review sentiment analysis]

To confirm that, I looked further at the phrase sentiment values. Let’s take a closer look.

Phrase Sentiment

Phrase Mentions Sentiment +/- by Rating

| Rating | negative | neutral | positive | Sum |
|---|---|---|---|---|
| 2.0 | -0.563729823 | 0.392652005 | 0.600000024 | 0.428922 |
| 4.0 | -14.94552305 | 6.095596494 | 15.26827288 | 6.418346 |
| 5.0 | -31.15602022 | 38.07776087 | 131.7180169 | 138.6398 |
| Sum | -46.6652731 | 44.56600937 | 147.5862898 | 145.487 |

The table shows a broad consistency between stars and sentiment, although the 5.0 rating has the largest negative total. This is probably an effect of review counts: there are many more 5.0 reviews than 2.0 reviews, so their negative phrases add up to a larger total.
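The totals in the phrase sentiment table can be re-derived with a short script. The sketch below uses the figures from the table itself; the ratio of negative to positive sentiment per rating is an extra check added here for illustration and is not part of the original analysis.

```python
# Phrase sentiment totals copied from the table, keyed by star rating.
phrase_sentiment = {
    2.0: {"negative": -0.563729823, "neutral": 0.392652005, "positive": 0.600000024},
    4.0: {"negative": -14.94552305, "neutral": 6.095596494, "positive": 15.26827288},
    5.0: {"negative": -31.15602022, "neutral": 38.07776087, "positive": 131.7180169},
}

# "Sum" column: total sentiment for each rating.
row_sums = {rating: sum(cells.values()) for rating, cells in phrase_sentiment.items()}

# "Sum" row: total sentiment for each polarity across all ratings.
column_sums = {
    polarity: sum(cells[polarity] for cells in phrase_sentiment.values())
    for polarity in ("negative", "neutral", "positive")
}
grand_total = sum(row_sums.values())

# Size of the negative total relative to the positive total for each rating.
negative_to_positive = {
    rating: abs(cells["negative"]) / cells["positive"]
    for rating, cells in phrase_sentiment.items()
}

for rating in phrase_sentiment:
    print(f"{rating}: sum={row_sums[rating]:.4f}, neg/pos={negative_to_positive[rating]:.2f}")
print({polarity: round(total, 4) for polarity, total in column_sums.items()})
print(f"grand total={grand_total:.3f}")
```

Relative to its positive total, the 5.0 rating has the smallest negative total of the three ratings, which fits the reading that its large negative sum comes from volume rather than from more negative reviews.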

Looking at the distribution of ratings, you can see that the star ratings cluster around 5.0 (positive sentiment), which further confirms the high consistency between stars and sentiment.
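A rating distribution like this one takes only a few lines to compute. The sketch below uses made-up star ratings rather than the real review data, so the numbers it prints say nothing about the actual product.

```python
from collections import Counter


def rating_distribution(ratings):
    """Return the share of reviews at each star rating, lowest rating first."""
    if not ratings:
        raise ValueError("ratings must not be empty")
    counts = Counter(ratings)
    return {star: counts[star] / len(ratings) for star in sorted(counts)}


# Hypothetical star ratings, for illustration only.
sample_ratings = [5.0, 5.0, 4.0, 5.0, 2.0, 4.0, 5.0, 5.0]
shares = rating_distribution(sample_ratings)
average = sum(sample_ratings) / len(sample_ratings)

for star, share in shares.items():
    print(f"{star} stars: {share:.1%}")
print(f"average rating: {average}")
```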

[Image: Amazon product rating data analysis]

The method above is obviously a simple approach, and there are other widely known methods of sentiment analysis, such as machine learning. This method also isn’t limited to movie reviews: it can be applied to a range of other scenarios, and you can build a much more in-depth analysis on top of it.
