Amazon Review Analysis

Hi there!

Today we will discuss how to analyze Amazon reviews.

In this analysis, we will compute an overall score for a review and decide whether the review is positive or negative.

We will use three files in this program.

One file contains the words that are treated as positive when they appear in a review.

Another file contains the words that are treated as negative when they appear in a review.

The third file contains keywords for the product (for example, for a mobile phone review, likely keywords are screen, battery, camera, etc.).
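Before reading the real files, here is a minimal sketch of the counting idea in base R. The tiny word lists, the helper name count.matches and the sample review are invented for illustration, and the "positive minus negative" rule for the overall verdict is my assumption of one simple way to combine the two counts.

```r
# Hypothetical mini-dictionaries, invented for this illustration only.
pos.demo = c("good", "great", "love")
neg.demo = c("bad", "poor", "heats")

# Count how many tokens of a review appear in a given word list.
count.matches = function(review, words) {
  cleaned = tolower(gsub("[[:punct:]]", "", review))  # strip punctuation, lower-case
  tokens = unlist(strsplit(cleaned, "\\s+"))          # split on whitespace
  sum(tokens %in% words)                              # TRUE counts as 1
}

# Made-up review text, not taken from Amazon.
review = "Great screen, but the phone heats a lot. Poor battery!"

pos.count = count.matches(review, pos.demo)
neg.count = count.matches(review, neg.demo)

# Assumed decision rule: the sign of (positive - negative) gives the verdict.
net = pos.count - neg.count
verdict = if (net > 0) "positive" else if (net < 0) "negative" else "neutral"

print(c(pos = pos.count, neg = neg.count, net = net))
print(verdict)
```

The functions later in this post do the same counting, just with the full dictionaries and with plyr and stringr instead of base R.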

#######################################################################

#read the dictionary files

#######################################################################

pos = scan('positive-words.txt',
           what='character', comment.char=';', sep="\n")

neg = scan('negative-words.txt',
           what='character', comment.char=';', sep="\n")

key = scan('key-words.txt',
           what='character', comment.char=';', sep="\n")

#you can add more words to the list

pos.words = c(pos, 'awsm')

neg.words = c(neg, 'wait', 'lol')

key.words = c(key, 'graphics')

#######################################################################

#function to calculate sentiment per review

#here we pass the reviews and a dictionary of words to the function. The function breaks each
#review into tokens and counts the occurrences of the dictionary words. If the dictionary is the
#positive-words dictionary, the returned value is the positive score of the review; with the
#negative-words dictionary, it is the negative score.

#######################################################################

score.sentiment = function(sentences, dic.words, .progress='none')
{
  require(plyr)
  require(stringr)

  scores = laply(sentences, function(sentence, dic.words) {

    # clean the data: drop punctuation and digits
    sentence = gsub('[[:punct:]]', '', sentence)
    # replace control characters (including newlines) with a space,
    # so that words on adjacent lines are not glued together
    sentence = gsub('[[:cntrl:]]', ' ', sentence)
    sentence = gsub('\\d+', '', sentence)

    # and convert to lower case:
    sentence = tolower(sentence)

    # split into words. str_split is in the stringr package
    word.list = str_split(sentence, '\\s+')

    # sometimes a list() is one level of hierarchy too much
    words = unlist(word.list)

    # compare our words to the supplied dictionary (positive or negative terms)
    dic.matches = match(words, dic.words)

    # match() returns NA for words that are not in the dictionary
    dic.matches = !is.na(dic.matches)

    # and conveniently enough, TRUE/FALSE will be treated as 1/0 by sum():
    score = sum(dic.matches)

    return(score)

  }, dic.words, .progress=.progress)

  scores.df = data.frame(review=sentences, Senti_Score=scores)

  return(scores.df)
}

#######################################################################

#function to fetch only important reviews

#here we pass the important keywords along with the reviews. The function counts how many
#keywords are found in each review and returns a data frame. We can eliminate unnecessary
#reviews by looking at their importance score

#######################################################################

impReviews = function(sentences, key.words, .progress='none')
{
  require(plyr)
  require(stringr)

  scores = laply(sentences, function(sentence, key.words) {

    # clean the data: drop punctuation and digits
    sentence = gsub('[[:punct:]]', '', sentence)
    # replace control characters (including newlines) with a space,
    # so that words on adjacent lines are not glued together
    sentence = gsub('[[:cntrl:]]', ' ', sentence)
    sentence = gsub('\\d+', '', sentence)

    # and convert to lower case:
    sentence = tolower(sentence)

    # split into words. str_split is in the stringr package
    word.list = str_split(sentence, '\\s+')

    # sometimes a list() is one level of hierarchy too much
    words = unlist(word.list)

    # compare our words to the list of important keywords
    key.matches = match(words, key.words)

    # match() returns NA for words that are not keywords
    key.matches = !is.na(key.matches)

    # and conveniently enough, TRUE/FALSE will be treated as 1/0 by sum():
    score = sum(key.matches)

    return(score)

  }, key.words, .progress=.progress)

  scores.df = data.frame(review=sentences, Imp_Score=scores)

  return(scores.df)
}

#######################################################################

#test data

#######################################################################

freeText1 = "Xiaomi played a Trick here but i am not sure if it would work. This phone is Actually The Redmi Note 4G, but with a Different name and an extra Sim Slot….What is New in this then ?? Same 1 year Old Model

I dont understand WHY to launch an already Discontinued devices when you have a Lot of devices (Mi 5 which might never launch i guess)

Xiaomi India is taking credit of it but i dont see anything NEW as such in the phone.

The Company is NOT launching any good phone now like REDMI NOTE 3 & REDMI NOTE 3 PRIME , due to Legal issues.

You launch Good devices in CHINA and launch such devices which dont sell there anymore ,to dispose off in India.

I WOULD NOT RECOMMEND THIS DEVICE. And customers should make it clear to such brand that there are Many other Brands Which we can Opt. Its not that Only Xiaomi is the One in Market.

If INDIAN Customers have given Xiaomi that Market BOOST , They can take that Back too.

Atleast it should keep in Mind that INDIAN users are Not to be Served an OUTDATED phone. We want better specifications too which have been launched worldwide.

This launch by Xiaomi shows they just want to OUTSTOCK their Old phones , and surprisingly customers are happy with this also.

I dont find any good reason to Buy this phone being an model 1 year Old , just a Big Publicity Launching would Not make it go Far..!!

THIS IS OUTDATED ….Not recommended at All !!!

LIKE THIS COMMENT TO SEND A MESSAGE THAT EVEN WE WANT UPDATED PHONES WHICH ARE LAUNCHED WORLDWIDE AND SUCH DEVICES ARE NOT ACCEPTED BEING OUT OF MARKET TRENDS…!!!"

freeText2 = "hi Dear 5 star Keyboard warriors .Please read my reviews and give expert advice . I brought this phone last week and in 3 days these are the defects i found

1 . ii have attached the screen shot for reference . This phone has a media server app that takes almost 70 % of your battery . This app cannot be force stopped (this app is used to scan all media files and refresh in your gallery). So if you charge your phone 100 % it will be 65 % in jus half and hour even on standby because of this app. And the back panel of phone heats so much that its is very useful during winter to keep you warm :P. Trust me my pant gets warm as soon i slide my phone in pocket in jus 3 minutes.Mounting External SD card will make the media server app go worse . So solution is dont add any files on your phone to keep media server quite and get high battery life..

2 The apps cannot be moved to memory card so you have to use the only 11 gb space available in phone for apps (half will be consumed by whats app 😛 ) . and don even think of rooting the phone ,if you root by seeing the you tube videos (MOST ARE ONLY FOR REDMI PRIME /REDMI NOTE 4G) .If u root it any ways you wont get any updates to install.

3 there is no search option in music player provided in phone , so if u like that one song among 700 songs you have to scroll way down to get that one song to listen.So if you install any other music app which has search option, it will only scan your internal storage songs not external storage songs.

I have reported all these bugs to xiaomi , still no action taken ..oooops i forgot why will they take any action .. i already got scammed with 8500 rs by them 😛

online chat and support numbers also dont respond 😀 only updates and fixes can save my phone.

this is all i could find the most non user friendly thing in this outdated phone in 3 days . will post more on this .

OVERALL I HAVE MADE YET ANOTHER STUPID DECISON IN MY LIFE AND STILL LAUGH ABOUT IT :D"

freeText3 = "Please do not purchase this product. Too much heat generation on while using net and calling. Not user friendly. Not getting proper connectivity When comparing to other brands"

sample = c(freeText1,freeText2,freeText3)

#######################################################################

#call all functions

#######################################################################

result_senti_pos = score.sentiment(sample, pos.words)

View(result_senti_pos)


result_senti_neg = score.sentiment(sample, neg.words)

View(result_senti_neg)


result_imp_count = impReviews(sample, key.words)

View(result_imp_count)
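The three tables can then be combined to label each review and to drop reviews that mention no keywords. The sketch below is only one possible way to do it: the helper combine.scores, its min.imp threshold and the "positive minus negative" rule are my assumptions, and the demo scores are placeholders, not output from the reviews above.

```r
# Hypothetical helper: merge the three result tables into one verdict table.
# It expects the column names produced by score.sentiment() and impReviews().
combine.scores = function(pos.df, neg.df, imp.df, min.imp = 1) {
  out = data.frame(
    review    = pos.df$review,
    Pos_Score = pos.df$Senti_Score,
    Neg_Score = neg.df$Senti_Score,
    Imp_Score = imp.df$Imp_Score,
    stringsAsFactors = FALSE
  )
  # Assumed rule: net score = positive matches minus negative matches.
  out$Net_Score = out$Pos_Score - out$Neg_Score
  out$Verdict = ifelse(out$Net_Score > 0, "positive",
                ifelse(out$Net_Score < 0, "negative", "neutral"))
  # Keep only reviews that mention at least min.imp keywords.
  out[out$Imp_Score >= min.imp, , drop = FALSE]
}

# Placeholder inputs with made-up scores, just to exercise the helper.
demo.reviews = c("review A", "review B", "review C")
demo.pos = data.frame(review = demo.reviews, Senti_Score = c(3, 1, 2))
demo.neg = data.frame(review = demo.reviews, Senti_Score = c(1, 4, 2))
demo.imp = data.frame(review = demo.reviews, Imp_Score = c(2, 0, 1))

demo.out = combine.scores(demo.pos, demo.neg, demo.imp)
print(demo.out)
```

With the objects from this post, the call would be combine.scores(result_senti_pos, result_senti_neg, result_imp_count).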


We will discuss spam filtering in the next post.


Thanks for visiting my blog. I always love to hear constructive feedback. Please give your feedback in the comment section below or write to me personally here.
