
FakeScore: A Bayesian Approach to Spotting Fake Reviews

Arjun V. Balasingam, Yianni D. Laloudakis, and Yash H. Vyas


{arjunvb, jlalouda, yashvyas}@stanford.edu

Abstract
In this project, we approach the problem of detecting fake reviews using the text of each review along with
information about the background of the person writing the review and the listing for which the review is
written. The challenge lies in representing reviews as features that a machine learning model can use to
classify them as fake or not fake. We propose a Bayesian approach that models the dependencies among the
review, person, and listing information to arrive at a score quantifying how likely a given review is to be
fake. We then compare fake review detection accuracy against a baseline model and discuss the performance
of multiple methods that convert reviews into features.

1 Introduction

Online reviews are a key resource for customers who are searching for the ideal product or service, and for
businesses who are trying to track their performance or attract customers. Because of the impact of online
reviews on customer and business decision making, ensuring the veracity of reviews is of interest to both
customers and businesses.
However, dishonest businesses or customers personally write, or hire others to write, fake reviews in order to
alter the reputation of a product or service. As a result, consumers could, for instance, form an inaccurately
optimistic impression of a product if a seller deliberately clutters a site with reviews that excessively
promote a mediocre product, for example by comparing its performance to that of top-of-the-line brands. Thus, the
task of separating authentic reviews from fake ones is a valued, high-impact problem. In recent months,
Amazon has cracked down on people who buy or sell fake review services and has begun reworking its ranking
algorithm to prioritize genuine reviews [1]. Our project is motivated by the scale and commercial significance
of this problem area, as well as the text and feature modeling challenges posed by the research problem.

1.1 Problem Statement

In this project, given a set of reviews, we plan to:


• Identify features that (1) capture behavioral patterns in reviewers; and (2) capture text similarity and
generality across reviews.
• Explore various approaches to relate the features and model the problem.
• Compare our feature selection and modeling approaches to state-of-the-art techniques presented in
prior work.

2 Data

The problem outlined in Section 1 applies to a broad class of review authenticity classification problems,
and the techniques we explore in this paper can be generalized across domains. However, in most cases, we
never know for sure which reviews are fake and which are not; therefore, we have no ground truth to build
supervised algorithms and quantify our accuracy.
After perusing a number of datasets from Amazon, Yelp, and TripAdvisor, we chose to focus on a
Yelp dataset containing reviews of restaurants around the country. This was one of the few datasets that
provides what are considered gold-standard labels for a problem in which ground truth is notoriously difficult
to obtain. The dataset consists of 67,019 reviews written by 16,677 reviewers for 129 restaurants across
the USA. Fake reviews are uniformly distributed within the dataset. From the dataset, we form a (reviewID,
reviewerID, restaurantID) triple for each review (to avoid confusion, we refer to these three components as
text (T), person (P), and restaurant (R), respectively), and split the data into a train and test set with a 3:1
ratio. Care was taken to ensure that all reviews from a single person ended up on the same side of this split,
so that our test data would not be contaminated by data the models were trained on. We then feed this data
into our feature extraction pipeline, which is described in Section 3.
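
As an illustration of this reviewer-grouped split, the following sketch uses scikit-learn's GroupShuffleSplit on a hypothetical pandas dataframe; the column names mirror the triple above, but the exact schema is an assumption rather than our actual data layout.

```python
# Sketch: reviewer-grouped train/test split, assuming a pandas dataframe
# `reviews` with columns reviewID, reviewerID, restaurantID (hypothetical names).
import pandas as pd
from sklearn.model_selection import GroupShuffleSplit

def split_by_reviewer(reviews: pd.DataFrame, seed: int = 0):
    # Hold out roughly a quarter of the reviewers (and all of their reviews)
    # for testing, approximating the 3:1 split described above.
    splitter = GroupShuffleSplit(n_splits=1, test_size=0.25, random_state=seed)
    # Grouping by reviewerID keeps all reviews by one person on the same side.
    train_idx, test_idx = next(splitter.split(reviews, groups=reviews["reviewerID"]))
    return reviews.iloc[train_idx], reviews.iloc[test_idx]
```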

3 Feature Extraction

We designed a modular feature extractor that builds feature vectors for each of the three components in our
data triples: person, restaurant, and text. In this section, we describe our feature selection approach for
each of these components.

3.1 Person [P]

We select features that might contribute to the spamicity of a reviewer, which we define as that person's
tendency to write fake reviews, based on their history (as available to us from our dataset). Performing
some additional computation on the reviewer metadata provided in our dataset, we use the following features:
• number of reviews that person wrote
• number of people who found that person useful
• number of friends and fans that person had
• average rating that person gave to restaurants
• average number of months between reviews
Since some reviewers in our dataset did not have an associated profile, we use mean imputation to complete the
feature vector. We select features that capture a reviewer's behavioral patterns. For instance, reviewers that
consistently give either very high or very low ratings are more likely to be "fake" reviewers. People that
push many reviews within a short burst of time are also more likely to write fake reviews.
By scaling the features and applying lasso (L1) regularization, we determined that a large value for the
following features disposed a reviewer to being non-spammy: number of reviews written, number of times
found useful, number of friends, and average rating. Interestingly, we found that the average rating feature
followed two different distributions depending on whether that person was considered spammy.
Finally, we assign a spamicity score to each reviewer, since this is a component of our models discussed in
Section 5. We assign a score of 1 if a reviewer has ever written a fake review, and 0 otherwise.
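
A minimal sketch of this reviewer featurization step is shown below; it is not our exact pipeline, and the column names and hyperparameters are illustrative. Mean imputation fills missing profile fields, and an L1-penalized logistic regression plays the role of the lasso-style selection described above.

```python
import pandas as pd
from sklearn.impute import SimpleImputer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Hypothetical column names for the five reviewer features listed above.
PERSON_COLS = ["n_reviews", "n_useful", "n_friends_fans", "avg_rating", "avg_months_between"]

def fit_person_model(reviewers: pd.DataFrame, y_spammy):
    # y_spammy is 1 if the reviewer ever wrote a fake review, and 0 otherwise.
    model = make_pipeline(
        SimpleImputer(strategy="mean"),   # mean imputation for missing profiles
        StandardScaler(),                 # scale features before regularization
        LogisticRegression(penalty="l1", solver="liblinear"),  # lasso (L1) selection
    )
    model.fit(reviewers[PERSON_COLS], y_spammy)
    return model
```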

3.2 Restaurant [R]

We use a similar approach to describe the spamicity of a restaurant, which, like the spamicity of a reviewer,
is defined as the tendency for that restaurant to attract fake reviews. We featurize restaurant metadata into
the following features:
• average rating assigned to restaurant
• average time between reviews
• number of reviews written about restaurant
• whether or not restaurant accepts credit cards
As with the reviewers, we assign a spamicity score to each restaurant: a restaurant is labeled as "spammy"
if at least 10% of its reviews are labeled as fake.

3.3 Text [T]

As a first step, we extracted some simple features from the review text that characterize a fake review.
The label for the submodel that handles text spamicity is the same as the overall label for the review.
The simple text features capture behavioral patterns observed in the dataset. From [6, 7], we find that
fake reviews are less useful, have extreme ratings, talk about general themes rather than specific incidents,
and contain unwanted profanities. [10] also reports that the average length of a fake review is significantly
shorter than that of a credible review. A credible review has a rich vocabulary, while fake reviews tend to
use poor grammar and improper punctuation. Based on these characteristics, we create a feature vector
consisting of the rating, the number of people who found the review useful, the number of positive words in
the review not preceded by "not", the number of profanities, the number of personal references, the number of
exclamation marks, the mean word length, the mean sentence length, and the number of sentences. Predetermined
lists of positive words, profanities, and self-references were used to create these features.
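
The sketch below illustrates one way such a simple text featurizer could be written; the word lists are small placeholders for the predetermined lists mentioned above, and the exact tokenization is an assumption.

```python
# A minimal sketch of the simple text featurizer; the word lists and helper
# names are placeholders, not the lists used in our experiments.
import re

POSITIVE_WORDS = {"great", "amazing", "delicious"}   # placeholder list
PROFANITIES = {"damn"}                               # placeholder list
SELF_REFERENCES = {"i", "me", "my", "we", "our"}     # placeholder list

def simple_text_features(text: str, rating: float, useful_votes: int):
    words = re.findall(r"[a-z']+", text.lower())
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    # Pair each word with its predecessor to check for a preceding "not".
    bigrams = zip(["<s>"] + words, words)
    n_positive = sum(1 for prev, w in bigrams if w in POSITIVE_WORDS and prev != "not")
    return [
        rating,
        useful_votes,
        n_positive,
        sum(w in PROFANITIES for w in words),
        sum(w in SELF_REFERENCES for w in words),
        text.count("!"),
        sum(len(w) for w in words) / max(len(words), 1),   # mean word length
        len(words) / max(len(sentences), 1),               # mean sentence length (in words)
        len(sentences),
    ]
```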

4 Text Modeling

The simple text features described in Section 3.3 capture behavioral patterns in review text. In this section,
we present more sophisticated approaches to quantify similarity across reviews and identify use of general
language in the reviews.

4.1 Latent Dirichlet Allocation (LDA)

Using the method described in [4], we try to identify whether fake reviews have some common topics
associated with them. We fix the number of topics at 500 and use LDA to uncover the hidden topics in
fake reviews. This method helps abstract the rich text of a review into a vector of known size, and in turn
helps identify whether any set of topics is associated with fake reviews. A schematic of LDA is shown in
Figure 1a.
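
A minimal sketch of this topic featurization with gensim's LdaModel is shown below, assuming reviews have already been tokenized; the preprocessing details are not spelled out in this paper and are assumptions here.

```python
# Sketch: 500-topic LDA features for tokenized reviews, using gensim.
from gensim.corpora import Dictionary
from gensim.models import LdaModel

def lda_features(tokenized_reviews, num_topics=500):
    dictionary = Dictionary(tokenized_reviews)                     # word <-> id mapping
    corpus = [dictionary.doc2bow(doc) for doc in tokenized_reviews]
    lda = LdaModel(corpus=corpus, id2word=dictionary, num_topics=num_topics)
    # Represent each review as a fixed-size vector of topic proportions.
    features = []
    for bow in corpus:
        vec = [0.0] * num_topics
        for topic_id, prob in lda.get_document_topics(bow, minimum_probability=0.0):
            vec[topic_id] = prob
        features.append(vec)
    return features
```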

4.2 Term Frequency-Inverse Document Frequency (TF-IDF)

We use this method to assign weights to the individual words used in the reviews. TF-IDF is a bag-of-words
representation in which weights reflect the importance of a word relative to a document in a corpus. We
eliminate rare words that occur in less than 1 percent of the fake reviews, and we discard high-frequency
terms that occur in more than 20 percent of the data. The purpose of this method is to identify words that
are highly associated with fake reviews. We calculate a weight for each word t in each document d ∈ D, where
D is the larger document set.
First, we compute the term/word frequency (TF), where weights are proportional to the term frequency f_{t,d}:

tf(t, d) = 0.5 + 0.5 · f_{t,d} / max{ f_{t',d} : t' ∈ d }

Figure 1: (a) Schematic of Latent Dirichlet Allocation (LDA). This technique characterizes text with a
set of topics, and builds a probabilistic model between topics and words. (b) Schematic of Word2Vec [3].
This technique constructs a bag-of-words based on review text, and then builds a vector for each word that
contextualizes that word with respect to the rest of the document.

Then, we compute the inverse document frequency (IDF), where weights are inversely proportional to the
document frequency, as given by:
idf(t, D) = log ( |D| / |{d ∈ D : t ∈ d}| )

Finally, we combine the two to compute the term frequency-inverse document frequency (TF-IDF):

tfidf(t, d, D) = tf(t, d) × idf(t, D)
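
For concreteness, the following sketch implements the three formulas above directly. It is illustrative only; the implementation described in Section 6 uses scikit-learn's TfidfVectorizer, whose default term-frequency weighting differs from the 0.5-augmented form shown here.

```python
# Direct implementation of the augmented tf, idf, and tf-idf formulas above.
import math
from collections import Counter

def tf(term, doc_tokens):
    counts = Counter(doc_tokens)
    # 0.5-augmented term frequency, normalized by the most frequent term in d.
    return 0.5 + 0.5 * counts[term] / max(counts.values())

def idf(term, docs):
    # docs is a list of token lists; assumes `term` appears in at least one.
    n_containing = sum(1 for d in docs if term in d)
    return math.log(len(docs) / n_containing)

def tfidf(term, doc_tokens, docs):
    return tf(term, doc_tokens) * idf(term, docs)
```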

4.3 Word2Vec

Word2Vec [3] (Figure 1b) is a neural network implementation that learns distributed representations for
words. Rather than treating words as individual elements, this technique lets us represent every word based
on the context in which it appears in the document. In our model, we represent each word in a review as a
vector of size 300 and consider a context window of 10 words. The vector associated with a review is the
average of the vectors of all words observed in that review. The purpose of using this method for feature
extraction is to identify a consistent structure across reviews and develop a characteristic theme of a
fake review.
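
A sketch of this representation with gensim is shown below; the vector size (300) and context window (10) match the description above, while the remaining training parameters are assumptions.

```python
# Sketch: 300-dimensional Word2Vec review vectors obtained by averaging word vectors.
# (gensim >= 4 uses `vector_size`; older versions call this parameter `size`.)
import numpy as np
from gensim.models import Word2Vec

def word2vec_features(tokenized_reviews):
    model = Word2Vec(sentences=tokenized_reviews, vector_size=300, window=10, min_count=1)
    vectors = []
    for doc in tokenized_reviews:
        words = [w for w in doc if w in model.wv]
        # Average the vectors of the observed words to get one vector per review.
        vectors.append(np.mean([model.wv[w] for w in words], axis=0) if words
                       else np.zeros(300))
    return np.vstack(vectors)
```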

5 Modeling Authenticity

Using the approaches discussed in Sections 3 and 4, we have been able to translate metadata and review text
associated with a particular (reviewID, reviewerID, restaurantID) triple to quantifiable feature vectors.
We can design models that use these features as inputs to classify a triple as either fake or not fake.

5.1 Notation

We define our feature vectors for a text, person, restaurant (T ,P ,R) triple as φT , φP , and φR , respectively.
We define the overall spamicity score S ∈ {0, 1}, which flags a (T ,P ,R) triple as either fake (1) or not fake
(0). Similarly, we define intermediate spamicity scores for each component of the triple: Si ∈ {0, 1} ∀ i ∈
{T, P, R}.

Figure 2: Baseline models: (a) only text features; (b) person, restaurant, and text features. In both cases,
we predict the probability of a review having spamicity score S = 1. Note that both are represented as
simple single-node Bayesian networks, each with a single local conditional probability; this is simply to
remain consistent with the more complex networks presented in Figure 3.

We design our models to ultimately determine P(S | φT, φP, φR), the probability that a (T,P,R) triple is
"spammy" given its features. The models described in subsequent sections require intermediate probabilities,
which we define using similar notation.

5.2 Baseline Models

A naïve baseline would be to always predict the majority class. Since most reviews are not fake, we could,
by default, assign this label to every review. Note that this would achieve an accuracy of about 87% on our
dataset; however, the two label classes likely have different proportions in the true population. It also
needs to be noted that the sensitivity of this majority classifier is 0%.
For our first model, we train a support vector machine (SVM) on the simple text features discussed in Section
3.3. A schematic is shown in Figure 2a. Our SVM minimizes a hinge loss function, given by
L(t, y, wT) = max{ 1 − (wT⊤ φT(t)) y, 0 }    (1)
where t is the review text, y is the true label, and wT is the weight vector corresponding to the simple
text feature vector. We can then compute the probability that the text has a spamicity score ST = 1 using
Platt scaling [2].
The goal of this model is to understand whether this review classification problem can be addressed solely
with simple text features, or whether it can be enhanced by more complex text models and by information about
the reviewers and restaurants.
For our second preliminary model, we trained a logistic regression based on simple text features, restaurant
features, and reviewer features. A schematic is shown in Figure 2b. Our logistic regression minimizes a
logistic loss function given by
L(t, p, r, y, w) = −[ y log (σ(w⊤ φ(t, p, r))) + (1 − y) log (1 − σ(w⊤ φ(t, p, r))) ]    (2)
where t is the review text, p is the reviewer, and r is the restaurant. The feature vector φ = [φP φR φT]⊤ is
a stacked vector consisting of the individual feature vectors for the reviewer, restaurant, and text. The
corresponding weight vector is w. Logistic regression returns a probability, so the output of this regression
model is the probability that a (T,P,R) triple has a spamicity score of S = 1.
In this model, we enforce no probabilistic dependence between the person, text, and restaurant. We compare
this model to the previous (simple text only) model in order to understand the impact of adding person and
restaurant metadata on the overall performance.
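
The following sketch shows how these two baselines could be set up with scikit-learn; variable names are illustrative, and probability=True is what enables Platt-scaled probability estimates for the SVM.

```python
# Sketch of the two baselines: an SVM on simple text features with Platt-scaled
# probabilities, and a logistic regression on the stacked feature vector.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.svm import SVC

def train_text_svm(phi_T, y):
    # probability=True fits a Platt-scaling layer on top of the hinge-loss SVM.
    svm = SVC(kernel="linear", probability=True)
    svm.fit(phi_T, y)
    return svm          # svm.predict_proba(x)[:, 1] gives P(S_T = 1 | text)

def train_joint_lr(phi_P, phi_R, phi_T, y):
    phi = np.hstack([phi_P, phi_R, phi_T])   # stacked [person, restaurant, text] features
    lr = LogisticRegression(max_iter=1000)
    lr.fit(phi, y)
    return lr           # lr.predict_proba(x)[:, 1] gives P(S = 1 | features)
```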

5.3 Bayesian Network without Interactions

Our new models largely revolve around incorporating prior information about how a reviewer, text, and
restaurant might impact the combination's overall spamicity score S. Our first approach is shown in Figure
3a. We consider three independent nodes SP, ST, and SR, which together contribute to the overall spamicity
S. Note that the independence assumption SP ⊥ ST ⊥ SR is naïve; we address this in Section 5.4.

Figure 3: Bayesian networks enforcing relationships among the P, R, and T variables: (a) Bayesian Network
without Interactions; (b) Bayesian Network Capturing Interactions. In both cases, we predict the probability
of a review having spamicity score S = 1. The directional relationships and the quantities being conditioned
on differ between the two models.
To solve this Bayesian network (i.e., to compute P(S | SP, SR, ST)), we must compute four local conditional
probabilities, as shown in Figure 3a. First, we compute the probabilities of a person, text, and restaurant
each having a spamicity score Si = 1 for i ∈ {T, P, R}. For the SP and SR models, we train logistic regression
models with the loss function given by Equation 2. Since the ST model has a significantly larger feature
space (due to the more complex text modeling techniques discussed in Section 4), we train an SVM with the loss
function given by Equation 1 and a linear kernel. To compute the final probability p(s | sP, sR, sT), we
train a logistic regression model whose features are the spamicity probabilities computed in the previous
step.
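
A sketch of this two-stage procedure is shown below, with component labels and feature matrices as assumed inputs; it mirrors the description above rather than reproducing our exact code.

```python
# Sketch of the no-interaction network: component models produce spamicity
# probabilities, which become the features of a final logistic regression.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.svm import SVC

def train_bn_no_interactions(phi_P, phi_R, phi_T, y_P, y_R, y_T, y):
    m_P = LogisticRegression(max_iter=1000).fit(phi_P, y_P)       # P(S_P = 1 | phi_P)
    m_R = LogisticRegression(max_iter=1000).fit(phi_R, y_R)       # P(S_R = 1 | phi_R)
    m_T = SVC(kernel="linear", probability=True).fit(phi_T, y_T)  # P(S_T = 1 | phi_T)

    # Stack the three component probabilities as features for the top node S.
    probs = np.column_stack([
        m_P.predict_proba(phi_P)[:, 1],
        m_R.predict_proba(phi_R)[:, 1],
        m_T.predict_proba(phi_T)[:, 1],
    ])
    m_S = LogisticRegression().fit(probs, y)                      # P(S = 1 | s_P, s_R, s_T)
    return m_P, m_R, m_T, m_S
```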

5.4 Bayesian Network Capturing Interactions

In this model, we apply the insight that a review's text likely depends on the person writing the review and
the restaurant being written about. Our modified Bayesian network that captures these interactions is shown
in Figure 3b.
To solve this Bayesian network (i.e., to compute P(S | SP, SR, ST)), we must again compute four local condi-
tional probabilities, as shown in Figure 3b. We compute p(sP | φP) and p(sR | φR) in the same way presented
in Section 5.3. For our ST model, we again train an SVM on the text feature vector φT, but also include as
features the spamicity probabilities for SP and SR. The final probability p(s | sP, sR, sT) is computed using
the same procedure described in Section 5.3.
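
The only change relative to the previous sketch is the text model, which additionally sees the person and restaurant spamicity probabilities, as sketched below (variable names are illustrative).

```python
# Sketch of the interaction model's text node: the text SVM also sees the
# person and restaurant spamicity probabilities as extra features.
import numpy as np
from sklearn.svm import SVC

def train_text_with_interactions(phi_T, p_person, p_restaurant, y_T):
    # Append P(S_P = 1) and P(S_R = 1) to each review's text feature vector.
    phi_T_aug = np.column_stack([phi_T, p_person, p_restaurant])
    return SVC(kernel="linear", probability=True).fit(phi_T_aug, y_T)
```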

6 Implementation Details

We implemented all of our data processing, feature extraction, and model training and testing in Python.
The raw data was stored in SQL database files; we converted these to CSVs, and imported into Python as
pandas dataframes and standard Python lists. Our data processing module involved manipulation and
computation using these data structures. We then saved our final, processed data into separate CSVs for
train and test partitions.
The feature extraction phase loaded these CSV data files back into pandas dataframes and lists. Simple
features were generated by deriving quantities from the existing data (which, in code, involved computations
or searches across multiple dataframes/lists). For the more complex text features, we used Python packages,

Model                               Not Fake (Specificity)   Fake (Sensitivity)   Overall Accuracy
Text SVM (Behavior)                 75.8%                    46.4%                72.2%
LR (Behavior)                       88.7%                    70.4%                86.5%
BN, No Interactions (Behavior)      92.8%                    68.7%                87.8%
BN, With Interactions (Behavior)    94.5%                    66.9%                91.1%
BN, No Interactions (LDA)           93.0%                    69.0%                90.1%
BN, With Interactions (LDA)         92.5%                    67.7%                89.5%
BN, No Interactions (TF-IDF)        92.5%                    67.6%                91.2%
BN, No Interactions (word2vec)      92.9%                    68.5%                89.9%
BN, With Interactions (word2vec)    92.6%                    67.7%                89.6%

Table 1: Summary of results for different feature sets and modeling types. Accuracy is reported by type:
specificity (not fake), sensitivity (fake), and overall.

including scikit-learn's TfidfVectorizer for TF-IDF and gensim for LDA and Word2Vec. In each case, feature
matrices were saved to CSVs.
The modeling unit loads these feature CSVs and uses the sklearn package in Python to train LogisticRegression
and SVM models. All of our intermediate models save CSV files containing the probabilities for each data point
trained or tested.

7 Results and Error Analysis

In this section, we summarize our results, based on the feature extraction methods discussed in Section 3
and the authenticity models presented in Section 5. We first characterize the performance of our baselines,
then discuss the new Bayesian network models, and conclude with an error and performance analysis.
Our results are summarized in Table 1. Note that we present specificity (true negative rate), sensitivity (true
positive rate), and overall accuracy. Our positive labels are fake reviews. We are most interested in tracking
the sensitivity, since that gives us a measure of how well our classifier performs on just the fake reviews; we
also study tradeoffs between specificity and sensitivity.

7.1 Baseline Models and Oracle

The SVM we trained on simple text features had the poorest performance, achieving an overall accuracy of
72.2%, and sensitivity of 46.4%. Thus, it correctly classifies less than half of the fake reviews in our test set.
Our logistic regression model with all person, simple text, and restaurant features as inputs performs signif-
icantly better, achieving a sensitivity of 70.4% and a specificity of 88.7%. Thus, our classifier benefits from
the addition of person and restaurant information.
We understand from [8] that Yelp's internal algorithm is aggressive in classifying reviews as fake, i.e.,
Yelp prefers to flag a genuine review as fake if that review has characteristics associated with fake reviews.
Any algorithm that trains on Yelp's labels can, at best, aim to match Yelp's accuracy. However, the robustness
of Yelp's classifier makes it an appropriate oracle.

7.2 Bayesian Network Representation

The conditioned Bayesian Network presented in Section 5.4 makes a valid assumption that the spamicity
score of a review is influenced by both the reviewer and the restaurant being written about. We can quantify
the effectiveness of this technique. Consider, for instance, the following review, which was labeled as fake
by Yelp:
I was disappointed with my pizza. I think this place is very overrated. Im sure if i got another
pizzza with toppings of my choosing i might be a little more satisfied . But then again i liked most
of the toppings minus the mushrooms. Maybe im just not a deep dish pizza lover. The service

was great and the staff was super friendly. Expect the lady who was the parking valet,she was jerk.

Our baseline model assigns this review, based on the text alone, a spamicity probability of 0.22. However,
upon applying the Bayesian Network that accounts for interactions, i.e., computing the spamicity probability
sT conditioned on the text features as well as the restaurant and person information, the probability of
spamicity rises to 0.71.

Figure 4: Receiver Operating Characteristic (ROC) curves for the spamicity models: (a) Bayesian Network
without Interactions; (b) Bayesian Network Capturing Interactions. The baseline is plotted for reference.
Each point on a curve corresponds to a (true positive rate, false positive rate) pair for a classification
model with a unique cutoff threshold probability η.
Using topic models such as LDA to extract features from the raw text increased the overall accuracy from 86%
to 90% when interactions among the nodes SP, ST, and SR are captured. However, there was no significant
difference in the specificity and sensitivity measures, which implies that the topics associated with fake
and credible reviews are very similar. The TF-IDF and Word2Vec models gave higher accuracies of 91.2% and
89.9%, respectively, compared to the baseline. However, even in these cases there was no significant increase
in the specificity and sensitivity measures. The lack of significant improvement in these measures suggests
that there is no clear theme to fake reviews. This is corroborated by the fact that, unless a user has visited
the restaurant, it is unlikely that a human could identify all the fake reviews from the raw text alone.

7.3 Selecting a Cutoff Threshold

Our model designs were informed by the probability that a (P,R,T) triple has a spamicity score S = 1.
However, labeling a review as fake (1) or not fake (0) requires making a decision based on this computed
probability. We can represent this choice using the function
1[ p(s | sP, sR, sT) ≥ η ]
where the indicator function returns 1 if a review is fake, and 0 otherwise. We refer to η as the cutoff
threshold probability.
Choosing an ideal value of η is subjective, and is a decision that businesses would make depending on how
aggressive they would like their fake review scoring software to be. Based on a literature survey [8], we
notice that a lower, more aggressive cutoff threshold η is more desirable. As noted in Section 1, customer
business is largely determined by these reviews, and the presence of a small fraction of fake reviews could
skew a customer’s impression of the product/service in the wrong direction.

We also take a more rigorous and principled approach to selecting η. Figure 4 shows Receiver Operating
Characteristic (ROC) curves for the two Bayesian Network models presented in Section 5. These curves allow
us to study the tradeoff between true positives and false positives. Each point on the curve corresponds to
a unique threshold η. Ideally, we would like to be at the top left corner of the plot, where the true positive
probability is 1 and the false positive probability is 0. A good threshold is one that is closest to this top-left
corner, since it maximizes the true positive rate, while minimizing the false positive rate. Incorporating prior
intuition that we would like an aggressive threshold, and based on the ROC curves, we select a threshold of
η = 0.25, which is consistent with those used in the literature [8]. The results summarized in Table 1 are
based on this value of η.
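
The thresholding decision and the ROC points can be computed as in the following sketch, using scikit-learn's roc_curve; the variable names are illustrative.

```python
# Sketch: apply the cutoff η and trace the ROC curve, where `probs` are the
# final spamicity probabilities p(s | s_P, s_R, s_T).
import numpy as np
from sklearn.metrics import roc_curve

def classify(probs, eta=0.25):
    # Indicator function: label a review fake when its probability meets the cutoff.
    return (np.asarray(probs) >= eta).astype(int)

def roc_points(y_true, probs):
    # Each (fpr, tpr) pair corresponds to one candidate threshold eta.
    fpr, tpr, thresholds = roc_curve(y_true, probs)
    return fpr, tpr, thresholds
```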

7.4 Model Performance

Model performance can be assessed in two stages: (1) the efficiency of generating features from the text, and
(2) the efficiency of generating probability scores from those features using the logistic regression/SVM
models. LDA takes the longest time (> 10 hours) to generate features from the raw text, and then another
10 hours to compute spamicity scores using an SVM over the 500 LDA topics. Word2Vec takes about 30 minutes
to generate features for the review text, and then another 8 hours to compute spamicity scores from the same
raw text input using an SVM over the 300-dimensional review vectors. TF-IDF is the quickest to generate
spamicity scores, with a computational time on the order of 2 hours to generate the weights and run the SVM
on a sparse matrix with 1,125 input features. We recommend TF-IDF when computational performance is a
critical consideration.

8 Conclusions

In this project, we built a fake review classifier and evaluated our approaches on Yelp review data for restau-
rants. We showed that features capturing behavioral patterns, drawn from both reviewer information and the
review text itself, are extremely important. Further, formulating the model over a (review text, reviewer
information, restaurant information) triple improved fake review classification accuracy. Our specificity
is comparable to the current state-of-the-art methods described in [11, 10]. This is a wide-ranging, high-
impact problem, and we believe our modeling techniques can generalize to classify other types of reviews and
fake news [12].

9 Future Work

In this project, we explored a variety of feature extraction and modeling techniques. However, we only
evaluated performance on labeled data from Yelp. Since this problem is extremely relevant in many other
domains, as future work we would consider applying the same modeling techniques to other datasets. As
review content will differ across domains (e.g., topics covered in Amazon reviews will differ from those
covered in Yelp datasets), we would first need to build a language model from a substantially larger corpus
such as Wikipedia. We could then apply any of the text modeling techniques discussed in Section 4 to identify
similarity and generality across reviews. Training a language model on a broader corpus ensures better
performance of the text models in identifying patterns across reviews. We could then apply the Bayesian
Network authenticity models presented in Section 5. We could also consider training neural network-based
models that would inherently identify behavioral patterns across reviewers and text similarity patterns across
reviews.

10 Acknowledgements

We would like to thank Professor Bing Liu of the University of Illinois at Chicago for providing us with
labeled data from Yelp. We would also like to thank our project mentor Kratarth Goel and CS221 teaching
staff for valuable feedback.

References
[1] Sarah Perez. "Amazon sues more sellers for buying fake reviews", https://techcrunch.com/2016/10/27/amazon-sues-more-sellers-for-buying-fake-reviews/, 27 Oct. 2016.
[2] Hsuan-Tien Lin, Chih-Jen Lin, and Ruby C. Weng. A Note on Platt’s Probabilistic Outputs for Support
Vector Machines. Machine Learning. August 2011.
[3] Tomas Mikolov, Kai Chen, Greg Corrado, and Jeffrey Dean. Efficient Estimation of Word Representations
in Vector Space. arXiv.org. 16 January 2013.
[4] David M. Blei, Andrew Y. Ng and Michael I. Jordan. Latent Dirichlet Allocation. Journal of Machine
Learning Research. January 2003.
[5] Michael Crawford, Taghi M. Khoshgoftaar, Joseph D. Prusa, Aaron N. Richter and Hamzah Al Najada.
Survey of review spam detection using machine learning techniques. October 2015.
[6] Yinqing Xu, Bei Shi, Wentao Tian, and Wai Lam. A Unified Model for Unsupervised Opinion Spamming
Detection Incorporating Text Generality. International Joint Conference on Artificial Intelligence. July
2015.
[7] David M. Blei, Thomas L. Griffiths, Michael I. Jordan, and Joshua B. Tenenbaum. Hierarchical Topic
Models and the Nested Chinese Restaurant Process. Advances in Neural Information Processing Systems.
December 2004.
[8] Arjun Mukherjee, Vivek Venkataraman, Bing Liu, and Natalie Glance. Fake Review Detection: Classifi-
cation and Analysis of Real and Pseudo Reviews. UIC-CS Technical Report. March 2013.
[9] Arjun Mukherjee, Vivek Venkataraman, Bing Liu, and Natalie Glance. What Yelp Fake Review Filter
Might Be Doing. Proceedings of The International AAAI Conference on Weblogs and Social Media
(ICWSM). July 2013.
[10] Subhabrata Mukherjee, Sourav Dutta, and Gerhard Weikum. Credible Review Detection with Limited
Information using Consistency Features. European Conference on Machine Learning and Principles and
Practice of Knowledge Discovery in Databases. September 2016.
[11] Arjun Mukherjee, Abhinav Kumar, Bing Liu, Junhui Wang, Meichun Hsu, Malu Castellanos, and
Riddhiman Ghosh. Spotting Opinion Spammers using Behavioral Footprints. ACM SIGKDD Conference
on Knowledge Discovery and Data Mining. August 2013.
[12] Sapna Maheshwari. How Fake News Goes Viral: A Case Study. http://www.nytimes.com/2016/11/20/business/media/how-fake-news-spreads.html
