Fake Product Monitoring

SYNOPSIS
 OBJECTIVE
 EXISTING SYSTEM
 PROPOSED SYSTEM
 TECHNOLOGY USED HERE
 SYSTEM ARCHITECTURE
 WORKFLOW
 MODULE DESCRIPTION
 ADVANTAGES
 DISADVANTAGES
 CONCLUSION
 FUTURE SCOPE
 REFERENCES
OBJECTIVE
 Most people look for reviews before deciding on a product.
 On some review websites, favorable reviews are added by people from the product company itself in order to produce falsely positive product reviews.
EXISTING SYSTEM
 More than 80% of people check the reviews.
 50% rely on the ratings of the online product.
 30% of users compare the product's reviews with those of other products.
DRAWBACKS
 If the social media optimization team uses different IP addresses to send reviews, the system will fail to track the fake reviews.
PROPOSED SYSTEM
 For each product feature, we identify the review sentences that express an opinion, whether positive or negative.
 Spam reviews are then analyzed to detect fake or fraudulent reviews.
ADVANTAGES
 Users get genuine reviews about the product.
 Users can spend money on valuable products.
 Users can post their own reviews about the product.

TECHNOLOGY USED IN FAKE PRODUCT REVIEWS
• Sentiment Analysis
• Natural Language Processing
• Supervised & Unsupervised Learning
• Data Analysis
• Spam Review Detection
• Text mining
SENTIMENT ANALYSIS
 Sentiment analysis is one of the most common text classification tools.
 It analyzes an incoming message and tells whether the underlying sentiment is positive, negative, or neutral.
The levels involved in sentiment analysis are:
1. Document level
2. Aspect level
3. Sentence level
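The sentence and document levels can be illustrated with a minimal lexicon-based sketch. The word lists and the function names `sentence_sentiment` and `document_sentiment` are assumptions made for illustration; the slides do not say how the system scores sentiment, and aspect-level analysis is not shown here.

```python
import re

# Hypothetical mini-lexicons; a real system would use a much larger word list.
POSITIVE = {"good", "great", "excellent", "love", "amazing"}
NEGATIVE = {"bad", "poor", "terrible", "hate", "broken"}


def sentence_sentiment(sentence):
    """Sentence level: count lexicon hits and compare positive vs negative."""
    words = re.findall(r"[a-z']+", sentence.lower())
    score = sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


def document_sentiment(review):
    """Document level: majority vote over the sentence-level labels."""
    sentences = [s for s in re.split(r"[.!?]+", review) if s.strip()]
    labels = [sentence_sentiment(s) for s in sentences]
    positives = labels.count("positive")
    negatives = labels.count("negative")
    if positives > negatives:
        return "positive"
    if negatives > positives:
        return "negative"
    return "neutral"


print(sentence_sentiment("The screen is terrible"))
print(document_sentiment("Great camera. Great battery. Poor speaker."))
```

A lexicon count is only a rough baseline; it ignores negation ("not good") and sarcasm, which is one reason trained classifiers are usually preferred.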
NATURAL LANGUAGE PROCESSING
 Natural language processing (NLP) deals with the interaction between humans and computers through human language.
 Examples:
-Spell check
-Auto complete
-Spam filters
-Related keywords in search engines
-Voice-to-text messaging
 Steps involved in NLP:
1. Sentence segmentation
2. Word tokenization
3. Predicting parts of speech for each token
4. Text lemmatization
5. Identifying stop words
6. Dependency parsing
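Three of the steps above (sentence segmentation, word tokenization, and stop-word identification) can be sketched with the standard library alone. The regular expressions and the small `STOP_WORDS` set are simplifying assumptions. Part-of-speech tagging, lemmatization, and dependency parsing normally need a trained NLP library and are not shown.

```python
import re

# Small illustrative stop-word list (an assumption; real lists are much longer).
STOP_WORDS = {"the", "is", "a", "an", "and", "it", "of", "to", "this"}


def segment_sentences(text):
    """Step 1: split after '.', '!' or '?' when followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def tokenize(sentence):
    """Step 2: lowercase the sentence and extract word tokens."""
    return re.findall(r"[a-z0-9']+", sentence.lower())


def remove_stop_words(tokens):
    """Step 5: drop tokens that carry little meaning on their own."""
    return [t for t in tokens if t not in STOP_WORDS]


text = "This phone is great. The battery lasts two days!"
for sentence in segment_sentences(text):
    print(remove_stop_words(tokenize(sentence)))
```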
DATA ANALYSIS
 Data analysis is the process of applying statistical practices to:
- Organize
- Represent
- Describe
- Evaluate and
- Interpret the data
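As a small illustration of organizing and describing data, the sketch below summarizes a list of star ratings. The ratings are made-up sample values and `describe_ratings` is a hypothetical helper, not part of the presented system.

```python
from collections import Counter
from statistics import mean, median

# Made-up star ratings for one product (illustrative data only).
ratings = [5, 4, 5, 1, 5, 3, 5, 2]


def describe_ratings(values):
    """Organize the ratings into a distribution and describe them with summary statistics."""
    counts = Counter(values)
    return {
        "count": len(values),
        "mean": mean(values),
        "median": median(values),
        "distribution": dict(sorted(counts.items())),
    }


print(describe_ratings(ratings))
```

A distribution that is heavily skewed toward five stars, as in this sample, is the kind of pattern an analyst would then have to interpret.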
SUPERVISED & UNSUPERVISED LEARNING
 SUPERVISED LEARNING
It analyzes labeled training data and produces an inferred function. It is categorized into:
-Regression
-Classification
 UNSUPERVISED LEARNING
It tries to find the hidden structure in unlabeled data. It can be categorized into:
-Clustering
SPAM REVIEW DETECTION
 Spam is defined as any type of message or communication, originating from either a person or an organization, that is unsolicited and undesired.
 Types of spam are:
1. Email spam
2. Advertising articles
3. External link spamming
4. Citation spam
5. Product review spam
TYPES OF SPAM REVIEWS
1. Untruthful opinions
These are also known as fake reviews.
2. Reviews on brand only
These do not comment on the product itself, only on the brand, the manufacturer, or the seller.
3. Non-reviews
 Advertisements.
 Other irrelevant reviews containing no opinion.
TEXT MINING
 Text mining is also known as text data mining.
 It is the process of deriving high-quality information from text.
 The purpose is to process unstructured information and extract meaningful numeric indices from the text.
TEXT MINING PROCESS
 Text pre-processing
 Text transformation
 Feature selection
 Data mining
 Evaluation
 Applications
1. Web mining
2. Medical
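The pre-processing, transformation, and feature selection steps of the text mining process can be sketched as a bag-of-words pipeline, which turns each review into the numeric indices mentioned earlier. The function names, the `min_df` threshold, and the sample reviews are assumptions made for illustration.

```python
import re
from collections import Counter


def preprocess(text):
    """Text pre-processing: lowercase the text and keep alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def select_features(docs, min_df=2):
    """Feature selection: keep words that occur in at least min_df documents."""
    doc_freq = Counter()
    for doc in docs:
        doc_freq.update(set(preprocess(doc)))
    return sorted(word for word, n in doc_freq.items() if n >= min_df)


def transform(doc, vocabulary):
    """Text transformation: map a document to a vector of word counts."""
    counts = Counter(preprocess(doc))
    return [counts[word] for word in vocabulary]


# Made-up reviews (illustrative data only).
reviews = ["Great phone, great price", "Bad phone", "Great battery"]
vocabulary = select_features(reviews)
vectors = [transform(r, vocabulary) for r in reviews]
print(vocabulary)
print(vectors)
```

The resulting vectors are what a data mining step, such as one of the classifiers in this deck, would take as input.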
ALGORITHMS & LIBRARIES USED IN FAKE PRODUCT REVIEWS
 SUPPORT VECTOR MACHINE
 RANDOM FOREST CLASSIFIER
 NAIVE BAYES CLASSIFIER
 TENSORFLOW
SUPPORT VECTOR MACHINE
• Support vector machine (SVM) is a supervised machine learning algorithm.
• It can be used for both regression and classification problems.
• It performs classification by finding the hyperplane that maximizes the margin between the two classes.
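To make the margin idea concrete, here is a minimal sketch of a linear SVM trained by sub-gradient descent on the regularized hinge loss. This is one simple training method among several and is not necessarily the one the presented system uses. The two-feature data set, the labels (+1 and -1), and the hyperparameters `lam`, `lr`, and `epochs` are assumptions for illustration.

```python
def train_linear_svm(X, y, lam=0.01, lr=0.1, epochs=200):
    """Minimize (lam / 2) * ||w||^2 + hinge loss; labels must be +1 or -1."""
    w = [0.0] * len(X[0])
    b = 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            margin = yi * (sum(wj * xj for wj, xj in zip(w, xi)) + b)
            if margin < 1:
                # Point is inside the margin: move the hyperplane toward classifying it.
                w = [wj + lr * (yi * xj - lam * wj) for wj, xj in zip(w, xi)]
                b += lr * yi
            else:
                # Point is safely classified: only apply the regularization shrinkage.
                w = [wj - lr * lam * wj for wj in w]
    return w, b


def predict(w, b, x):
    """Return +1 or -1 depending on the side of the hyperplane x falls on."""
    return 1 if sum(wj * xj for wj, xj in zip(w, x)) + b >= 0 else -1


# Made-up, linearly separable training data with two numeric features.
X = [[2, 0], [3, 1], [0, 2], [1, 3]]
y = [1, 1, -1, -1]
w, b = train_linear_svm(X, y)
print([predict(w, b, x) for x in X])
```

The regularization term keeps the weights small, which corresponds to preferring a wide margin between the two classes.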
NAIVE BAYES CLASSIFIER
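 It is a classification technique based on Bayes' theorem with an assumption of independence among predictors.
 Bayes' theorem provides a way of calculating the posterior probability:
P(c|x) = P(x|c) P(c) / P(x)
where P(c|x) is the posterior probability of class c given predictor x, P(x|c) is the likelihood, P(c) is the prior probability of the class, and P(x) is the prior probability of the predictor.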
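The posterior probability formula can be applied to review text as follows. This is a minimal sketch of a multinomial Naive Bayes classifier with add-one smoothing. The training reviews, the labels "fake" and "genuine", and the function names are made up for illustration. P(x) is left out because it is the same for every class and does not change which class scores highest.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    return re.findall(r"[a-z']+", text.lower())


def train_naive_bayes(labeled_reviews):
    """Collect class counts and per-class word counts from (text, label) pairs."""
    class_counts = Counter()
    word_counts = defaultdict(Counter)
    vocabulary = set()
    for text, label in labeled_reviews:
        class_counts[label] += 1
        for word in tokenize(text):
            word_counts[label][word] += 1
            vocabulary.add(word)
    return class_counts, word_counts, vocabulary


def classify(text, model):
    """Pick the class c that maximizes log P(c) + sum of log P(word|c)."""
    class_counts, word_counts, vocabulary = model
    total_docs = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label, n_docs in class_counts.items():
        score = math.log(n_docs / total_docs)  # log prior, log P(c)
        total_words = sum(word_counts[label].values())
        for word in tokenize(text):
            if word not in vocabulary:
                continue  # ignore words never seen during training
            # Add-one (Laplace) smoothing avoids zero probabilities.
            likelihood = (word_counts[label][word] + 1) / (total_words + len(vocabulary))
            score += math.log(likelihood)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Made-up labeled reviews (illustrative data only).
TRAINING = [
    ("best product ever buy now", "fake"),
    ("amazing amazing best buy now", "fake"),
    ("battery lasts two days", "genuine"),
    ("screen scratched after a week", "genuine"),
]
model = train_naive_bayes(TRAINING)
print(classify("best buy now", model))
print(classify("battery lasts a week", model))
```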
RANDOM FOREST CLASSIFIER
 Random forest is a flexible, easy-to-use machine learning algorithm.
 Random forest builds multiple decision trees and merges them together to get a more accurate and stable prediction.
 One big advantage is that it can be used for both regression and classification tasks.
TENSORFLOW
 TensorFlow is a powerful tool for deep learning and is designed to work well with big data.
 The tool performs computation using a dataflow graph, which can be used to create machine learning models such as neural networks.
SYSTEM ARCHITECTURE
Users: User 1, User 2, User 3
Server
Admin
Items shown in the diagram:
-Product details
-User login details
-Opinion keyword
-Track IP address
-Review added by user
-Login details
WORKING PROCESS
 The admin adds products to the system.
 The admin deletes reviews that are fake.
 Once a user accesses the system, the user can view products and post reviews about them.
 The system tracks the IP address of the user.
 If the system detects a fake review, it informs the admin to remove the review from the system.
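The IP-tracking step in the working process could be sketched as below. The rule used here (flag every review when one IP address posts more than `max_per_ip` reviews for the same product) is an assumption, since the slides do not state the exact detection rule. The record fields and the sample data are also hypothetical.

```python
from collections import defaultdict


def flag_suspicious_reviews(reviews, max_per_ip=1):
    """Return IDs of reviews where one IP posted more than max_per_ip reviews for a product."""
    ids_by_key = defaultdict(list)
    for review in reviews:
        ids_by_key[(review["ip"], review["product"])].append(review["id"])
    flagged = []
    for ids in ids_by_key.values():
        if len(ids) > max_per_ip:
            flagged.extend(ids)
    return sorted(flagged)


# Made-up review records; the IPs come from a range reserved for documentation.
reviews = [
    {"id": 1, "ip": "203.0.113.5", "product": "phone", "text": "Best phone ever"},
    {"id": 2, "ip": "203.0.113.5", "product": "phone", "text": "Amazing, buy now"},
    {"id": 3, "ip": "203.0.113.9", "product": "phone", "text": "Battery is average"},
    {"id": 4, "ip": "203.0.113.5", "product": "laptop", "text": "Works fine"},
]
print(flag_suspicious_reviews(reviews))
```

In this sketch the flagged IDs would be passed to the admin, who decides whether to remove the reviews. A reviewer who switches IP addresses would not be caught by this rule.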
MODULE DESCRIPTION
 Today, the world has been taken over by the internet. The technological progress that takes place every day is tremendous.
 In this fast-moving technological world, the internet is accessed by almost everyone who owns a smartphone, a desktop computer, or a laptop.
MODULE 1
 Analyze each review and assign it a rank.
MODULE 2
 Based on the analysis, determine the positive and negative reviews in the particular dataset.
 Positive reviews
 Negative reviews
Spam review detection is then used to find the negative reviews.
GRAPHICAL REPRESENTATION
 Using the random forest classifier, analyze the reviews and generate the graph.
ADVANTAGES
 Users get genuine reviews about the product.
 Users can post their own reviews about the product.
 Users can spend money on valuable products.
DISADVANTAGES
 If the social media optimization team uses different IP addresses to send reviews, the system will fail to track the fake reviews.
CONCLUSION
• Business organizations, specialists, and academics are putting forward their efforts and ideas to find the best system for opinion spam analysis.
• Still, no algorithm can resolve all the difficulties faced by today's generation.
FUTURE WORKS
 The requirement that a product name appear in each product review can be removed, though it might be a tough task.
 The admin currently has to block the IP address of a spammer account manually by identifying its pattern. Automatic blocking can also be achieved in the future scope of the system.
REFERENCE
1. Anusha Sinha, Shipra Singh. "Fake product review monitoring using opinion mining", 2018.
2. "Fake product review monitoring and removal for genuine product reviews using opinion mining", Take Off Edu Group, a division of Young Mind Technology Solutions P. Ltd.
3. Cambria, E.; Schuller, B. (2013). "New avenues in opinion mining and sentiment analysis."
ANY QUERIES ???
