
Future Generation Computer Systems — journal homepage: www.elsevier.com/locate/fgcs

Comparing mobile apps by identifying ‘Hot’ features


Haroon Malik (a,*), Elhadi M. Shakshuki (b), Wook-Sung Yoo (a)
(a) Weisberg Division of Computer Science, Marshall University, WV, USA
(b) Jodrey School of Computer Science, Acadia University, NS, Canada

highlights

• A new methodology to automatically compare mobile apps based on the opinions in their reviews.
• A semi-supervised algorithm that exploits the syntactic relations between features and opinion words.
• An algorithm to automatically extract 'Hot' features from mobile app reviews.
• A similarity algorithm is utilized to provide recommendations using a tree structure.
• The methodology is empirically evaluated on twelve hundred user reviews.

article info

Article history: Received 15 February 2017; Received in revised form 25 December 2017; Accepted 3 February 2018; Available online xxxx.
Keywords: Opinion mining; Google Play Store; Sentiment analysis

abstract

User reviews are a crucial component of open mobile app markets such as the Google Play Store. These markets allow users to submit feedback for downloaded apps in the form of (a) star ratings and (b) opinions expressed as text reviews. Users read these reviews in order to gain insight into an app before they buy or download it. User opinion about the product also influences the purchasing decisions of potential users, which, in turn, plays a key role in the generation of revenue for the developers. A mobile app can accumulate a large volume of reviews, which makes it nearly impossible for a user to skim through thousands of reviews to find the opinions of other users about the features that interest them the most. Towards this end, we propose a methodology to automatically extract the features of an app from its corresponding reviews using machine learning techniques. Moreover, our proposed methodology aids users in comparing features across multiple apps, using the sentiments expressed in the associated reviews. The proposed methodology can be used to understand a user's preference for a certain mobile app and could uncover the reasons why users prefer one app over another.

© 2018 Elsevier B.V. All rights reserved.

1. Introduction

With the proliferation of smartphones, more and more mobile applications ("apps") are introduced to the market through popular distribution channels such as the App Store and the Google Play Store, where users can search, buy, and deploy software apps for mobile devices with a few clicks. These platforms also allow users to share their opinions about an app in text reviews, where they can express their satisfaction with a specific app feature or request a new feature. Recent empirical studies showed that app store reviews include feedback/opinions such as user requirements, bug reports, feature requests, and documentation of user experiences with specific app features [1–3]. App reviews are not only useful to buyers for their purchasing decisions but are also extremely helpful for app developers. The information in the reviews represents the "voice of the user(s)"; it helps drive the development efforts and improve forthcoming releases, playing a key role in revenue generation for the developers [1,2]. However, there are several limitations preventing both consumers and development teams from using the information in the reviews.

We highlight the limitations using an example. The developer of an email client app, after launching it via the Google Play Store, may be curious about: (a) how well the app is received by customers and (b) how the app is penetrating the market. In order to answer such questions, the developer needs to know the opinions of people towards its features.

Similarly, a customer shopping for an email client app would like to know: (a) how green is the app (i.e., is it energy notorious, draining the mobile phone battery quickly)? (b) does it support polling mail from multiple email accounts? or (c) does it provide a quick search of messages in the mailbox? To get the answers, a user needs to know the opinions of people towards the product feature(s), which helps the user locate and buy the best app among several apps providing similar functionality.

* Corresponding author. E-mail address: malikh@marshall.edu (H. Malik).


A straightforward proxy for finding the features claimed to be supported by a mobile app is to read its detailed description as submitted by its developer. However, similar apps can have one or a set of desired features in common. For example, Microsoft Outlook, Gmail, Yahoo, and Blue Mail all support features such as 'signatures', 'multiple-account support', and 'message search'. Reading the app descriptions does not guarantee how well a feature is implemented, whether it is free from bugs, or whether it provides functionality/behavior matching what is listed in the app's description [3,4].

Like other mobile app markets, Google Play displays histograms of ratings and lists the comments/reviews by users, in addition to the app's description submitted by its developer. Despite app reviews being short in length (since most of them are typed and submitted from smartphones), they can range from hundreds to thousands for each app, depending upon its popularity. Therefore, manually analyzing such a large volume of comments, especially the quality of ratings for specific features, and then comparing them against the reviews of other popular apps can be a hectic, time-consuming, painful process, and is sometimes impossible for both users and developers.

1.1. Challenges associated with user reviews

In summary, the factors restricting the use of the information in user reviews include, but are not limited to:

1. Manual overhead: First, app stores include a large number of reviews, which require a large effort to be analyzed. A recent study found that iOS users submit on average 22 reviews per day per app [5]. Very popular apps such as Facebook get more than 4000 reviews per day.
2. Quality bias: The quality of the reviews varies widely, from helpful advice and innovative ideas to insulting comments.
3. Rating bias: A review typically contains a sentiment mix concerning the different app features, making it difficult to filter positive and negative feedback or to retrieve the feedback for specific features. The usefulness of the star ratings in the reviews is limited for development teams, since a rating represents an average for the whole app and can combine both positive and negative evaluations of single features.

Therefore, it is difficult for both consumers and developers to compare two or more mobile apps that offer similar functions but with different properties.

1.2. Research contribution

To resolve the challenges highlighted in Section 1.1, we propose a systematic methodology to mine opinions from crowdsourced reviews, such as app store reviews. The methodology facilitates both app developers and customers in automatically extracting and comparing the features among mobile apps from a given set of mobile app reviews. Further, we provide an approach to gauge people's sentiments towards the features of mobile apps. We mark our major contributions in this paper by:

(a) Providing a feature extraction technique for microblogs, such as users' comments.
(b) Synthesizing a technique to discover the syntactic relations between features and opinion words.
(c) Crafting an algorithm to automatically discover and extract Hot Features in a lattice structure with a partial order, so as to make the features comparable to similar features in other apps. Hot Features are the most talked-about features, i.e., those that are either well received by customers or that customers complained about a lot. Identifying Hot Features and the related opinions from users is extremely crucial for developers in fixing bugs, implementing new features, improving user experience agility, as well as polishing the features of most interest to the users.

In particular, the paper seeks the answers to the following research questions:

RQ. 1 Can we extract features from mobile app reviews?
RQ. 2 What are people's opinions about mobile apps, based on the extracted features?
RQ. 3 How do we make recommendations based on the sentiment analysis of the extracted features?

1.3. Paper organization

The remainder of this paper is structured as follows. In Section 2, we describe our proposed methodology in detail to answer the three research questions (RQ1, RQ2, and RQ3); we also present the details of our feature extraction mechanism and our novel algorithm for identifying 'Hot Features'. In Section 3, a case study on six top email apps is provided and the performance evaluation of our proposed methodology is presented. In Section 4, we detail the related work. Section 5 concludes the paper and lists the future improvements.

2. Methodology

We propose a methodology that facilitates both app developers and customers in automatically extracting and comparing the 'hot features' among mobile apps, from a given set of mobile app reviews, and in gauging people's sentiments towards them. The high-level steps of the methodology are shown in Fig. 1.

The input to our methodology is a set of reviews for one or more mobile apps. The output of the methodology is (a) comparative tree structures of mobile app features (as shown in Fig. 3) and (b) a count of people's sentiments towards the extracted features of the corresponding mobile apps.

In the following subsections, we describe the details of the major steps of our methodology.

2.1. Pre-processing of the app reviews

The app reviews from the mobile app stores do not suffice directly for applying our feature extraction algorithm listed in Table 3. This is due to the fact that a large portion of the reviews are submitted from mobile devices, on which typing is not so easy. Therefore, we performed the preprocessing steps shown in Fig. 2.

2.1.1. Noun, verb, and adjective identification

We use the part-of-speech (POS) tagging functionality of the Natural Language Toolkit (NLTK) to identify and extract the nouns, verbs, and adjectives in the reviews, which are known to be the parts of speech that describe features best, compared to other parts of speech such as adverbs, numbers, or quantifiers. A manual inspection of 100 reviews confirmed this assumption.
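To make this step concrete, the following is a minimal sketch using NLTK's word tokenizer and averaged-perceptron POS tagger; the review string is a made-up example.

```python
import nltk

# One-time model downloads (tokenizer and POS tagger).
nltk.download("punkt")
nltk.download("averaged_perceptron_tagger")

review = "The app crashes constantly but the search feature is great"

tokens = nltk.word_tokenize(review)
tagged = nltk.pos_tag(tokens)   # e.g. [('The', 'DT'), ('app', 'NN'), ...]

# Keep nouns (NN*), verbs (VB*) and adjectives (JJ*): the parts of
# speech that describe app features best.
kept = [(word, tag) for (word, tag) in tagged
        if tag.startswith(("NN", "VB", "JJ"))]
print(kept)
```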


Fig. 1. Overview of our proposed methodology.

Table 1
Formal context (features) across different mail apps. Attributes: Group mail, Multiple accounts, Signature, Quick filters, Cloud space, Message search, Themes.

Yahoo mail: X X X X X
K-9 mail: X X X X X
Gmail: X X X X X X X
Blue mail: X X X X
Microsoft outlook: X X X X X X
Cloud magic: X X X X

2.1.2. Stopword removal

We remove stopwords to eliminate terms that are very common in the English language (e.g., "and", "this", and "is"). We use the standard list of stopwords provided by Lucene and expand it to include words that are common in user reviews but are not used to describe features. The words we added to the stopword list are the name of the application itself, as well as the terms "app", "please", and "fix".
2.1.3. Lemmatization

We use the WordNet lemmatizer from NLTK to group the different inflected forms of words with the same part-of-speech tag, which are syntactically different but semantically equal. This step reduces the number of feature descriptors that need to be inspected later. With this process, for example, the terms describing the verbs "sees" and "saw" are grouped into the term "see". A sketch of these two preprocessing steps together is given below.
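In the sketch, NLTK's English stopword list stands in for the Lucene list used in the paper, extended with the review-specific terms; 'bluemail' is an illustrative stand-in for the app's own name.

```python
import nltk
from nltk.corpus import stopwords
from nltk.stem import WordNetLemmatizer

nltk.download("stopwords")
nltk.download("wordnet")

# NLTK's English list stands in for the Lucene stopword list, extended
# with review-specific terms and the app's own name.
STOPWORDS = set(stopwords.words("english")) | {"app", "please", "fix", "bluemail"}

lemmatizer = WordNetLemmatizer()

def preprocess(tokens):
    kept = [t for t in tokens if t.lower() not in STOPWORDS]
    # Group inflected forms: 'sees' -> 'see' (and 'saw' -> 'see',
    # given the verb POS hint).
    return [lemmatizer.lemmatize(t.lower(), pos="v") for t in kept]

print(preprocess(["This", "app", "sees", "heavy", "use"]))  # ['see', 'heavy', 'use']
```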
is one of the main focus of the paper. Although several researchers
2.1.4. Explicit sentences

Since people express opinions casually within app reviews, there may be either explicit, complete sentences, whose meaning we can easily understand, or implicit sentences, which are incomplete sentences or just phrases. For example, it is difficult to identify the feature in the following implicit sentence:

"This game continues for a long time".

In this case, it is difficult to tell whether the sentence is referring to the play time or to the battery life. Such sentences can express the same meaning in several different ways, which makes it even more difficult to find the patterns of features. Fortunately, we observed that implicit sentences do not appear much in our data set (less than 10% of the sentences). We therefore focus on explicit statements in this paper and leave the processing of implicit sentences to future work.
2.2. Feature extraction

The task of feature extraction in this paper is to transform mobile app review data into a feature space that can best describe the interests of the app users who comment on the app or its associated services. More specifically, the task is to extract only the relevant product/app features [6,7] that appear in the app reviews. Our feature extraction process uses a semi-supervised algorithm, listed in Table 3, to automatically extract features and opinion words. Further, the feature extraction process leverages an algorithm, listed in Table 4, to automatically identify 'hot features'. These are the most talked-about/discussed features in the mobile app reviews among the customers, which can be compared with the features of similar products/apps.

In the literature, features are also referred to as opinion targets. The process of feature extraction is sometimes also referred to as opinion mining. Two central problems in opinion mining are opinion lexicon expansion and opinion target extraction. An opinion lexicon is a list of 'opinion' words, such as good, bad, poor, rich, and excellent, which are used to indicate positive and negative sentiment. Opinion targets, i.e., features, are the topics on which opinions are expressed. They are important because, without knowing the targets, the opinions expressed in a sentence or document are of limited use. For example, in the opinion sentence:

'This is by far the best email app out there',

email app is the target of the opinion. Feature extraction from microblogs for the purpose of product (i.e., app) comparison is one of the main focuses of this paper. Although several researchers have studied the opinion lexicon expansion and opinion target extraction (also known as topic, feature, or aspect extraction) problems, their algorithms as-is do not work 'out of the box' for our purpose, i.e., with app store reviews.

In this paper, we first provide a mechanism to extract a set of features from the mobile app reviews. We use these extracted features as 'seeds' to find other features. Further, we use the users' opinions expressed in the corresponding reviews. Although several opinion lexicons are publicly available, it is hard, if not impossible, to maintain a universal opinion lexicon that covers all domains, as opinion expressions vary significantly from domain to domain. A word can be positive in one domain but have no opinion, or even a negative opinion, in another domain. Therefore, it is necessary to expand a known opinion lexicon for applications in different domains using text corpora from the corresponding domains.

In a data-mining task, all the features are generally regarded as nouns. Hu and Liu [6] also treated frequent nouns and noun phrases as product feature candidates, i.e., opinion targets. Similarly, Gupta et al. [8] also attested that all features are nouns/noun phrases. Therefore, a straightforward way for us to extract the features from the mobile app reviews is to scrape all the nouns from each sentence in each review, over the entire data set (as a bag-of-words). This can simply be done using any linguistic parser, such as 'LPProcessor', which parses each sentence and yields the part-of-speech tag for each word (whether the word is a noun, verb, adjective, etc.). As a pilot study, we annotated 600 reviews, consisting of 2452 lines, for the top six email clients, and compared the list of features provided at the app store with that of all the nouns collected as a bag-of-words.

Table 2
Dependency rules to extract meaningful features and opinions.

ID | Observation | Extracted | Example
R11 | O →amod→ T, s.t. O ∈ {O} | Feature = T | The phone has a good screen
R12 | O →amod→ 'Prod' ←nsubj← T, s.t. O ∈ {O} | Feature = T | iPod is the best mp3 player
R31 | T1 →conj→ T2 or T2 →conj→ T1, s.t. T1 ∈ {F} | Feature = T2 | Audio and video quality of the player
R32 | T1 →nsubj→ 'has' ←dobj← T2, s.t. T1 ∈ {F} | Feature = T2 | Nikon "DX" has a great lens
R21 | O →amod→ T, s.t. T ∈ {F} | Opinion = O | The phone has a good zoom
R22 | O →amod→ 'Prod' ←nsubj← T, s.t. T ∈ {F} | Opinion = O | iPod is the best mp3 player
R41 | O1 →conj→ O2 or O2 →conj→ O1, s.t. O1 ∈ {O} | Opinion = O2 | Samsung Galaxy is incredible and best
R42 | O1 →amod→ 'Prod' ←amod← O2, s.t. O1 ∈ {O} | Opinion = O2 | The sexy, cool mp3 player

We found that the bag-of-words does contain 98% of the features listed for the apps at the app store. This means that even this naïve approach of simply POS tagging the entire review set and separating the nouns is effective in terms of coverage (recall). On the other side, the naïve approach yields a huge number of spurious candidates, which leads to a waste of time for developers and customers. This is due to the fact that the naïve bag-of-words approach treats every noun as a candidate feature; in reality, some nouns and phrases are irrelevant to the corresponding mobile app.

The reason is that customers providing a review for an app many times use common adjectives to describe various subjects, including interesting features (that are of interest to developers and other customers/users) as well as irrelevant ones. Consider the following example:

"The publisher was generous and I was able to download and try all the apps free for thirty days"

Here, publisher, though being a noun, is not relevant, i.e., it is not a feature of interest for a product, but it surfaces as an (infrequent) candidate feature because of the nearby opinion word 'generous'.

2.2.1. Syntactic relations between the features and opinion words

To overcome this problem, we exploit the fact that there exist naturally occurring syntactic relations between the feature and opinion words in app store reviews. For example, in the review 'Blue-mail has awesome scheduler', there exists a relation in which the adjective awesome modifies the noun scheduler in a positive manner. Using such syntactic relations, we can extract the nouns as well as the corresponding opinion words that are of most interest to us. Such relations can be identified using a dependency parser based on a dependency grammar and a set of rules. We modified dependency rules inspired by [9] to remove, from the bag-of-words we populated using the linguistics parser, the nouns and adjectives that are not features and opinion words, respectively. We list the dependency rules in Table 2.

The algorithm to prune our bag-of-words for possible features and opinion words is listed in Table 3. The basic idea behind the algorithm is to extract opinion words and targets (e.g., features) by exploiting syntactic relations in a recursive manner, thereby pruning the nouns in the bag-of-words. Since it is a semi-supervised algorithm, initial seeds (an opinion lexicon) are extracted from the set of app reviews (manually) and provided as an input to the algorithm.

If any of the seeded opinion lexicon entries is found in any of the app reviews, the syntactic relationships between opinion words and features are exploited, based on the dependency rules listed in Table 2, to discover new features; the cycle is repeated back and forth between opinions and features to enable the discovery of new features, as well as the elimination from our bag-of-words set of features that are of not much interest.

We made a pilot run by providing the algorithm with the 20 most frequently occurring nouns (features) {F} as the starting seeds and the 20 most frequently occurring adjectives {O} as opinion words. By the end of the pilot run, the set grew, containing only features and opinion words.

Table 3
The pruning algorithm for bag-of-words (nouns).
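The pruning algorithm itself appears as a figure in Table 3 and is not reproduced in the text. The following is a minimal sketch of the propagation loop it describes, assuming reviews are already dependency-parsed into (head, relation, dependent) triples; only the amod/conj rule shapes from Table 2 are modeled, and all names are illustrative.

```python
def prune_bag_of_words(parsed_reviews, seed_features, seed_opinions):
    """Semi-supervised double propagation between features and opinions.

    parsed_reviews: list of sentences, each a list of
                    (head, relation, dependent) dependency triples.
    seed_features:  initial set of feature nouns {F}.
    seed_opinions:  initial opinion lexicon {O}.
    """
    features, opinions = set(seed_features), set(seed_opinions)
    changed = True
    while changed:                      # repeat until no new words are found
        changed = False
        for sentence in parsed_reviews:
            for head, rel, dep in sentence:
                # R1x: a known opinion word modifying a noun -> new feature
                if rel == "amod" and dep in opinions and head not in features:
                    features.add(head); changed = True
                # R2x: an adjective modifying a known feature -> new opinion
                if rel == "amod" and head in features and dep not in opinions:
                    opinions.add(dep); changed = True
                # R3x/R4x: conjunction propagates within the same category
                if rel == "conj":
                    if head in features and dep not in features:
                        features.add(dep); changed = True
                    if head in opinions and dep not in opinions:
                        opinions.add(dep); changed = True
    return features, opinions
```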
2.3. Discovering 'Hot' features

So far, by using the feature extraction algorithm, we are able to extract the related features, thereby reducing noise, i.e., nouns that do not correspond to product features. However, app providers as well as customers are not interested in each and every feature of the product. They need to know the most talked-about features among the customers, which can be compared with the features of similar products. We call these the 'Hot Features', i.e., the most talked-about features that are either well received by customers or that customers complained about a lot. The identification of 'Hot Features' and the related opinions from users is extremely crucial for developers in fixing bugs, implementing new features, improving user experience agility, as well as polishing the features of most interest to the users.


Fig. 2. Steps involved in pre-processing of mobile app reviews.

Since the feature space of a product or service is often hierarchically structured, we need an approach that extracts the 'Hot Features' in a lattice structure with a partial order, so as to make features comparable to similar features in other apps. In order to make products comparable to each other, the 'Hot Features' need to be constructed as a tree structure, which can be transformed from a concept lattice in which some features are general and some features are specific. This requirement especially matches the idea of discovering a concept hierarchy through Formal Concept Analysis (FCA) approaches.

We now provide an overview of FCA and the rationale behind employing it to discover 'Hot Features', and we detail our algorithm for discovering the concept hierarchy of a mobile app, required for outputting hot features in a comparable manner.

In FCA, the elements of one type are called "formal objects" and the elements of the other type are called "formal attributes". The adjective "formal" is used to emphasize that these are formal notions: "formal objects" need not be "objects" in any common-sense meaning. But the use of "object" and "attribute" is indicative, because in many applications it may be useful to choose object-like items as formal objects and their features or characteristics as formal attributes. In an information retrieval application, documents could be considered object-like and terms attribute-like. Other examples of sets of formal objects and formal attributes are tokens and types, values and data types, data-driven facts and theories, words and meanings, and so on.

The sets of formal objects and formal attributes, together with their relation to each other, form a "formal context", which can be represented by a cross table (see Table 1). The elements on the left side are formal objects, the elements at the top are formal attributes, and the relation between them is represented by the crosses. In this example, the formal objects are the email apps: Yahoo mail, K-9 mail, Gmail, Blue mail, Microsoft outlook, and Cloud magic. The attributes listed in Table 1 describe the features, i.e., the formal context of the objects: allow group mail, support multiple accounts, facilitate constructing a signature, allow message search in a mailbox, provide cloud storage, and offer multiple themes to enhance the user experience. This is, of course, a toy example, but it is sufficient to explain the basic features of FCA. In our context, i.e., online app reviews, classical FCA [10] builds up a concept hierarchy by comparing the subset relationships amongst the associated terms of a concept.

In FCA, a concept can be associated with a single term or a set of terms. A term is regarded as a meaningful word not appearing in the stopword list. When a term is used in describing a concept, it is considered an attribute of that concept. All the attributes associated with all the concepts can be organized in a two-dimensional matrix: one dimension (columns) lists all the attributes and the other (rows) lists all the concepts. The FCA algorithm then checks the columns of the matrix and forms a lattice from it; it has been proven that there is a one-to-one mapping between each matrix and its corresponding lattice [10]. The critical step in the FCA algorithm is to generate the attribute matrix for every concept by scanning the text only once. Towards this end, we apply our algorithm listed in Table 4 for discovering the concept hierarchy of a mobile app and outputting hot features in a comparable manner. The algorithm listed in Table 4 can deal with a large volume of text: it scans through the text only once and generates a list of hot features, or properties, that can represent the content of the text. It analyzes the processed reviews/comments of an app and finds the hierarchy of the words with a high term frequency-inverse document frequency (TF-IDF).

Suppose there are two arbitrary feature words in the app reviews: w1 and w2. The review set that contains all the appearances of word w1 is named set c1. Similarly, the review set that contains all the appearances of word w2 is named set c2. If set c1 is a superset of set c2, then, more likely, w2 is a sub-concept of w1. A tree structure is used to express the hierarchy of w1 and w2, instead of a lattice, as shown in Fig. 3.
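A compact sketch of this superset test, assuming the reviews have already been preprocessed into per-review sets of feature words and that the hot features have already been selected (e.g., by high TF-IDF); all names are illustrative.

```python
from collections import defaultdict

def feature_hierarchy(reviews, hot_features):
    """Order hot features into parent/child pairs via the superset test.

    reviews:      list of sets, each the feature words found in one review.
    hot_features: feature words already selected (e.g., by high TF-IDF).
    Returns a mapping parent -> children, where a child's review set is
    contained in its parent's review set.
    """
    # c[w] = set of review indices in which feature w appears
    c = {w: {i for i, r in enumerate(reviews) if w in r} for w in hot_features}

    tree = defaultdict(list)
    for w1 in hot_features:
        for w2 in hot_features:
            if w1 != w2 and c[w1] >= c[w2]:   # c1 is a superset of c2
                tree[w1].append(w2)           # w2 is likely a sub-concept of w1
    return tree
```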

Please cite this article in press as: H. Malik, et al., Comparing mobile apps by identifying ‘Hot’ features, Future Generation Computer Systems (2018),
https://doi.org/10.1016/j.future.2018.02.008.
6 H. Malik et al. / Future Generation Computer Systems ( ) –

Table 4
The hot-feature extraction algorithm.

Fig. 3. Sentiment analysis of the hot features of two email apps (green dots: positive; red dots: negative; blue dots: neutral opinion).

2.4. Sentiment analysis

Sentiment analysis, which is also called opinion mining, is an approach that requires collecting people's opinions in real time about a product, an event, or a situation. Sentiment analysis is a type of reality check for events, people, and products. People from different demographic areas have different views on certain issues. This gives a wider angle of thought to the initiator of the idea and also gives an overall review of that subject or product. For example, a new DSLR camera may be launched with high specifications and improved technology, and a website blog may give all the positives of the product with all its new specifications. But when used by people around the world, the same product may get many negative reviews due to the DSLR's heavy nature. Such networking sites thus give a clear picture of the situation, taking into account the small details of the specific subject.

We used sentiment analysis as a part of our methodology to explore people's opinions about the hot mobile app features (the features extracted using our algorithm presented in


Table 5
Performance evaluation of the proposed approach — email clients.

Mobile apps | C4.5 [11] (P/R/F) | Random forest [12] (P/R/F) | LMT [13] (P/R/F) | Naïve Bayes [14] (P/R/F) | Proposed methodology (P/R/F) | Reviews | Features
Blue mail (a) | 0.27/0.77/0.40 | 0.88/0.87/0.88 | 0.81/0.90/0.84 | 0.80/0.85/0.82 | 0.82/0.82/0.83 | 100 | 20
K-9 mail (b) | 0.29/0.84/0.43 | 0.94/0.91/0.70 | 0.66/0.80/0.62 | 0.82/0.80/0.81 | 0.81/0.81/0.81 | 100 | 30
Microsoft outlook (c) | 0.35/0.75/0.48 | 0.99/1.00/0.99 | 0.88/0.80/0.85 | 0.75/0.79/0.77 | 0.77/0.77/0.81 | 100 | 30
Inbox by Gmail (d) | 0.30/0.75/0.43 | 0.98/0.92/0.95 | 0.50/0.80/0.63 | 0.79/0.78/0.78 | 0.78/0.78/0.80 | 100 | 30
Yahoo mail (e) | 0.37/0.82/0.51 | 0.95/0.92/0.95 | 1.00/0.80/0.85 | 0.51/0.80/0.62 | 0.62/0.62/0.66 | 100 | 21
VMware Boxer (f) | 0.40/0.79/0.53 | 0.92/0.95/0.93 | 0.90/0.90/0.90 | 0.80/0.84/0.82 | 0.82/0.82/0.82 | 80 | 49
Average | 0.33/0.79/0.46 | 0.94/0.93/0.90 | 0.79/0.83/0.78 | 0.75/0.81/0.77 | 0.77/0.77/0.79 | 580 | 190

Legend — P: Precision; R: Recall; F: F-measure.
(a) https://bluemail.me/ (b) https://k9mail.github.io/ (c) https://www.microsoft.com/en-us/outlook-com/mobile/ (d) https://www.google.com/inbox/ (e) https://sg.mobile.yahoo.com/mail/ (f) http://www.getboxer.com/

Section 2.2). In general, the opinion of people can be classified into three categories: positive, negative, and neutral. People use certain predictable words while giving comments or writing an app review to express their feelings. Here are two examples:

1. "I hate the app. It keeps on crashing. Don't waste your time on it"
2. "This is awesome. Love it. Works with android the best".

The first review expresses negative feelings towards an app: it uses the sentiment word 'hate' to express a negative feeling. The second review expresses a positive feeling through the adjective "awesome" and the verb "love". All these words are essential words that reflect the user's sentiment. However, there are many slang words that have no meaning, such as 'Ummmm', 'Urrr', 'phew', 'oh man', and 'huhh'. We removed such words, since they act as noise and do not contribute towards sentiment analysis. Then, we narrow down these input words again by using WordNet to eliminate words that are seldom used. We also deleted the non-existing words. By tagging the existing words, a bitmap is established (listing all the existing words, and tagging the existing words that appear in a review with the value 1 and the others with 0). The orientation of each sentence is also tagged based on its sentiment orientation: positive, negative, or neutral. Besides, people's emotions can be divided into more types: WordNet has divided some sentiment words into six types, namely disgust, anger, fear, sadness, surprise, and joy. Each of these six types shows a different level of emotion, which may make the analysis more sophisticated. The taxonomy of product features provides an overview of the hot features as well as the results of the sentiment analysis of those features, as shown in Fig. 3.

We collected around 600 reviews for the six popular email apps listed in Table 5. Using our methodology, we found that all of the email apps are well received by the customers. Fig. 3 shows the comparison between the features of the Microsoft Outlook and Thunderbird email clients.
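A minimal sketch of the word-presence bitmap and lexicon-based orientation described above; the tiny lexicon and slang list are illustrative stand-ins for the WordNet-derived resources used in the methodology.

```python
# Illustrative opinion lexicon and slang list (stand-ins for the
# WordNet-derived resources used in the methodology).
LEXICON = {"hate": -1, "awesome": +1, "love": +1, "crash": -1}
SLANG = {"ummmm", "urrr", "phew", "huhh"}
VOCAB = sorted(LEXICON)  # fixed word order for the bitmap

def review_bitmap_and_orientation(review):
    words = [w for w in review.lower().split() if w not in SLANG]
    # Bitmap: 1 if the vocabulary word appears in the review, else 0.
    bitmap = [1 if v in words else 0 for v in VOCAB]
    score = sum(LEXICON.get(w, 0) for w in words)
    orientation = "positive" if score > 0 else "negative" if score < 0 else "neutral"
    return bitmap, orientation

print(review_bitmap_and_orientation("I hate the app"))  # ([0, 0, 1, 0], 'negative')
```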
the digital distribution outlet for Apple where users can download
2.5. Recommendations

The second step of our methodology is based on a qualitative and quantitative analysis of customers' preferred features. The main motivation behind this recommendation is that mobile apps in a similar category have strong similarities. For example, two email apps may both support the multiple-accounts, signature, and message-search features; if a customer likes one of these mobile apps, she/he would probably like the other one as well. This step of our methodology employs the Weighted-Tree Similarity Algorithm [15] on the extracted hot features and the corresponding customers' sentiments.

First, using the 'Hot Features', we construct a feature tree. For each 'Hot Feature', the corresponding reviews are analyzed to find out the opinion of the customers, i.e., the sentiment analysis technique is applied. Finally, we explored the similarity between the feature trees using a tree-similarity comparison algorithm. Each app received a similarity score (between 0 and 1), and for each app we recommended apps with a similarity score of 0.7 and above. Since the pool of mobile email apps in the case study listed in Section 3 is of a managed size, i.e., the six top email apps, we validated the results manually.

Table 6
Performance evaluation of the proposed approach — browsers.

Mobile apps | C4.5 [11] (P/R/F) | Random forest [12] (P/R/F) | LMT [13] (P/R/F) | Naïve Bayes [14] (P/R/F) | Proposed methodology (P/R/F) | Reviews | Features
Firefox (a) | 0.37/0.82/0.51 | 0.95/0.92/0.95 | 1.00/0.80/0.85 | 0.51/0.80/0.62 | 0.62/0.62/0.78 | 100 | 26
UC Browser (b) | 0.35/0.75/0.48 | 0.99/1.00/0.99 | 0.88/0.80/0.85 | 0.75/0.79/0.77 | 0.77/0.77/0.81 | 100 | 32
Via Browser (c) | 0.40/0.80/0.53 | 0.92/0.95/0.93 | 0.90/0.91/0.92 | 0.80/0.84/0.82 | 0.82/0.82/0.80 | 100 | 55
Ghostery Privacy (d) | 0.30/0.75/0.43 | 0.98/0.90/0.95 | 0.50/0.80/0.63 | 0.79/0.78/0.78 | 0.78/0.78/0.83 | 100 | 19
Dolphin Browser (e) | 0.27/0.77/0.40 | 0.88/0.87/0.88 | 0.81/0.90/0.84 | 0.80/0.85/0.82 | 0.82/0.82/0.83 | 100 | 20
Mercury Browser (f) | 0.31/0.80/0.41 | 0.92/0.90/0.70 | 0.66/0.80/0.62 | 0.82/0.80/0.81 | 0.81/0.81/0.82 | 100 | 27
Average | 0.34/0.80/0.45 | 0.92/0.92/0.91 | 0.80/0.83/0.75 | 0.76/0.82/0.75 | 0.79/0.77/0.80 | 600 | 179

Legend — P: Precision; R: Recall; F: F-measure.
(a) https://support.mozilla.org/mobile (b) http://www.ucweb.com/ (c) http://via.1year.cc/ (d) https://www.ghostery.com/ (e) https://dolphin.com/ (f) https://mercury-browser.com/

3. Case study

The purpose of the case study is twofold: (a) to evaluate the performance of our proposed methodology and (b) to validate the construction of our tree-similarity algorithm.

3.1. Target system

To evaluate the effectiveness of our methodology, we need mobile app reviews. Mobile apps are available through app stores such as the Apple App Store, the BlackBerry World Store, the Google Play Store, the Microsoft Phone Apps Store, and many more specialized or regional app stores. Such stores provide a convenient and efficient medium for users to download apps and to provide feedback on their user experience through mobile app user reviews. We chose two of the most popular app stores, the Google Play Store and the Apple App Store. Our criterion for selection is based on popularity (the Google Play Store and the Apple App Store are the top two most popular app stores).

The Google Play Store is a digital distribution outlet run by Google. Apart from apps, the Play Store sells other digital media, e.g., e-books, movies, and music. The Google Play Store had over 1,000,000 apps available as of July 2013 [4]. The Apple App Store is the digital distribution outlet for Apple, where users can download third-party apps. The apps range from games and productivity apps to social networking and business apps. There were approximately 775,000 apps in the App Store as of July 2013 [4].

There are paid and free apps available in both stores. Apps can be downloaded and updated from the stores. Once an app is downloaded, a user can review it. The number of reviews associated with an app varies depending on the popularity of the app; some popular apps have over a million user reviews. Reviews in both stores contain a title, a date, a numerical rating between 1 and 5 (where 1 represents a poor app), and a comment section where the user is free to write whatever they wish. We believe our choice regarding the selection of app stores for mining app reviews is representative, and that our results also apply to other app stores, since they follow a similar rating system.

3.2. Review data

We developed a dedicated web crawler using the 'scrapy' framework [16]. Since the crawler is not the main subject of the paper, we do not provide details on its implementation. Manual validation of a large volume of app comments is required; therefore, we limit the case study to (a) the top six mobile email apps listed in Table 5 and (b) six of the top browser apps listed in Table 6. Our web crawler visits each unique page within the specific iOS or Google Play store and parses the user reviews to extract data such as the app name, the review title, the review comment, and the numerical star rating assigned by the user. We collected all the reviews for each app listed in Tables 5 and 6 for a week. To avoid bias, we randomly selected one hundred comments for each app. For the evaluation of our methodology, we need reviews that capture both the positive as well as the negative opinions about an app's features. Therefore, we dropped the comments with a 5-star rating as well as the comments with 1- or 2-star ratings.

This is based on the fact that, according to previous literature [17,18], one-star and two-star reviews are only indicative of negative issues/opinions. We confirmed our grouping of bad and good reviews by running a sentiment analysis tool [19] over all of the bad (one-star and two-star), neutral (three-star), and good (four-star and five-star) reviews of the studied apps. This tool estimates the positive and negative sentiment expressed in the text. As indicated by the previous studies, one-star and two-star reviews were given a negative score, while all four-star and five-star reviews were given a positive score.

3.3. Performance evaluation of the proposed methodology

We selected only one hundred reviews (all with a rating of three or four stars) for (a) the six most popular email apps, except for the VMware Boxer email app, for which, after the elimination of the one-, two-, and five-star reviews, we were left with only 80 reviews, and (b) six of the most popular browser apps. Two members manually read all the reviews and extracted the features and the opinion words from each review. A disagreement on a feature or on opinion words was resolved by a vote from a third member. This 'ground truth' was stored in a repository. We evaluated the methodology using the ground truth, based on the classical definitions of Precision, Recall, and F-measure listed in Eqs. (1)–(3).

Precision = |classified features ∩ actual features| / |classified features|    (1)

Recall = |classified features ∩ actual features| / |actual features|    (2)

F-Measure_α = ((α + 1) · Precision · Recall) / (α · Precision + Recall)    (3)

Our methodology for extracting features from mobile app reviews is semi-supervised. We also used four learners, LMT, Random Forest, Naïve Bayes, and C4.5, available in the Weka data mining software, to compare the performance of our methodology. The ground truth is used as the label class for all the tree-based learners, i.e., LMT, Random Forest, and C4.5. The results reported in the tables are averages over 10-fold cross-validation.

As expected, C4.5 performed the worst; this was expected from the simple tree model. As a matter of fact, we incorporated the C4.5 learner as our baseline for comparison purposes. The proposed methodology and Naïve Bayes performed equally well, with overall precision between 0.75 and 0.78 and average recall between 0.71 and 0.88. Nevertheless, our approach does not require continuous training. Moreover, it simply requires a few features (as seeds) to identify and extract other relevant features from the review set.
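For concreteness, a small sketch of Eqs. (1)–(3) over sets of extracted and ground-truth features (α = 1 gives the balanced F-measure); the example feature names are made up.

```python
def precision_recall_f(classified, actual, alpha=1.0):
    """Eqs. (1)-(3): set-based precision, recall and F-measure."""
    tp = len(classified & actual)            # |classified ∩ actual|
    precision = tp / len(classified) if classified else 0.0
    recall = tp / len(actual) if actual else 0.0
    if precision == 0 and recall == 0:
        return precision, recall, 0.0
    f = ((alpha + 1) * precision * recall) / (alpha * precision + recall)
    return precision, recall, f

p, r, f = precision_recall_f({"search", "themes", "sync"}, {"search", "sync"})
print(p, r, f)   # 0.666..., 1.0, 0.8
```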


4. Related work

Our proposed methodology presented in Section 2 is most closely related to Hu and Liu [6] and Popescu and Etzioni [20]. In [6], Part-of-Speech (POS) tagging is used to collect nouns or noun phrases, since features are mostly nouns: POS tags are produced for each word (whether the word is a noun or a verb), and association rule mining is then applied to filter out the frequent feature itemsets. The results of their research show good performance in analyzing electronic products like DVD players, MP3 players, digital cameras, and cellular phones. Obviously, our research is related to but different from their work in many ways: POS tagging and association rules mainly focus on noun features, which may skip some words in their inputs that can imply features. For instance, there are some email mobile apps for which people prefer 'multiple account support' rather than a single account. In such conditions, people may talk about their preference for "multiple account" when they refer to an app's feature. But "multiple account" is an adjective in those sentences, which means it would be filtered out when they try to sum up all the features. Our system, based on feature extraction, does not have this problem: we did not remove words by part of speech. Instead, we comprehensively analyze the input words using both frequency and the relationships between different words. Moreover, they use comments on mobile apps from e-commerce web sites as input, while we use data from the Google Play Store, which has a large number of short texts with sparse words, which makes association rules inapplicable. Nevertheless, they demonstrated their algorithm with a small data set (500 records), while we tested our algorithm at a similar capacity, i.e., with 600 mobile app reviews of the six most popular email apps. Our work is also different from the feature extraction method in [20], in that they perform mining of consumer reviews and sentiment classification without comparing pairs of user-specified products based on the corresponding product features.

4.1. Opinion word extraction

Extensive work has been done on sentiment analysis at the word, expression [21,22], sentence [23,24], and document [25,26] levels. We only describe work at the word level, as it is the most relevant to our work. In general, the existing work can be categorized as corpora-based [17,27–31] and dictionary-based [24,32–35]. Our work falls into the corpora-based category. Hatzivassiloglou et al. [27] proposed the first method for determining adjective polarities or orientations (positive, negative, and neutral). The method predicts the orientations of adjectives by detecting pairs of such words conjoined by conjunctions such as 'and' and 'or' in a large document set. The underlying intuition is that the orientations of conjoined adjectives are subject to some linguistic constraints. For example, in the sentence "This car is beautiful and spacious", if we know that beautiful is positive, we can infer that spacious is positive too. The weakness of this method is that, as it relies on conjunction relations, it is unable to extract adjectives that are not conjoined. Wiebe et al. [28,30] proposed an approach to finding subjective adjectives using the results of word clustering according to their distributional similarity. However, they did not tackle the prediction of the sentiment polarities of the found subjective adjectives. Turney et al. [31] compute the pointwise mutual information (PMI) of the target term with each seed positive and negative term as a measure of their semantic association. Their work requires additional access to the Web (or any other corpus similar to the Web, to ensure sufficient coverage), which is time consuming. Another recent corpora-based approach was proposed by Kanayama et al. [29]. Their work first uses clause-level context coherency to find candidates, and then uses a statistical estimation method to determine whether the candidates are appropriate opinion words. Their method for finding candidates would have low recall, however, if the occurrences of seed words in the data are infrequent or if an unknown opinion word has no known opinion words in its context. Besides, the statistical estimation can be unreliable if the corpus is small, which is a common problem for statistical approaches.

Dictionary-based approaches [34] take advantage of WordNet to construct a synonymy network by connecting pairs of synonymous words. The semantic orientation of a word is decided by its shortest paths to the two seed words good and bad, which are chosen as representatives of the positive and negative orientations. Esuli and Sebastiani [35] use text classification techniques to classify orientations. Their method is based on the glosses (textual definitions) in an online "glossary" or dictionary. The work of Takamura, Inui, and Okumura [33] also exploits the gloss information from dictionaries. Their method constructs a lexical network by linking two words if one appears in the gloss of the other; the weights of the links reflect whether the two connected words are of the same orientation. The works of Hu and Liu [36] and Kim and Hovy [24] are simpler, as they simply used synonyms and antonyms. However, all dictionary-based methods are unable to find domain-dependent sentiment words, because most entries in dictionaries are domain-independent. For example, unpredictable is often a positive opinion word in movie reviews, as in unpredictable plot, but in car reviews unpredictable is likely to be negative, as in unpredictable steering. Our approach extracts opinion words using domain-dependent corpora; thus we are able to find domain-dependent opinion words.

4.2. Opinion target extraction

Opinion target (or topic) extraction is a difficult task in opinion mining. Several methods have been proposed, mainly in the context of product review mining [20,32]. In this mining task, opinion targets usually refer to product features, which are defined as product components or attributes, as in Liu [37].

In the work of Hu and Liu [32], frequent nouns and noun phrases are treated as product feature candidates. In our work, we also extract only noun targets. Different pruning methods are proposed to remove the noise. To cover infrequent features that are missed, they regard the nearest nouns/noun phrases of the opinion words identified by frequent features as infrequent features. In Popescu and Etzioni [20], the authors investigated the same problem. Their extraction method, however, requires that the product class be known in advance: the algorithm determines whether a noun/noun phrase is a feature by computing the PMI score between the phrase and class-specific discriminators through a Web search. Querying the Web is a problem, as discussed earlier. We will compare these two representative methods with our approach in the experiments.

In Scaffidi et al. [38], the authors proposed a language model approach to product feature extraction, with the assumption that product features are mentioned more often in a product review than they are mentioned in general English text. However, statistics may not be reliable when the corpus is small, as pointed out earlier.

The recent work by Kobayashi, Inui, and Matsumoto [39] focused on the aspect-evaluation (aspect and evaluation mean the opinion target and opinion word, respectively, in our context) and aspect-of extraction problems in blogs. Their aspect-evaluation extraction uses syntactic patterns learned via pattern mining to extract (aspect, evaluation) pairs. Our work differs from theirs in that we make use of syntactic relations from dependency trees. Additionally, we consider not only the relations of opinion targets and opinion words, but also many other types of relations.

In Stoyanov and Cardie [40], the authors treated target extraction as a topic coreference resolution problem. The key to their approach is to cluster opinions sharing the same target together. They proposed to train a classifier to judge whether two opinions are on the same target, which indicates that their approach is supervised. Our work differs from theirs in that our approach is semi-supervised. Other related work on target extraction mainly uses the idea of topic modeling to capture targets in reviews (Mei et al., 2007 [18]). Topic modeling models the generation of a document set and mines the implied topics in the documents. However, our experiments with topic modeling show that it is only able to find some general or coarse topics in texts and to represent them as clusters of words. Its aim is thus different from our fine-grained opinion target extraction task.

5. Conclusions and future work


With the popularity of smartphones and mobile devices, the mobile application (i.e., app) market has been growing exponentially in terms of users and downloads. App developers spend considerable time and effort on collecting and exploiting user feedback to improve user satisfaction. On the other hand, users use the comments/reviews to gain insight into the experiences, opinions, and sentiments of other users about specific features, and into descriptions of experiences with these features. However, for many apps, the amount of reviews is too large to be processed manually, and their quality varies largely. Therefore, we proposed a methodology that automatically extracts the 'Hot Features' of mobile apps from the reviewers' comments, mines the feelings of users towards those features, and recommends to users the mobile apps with similar hot features. The proposed methodology demonstrates an excellent balance between precision and recall in comparison to tree-based techniques and Naïve Bayes. In future research, we will improve the effectiveness and scalability of our method for mining social opinions on a wide range of products and services. Below is the list and details of our future research avenues.

• Custom-built sentiment classifier: Our goals for future improvements of the proposed approach initially involve the integration of a custom-built sentiment classifier into our proposed methodology. A further aim is to integrate a fully automatic ontology-building functionality, potentially through a combination of ontology learning techniques. Nevertheless, we also plan to experiment with manual and semi-automatic ontology creation approaches, as they offer a more controlled means of building the domain vocabulary (in our case, the app store).

• Adaptive learners: We have used four learners, LMT, Random Forest, Naïve Bayes, and the C4.5 decision tree, to compare the performance of our methodology. Most heuristics and learners presented in the literature learn on a single dimension (e.g., type of apps: browsers or mailers) or a combination of several dimensions (for example, a feature embedded in different categories of reviews, such as games, news, and weather). However, these heuristics do not adapt to take into account the vigorous nature of mobile app development and the dynamic nature of feature evolution. For example, for version 1 of an app, Random Forest (a tree-based classifier) may perform well on the historically collected review sets; however, Random Forest may perform badly at distinguishing a feature or an opinion among the reviews of a new version of the same app, i.e., version 2. We plan on exploiting an adaptive heuristic or machine learner. Such a family of heuristics/ML uses a best heuristic table (BHT) to ensure that the most optimal heuristic/ML is always used for discovering the features of an app. We have good experience of using adaptive heuristics, in a different capacity, for four large open-source software systems [41].

• Generalizability: The generalizability of our methodology is one of the most important extensions of our future work. Presently, we examined over one thousand reviews, across twelve apps from the Google Play Store, to identify 369 features. Since our proposed methodology helps users compare products (i.e., mobile apps in the context of our study) using product reviews, which are usually short and fall into the category of microblogs, we will extend our work to other crowdsource and microblog-centric social media. These include Twitter (a tweet is limited to 140 characters), YouTube, and Amazon. Harvesting data from these sites will also be a challenge and will open a new avenue for our future work. We have already laid out a framework to mine user comments for YouTube videos; we have over 80 million comments to date and the number is increasing [17]. We are in the process of building a platform for mining streams of Twitter data at ultra-large scale. Validating our approach on multiple sources will (a) establish evidence of its generalizability in capturing consumer opinions and gaining knowledge about consumer preferences and (b) gauge an unbiased representation of consumer sentiment towards services and brands.

• Sentiment topic recognition: It should be noted that while we presented a methodology to infer features and distinguish consumers' opinions, our methodology does not reveal the underlying reasons behind forming such opinions. Future research using sentiment topic recognition (STR) will be conducted to determine the most representative topics discussed behind each sentiment. Using STR, it should be possible to gain overall knowledge regarding the underlying causes of positive or negative sentiments. It should also be noted that while the lexicon-based approach used by our proposed methodology can detect basic sentiments, such an approach may sometimes fall short of recognizing the subtle forms of linguistic expression used in situations such as sarcasm, irony, or provocation. We plan to overcome this challenge as part of our future work.

References

[1] W. Maalej, D. Pagano, On the socialness of software, in: Proceedings of the IEEE Ninth International Conference on Dependable, Autonomic and Secure Computing, DASC, 2011, pp. 864–871.
[2] N. Seyff, F. Graf, N. Maiden, Using mobile RE tools to give end-users their own voice, in: 2010 18th IEEE International Requirements Engineering Conference, 2010, pp. 37–46.
[3] S. Ma, et al., Active semi-supervised approach for checking app behavior against its description, in: Computer Software and Applications Conference (COMPSAC), 2015 IEEE 39th Annual, 2015, pp. 179–184.
[4] A. Gorla, et al., Checking app behavior against app descriptions, in: Proceedings of the 36th International Conference on Software Engineering, 2014, pp. 1025–1035.
[5] D. Pagano, W. Maalej, User feedback in the appstore: An empirical study, in: 2013 21st IEEE International Requirements Engineering Conference, RE, 2013, pp. 125–134.
[6] M. Hu, B. Liu, Mining opinion features in customer reviews, in: AAAI, 2004, pp. 755–760.
[7] B. Pang, L. Lee, S. Vaithyanathan, Thumbs up?: Sentiment classification using machine learning techniques, in: Proceedings of the ACL-02 Conference on Empirical Methods in Natural Language Processing - Volume 10, 2002, pp. 79–86.
[8] N. Gupta, S. Chandra, Product Feature Discovery and Ranking for Sentiment Analysis from Online Reviews, 2013.
[9] G. Qiu, et al., Opinion word expansion and target extraction through double propagation, Comput. Linguist. 37 (2011) 9–27.
[10] B. Ganter, R. Wille, Applied lattice theory: Formal concept analysis, in: G. Grätzer (Ed.), General Lattice Theory, Birkhäuser, 1997.
[11] J.R. Quinlan, C4.5: Programs for Machine Learning, Elsevier, 2014.
[12] L. Breiman, Random forests, Mach. Learn. 45 (2001) 5–32.
[13] N. Landwehr, M. Hall, E. Frank, Logistic model trees, Mach. Learn. 59 (2005) 161–205.
[14] G.H. John, P. Langley, Estimating continuous distributions in Bayesian classifiers, in: Proceedings of the Eleventh Conference on Uncertainty in Artificial Intelligence, 1995, pp. 338–345.
[15] V. Bhavsar, H. Boley, L. Yang, A weighted-tree similarity algorithm for multi-agent systems in e-business environments, 2004.
[16] J. Wang, Y. Guo, Scrapy-based crawling and user-behavior characteristics analysis on Taobao, in: 2012 International Conference on Cyber-Enabled Distributed Computing and Knowledge Discovery, CyberC, 2012, pp. 44–52.
[17] N. Kaji, M. Kitsuregawa, Building lexicon for sentiment analysis from massive collection of HTML documents, in: EMNLP-CoNLL, 2007, pp. 1075–1083.
[18] Q. Mei, et al., Topic sentiment mixture: Modeling facets and opinions in weblogs, in: Proceedings of the 16th International Conference on World Wide Web, 2007, pp. 171–180.
[19] M. Thelwall, K. Buckley, G. Paltoglou, Sentiment strength detection for the social web, J. Amer. Soc. Inf. Sci. Technol. 63 (2012) 163–173.
[20] A. Popescu, O. Etzioni, Extracting product features and opinions from reviews, in: Natural Language Processing and Text Mining, Springer, 2007, pp. 9–28.
[21] E. Breck, Y. Choi, C. Cardie, Identifying expressions of opinion in context, in: IJCAI, 2007, pp. 2683–2688.
[22] H. Takamura, T. Inui, M. Okumura, Extracting semantic orientations of words using spin model, in: Proceedings of the 43rd Annual Meeting on Association for Computational Linguistics, 2005, pp. 133–140.
[23] H. Yu, V. Hatzivassiloglou, Towards answering opinion questions: Separating facts from opinions and identifying the polarity of opinion sentences, in: Proceedings of the 2003 Conference on Empirical Methods in Natural Language Processing, 2003, pp. 129–136.


[24] S. Kim, E. Hovy, Determining the sentiment of opinions, in: Proceedings of the 20th International Conference on Computational Linguistics, 2004, p. 1367.
[25] B. Pang, L. Lee, S. Vaithyanathan, Thumbs up?: Sentiment classification using machine learning techniques, in: Proceedings of the ACL-02 Conference on Empirical Methods in Natural Language Processing - Volume 10, 2002, pp. 79–86.
[26] P.D. Turney, Thumbs up or thumbs down?: Semantic orientation applied to unsupervised classification of reviews, in: Proceedings of the 40th Annual Meeting on Association for Computational Linguistics, 2002, pp. 417–424.
[27] V. Hatzivassiloglou, K.R. McKeown, Predicting the semantic orientation of adjectives, in: Proceedings of the Eighth Conference of the European Chapter of the Association for Computational Linguistics, 1997, pp. 174–181.
[28] J. Wiebe, Learning subjective adjectives from corpora, in: AAAI/IAAI, 2000, pp. 735–740.
[29] H. Kanayama, T. Nasukawa, Fully automatic lexicon expansion for domain-oriented sentiment analysis, in: Proceedings of the 2006 Conference on Empirical Methods in Natural Language Processing, 2006, pp. 355–363.
[30] J. Wiebe, et al., Learning subjective language, Comput. Linguist. 30 (2004) 277–308.
[31] P.D. Turney, M.L. Littman, Measuring praise and criticism: Inference of semantic orientation from association, ACM Trans. Inf. Syst. (TOIS) 21 (2003) 315–346.
[32] M. Hu, B. Liu, Mining and summarizing customer reviews, in: Proceedings of the Tenth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 2004, pp. 168–177.
[33] H. Takamura, T. Inui, M. Okumura, Extracting semantic orientations of words using spin model, in: Proceedings of the 43rd Annual Meeting on Association for Computational Linguistics, 2005, pp. 133–140.
[34] J. Kamps, et al., Using WordNet to measure semantic orientations of adjectives, 2004.
[35] A. Esuli, F. Sebastiani, Determining the semantic orientation of terms through gloss classification, in: Proceedings of the 14th ACM International Conference on Information and Knowledge Management, 2005, pp. 617–624.
[36] B. Liu, D. Towsley, A study of the coverage of large-scale sensor networks, in: 2004 IEEE International Conference on Mobile Ad-Hoc and Sensor Systems, 2004, pp. 475–483.
[37] B. Liu, Web Data Mining: Exploring Hyperlinks, Contents, and Usage Data, Springer Science & Business Media, 2007.
[38] C. Scaffidi, et al., Red Opal: Product-feature scoring from reviews, in: Proceedings of the 8th ACM Conference on Electronic Commerce, 2007, pp. 182–191.
[39] N. Kobayashi, K. Inui, Y. Matsumoto, Extracting aspect-evaluation and aspect-of relations in opinion mining, in: EMNLP-CoNLL, 2007, pp. 1065–1074.
[40] V. Stoyanov, C. Cardie, Topic identification for fine-grained opinion analysis, in: Proceedings of the 22nd International Conference on Computational Linguistics - Volume 1, 2008, pp. 817–824.
[41] H. Malik, A.E. Hassan, Supporting software evolution using adaptive change propagation heuristics, in: IEEE International Conference on Software Maintenance, Beijing, 2008, pp. 177–186.

Haroon Malik is an Assistant Professor at the Weisberg Division of Computer Science, Marshall University, USA. He has an extensive research background in software engineering, particularly in performance testing of ultra-large-scale systems, ambient technologies, and green computing. He has organized many conferences and workshops, such as Sensor Cloud (SC 2013), Large Scale Testing (LT 2013), and the International Conference on Ambient Systems (2013–2014). He has also served on the program committees of dozens of IEEE- and ACM-sponsored conferences.

Elhadi M. Shakshuki is a professor and Wheelock Chair in the Jodrey School of Computer Science at Acadia University, Canada. His research interests include intelligent agents, pervasive and ubiquitous computing, distributed systems, handheld computing, and wireless sensor networks. He is the founder and head of the Cooperative Intelligent Distributed Systems Group at the School of Computer Science, Acadia University. He received his B.Sc. degree in Computer Engineering in 1984 from Tripoli University, Libya, and his M.Sc. and Ph.D. degrees in Systems Design Engineering in 1994 and 2000, respectively, from the University of Waterloo, Canada. Prof. Shakshuki is the Editor-in-Chief of the International Journal of Ubiquitous Systems and Pervasive Networks. He serves on the editorial boards of several international journals and has contributed to many international conferences and workshops in different roles, as a program/general/steering conference chair and as a program committee member of numerous conferences and workshops. He has published over 200 research papers in international journals, conferences, and workshops. He is the founder of the following international conferences: ANT (2010–2017), EUSPN (2010–2017), FNC (2006–2017), ICTH (2011–2017), MobiSPC (2004–2017), and SEIT (2011–2017). He is also a founder of other international symposia and workshops. In addition, Prof. Shakshuki is the president of the International Association for Sharing Knowledge and Sustainability, Canada, and has guest co-edited over 30 international journal special issues. He is a senior member of IEEE, and a member of ACM, SIGMOD, IAENG, and APENS.

Wook-Sung Yoo is a Professor in the Weisberg Division of Computer Science at Marshall University. His research areas include artificial intelligence, image processing, optimization, informatics, and mobile/web applications. He was a chair of the IMIA (International Medical Informatics Association) Dental Informatics working group and was involved in various nationwide web and informatics projects.
