Recommender systems are part science, part art, and part technology. A recommender
system is intended to solve two particular tasks:
1. To predict the rating for an item or product that the user has not rated yet.
2. To create a list of the top N recommended items.
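The two tasks are related: once ratings have been predicted, a top-N list can be obtained by ranking the items the user has not rated yet. A minimal sketch, assuming hypothetical predicted scores and item names that are not part of these notes:

```python
# Hypothetical predicted ratings for one user (illustrative values only).
predicted = {"A": 4.2, "B": 3.1, "C": 4.8, "D": 2.5}
already_rated = {"A"}  # items the user has already rated are excluded

def top_n(predicted, already_rated, n):
    """Return the n unrated items with the highest predicted rating."""
    candidates = {item: score for item, score in predicted.items()
                  if item not in already_rated}
    return sorted(candidates, key=candidates.get, reverse=True)[:n]

top_two = top_n(predicted, already_rated, 2)
```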
USER BASED COLLABORATIVE FILTERING ALGORITHM
user. Weights are the similarities of these users with the target
item
1.1.1.2. Item-based
Calculates similarities between items.
Builds a model of item similarities: it retrieves all items rated by
the active user from the user-item matrix and determines how
similar each retrieved item is to the target item.
Then selects the k most similar items.
Takes the weighted average of the active user's ratings on the k most
similar items. Weights are the similarities of those items to the target item.
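The item-based steps can be sketched as follows. This is an illustration under stated assumptions, not a reference implementation: the ratings dictionary, the user and item names, and the choice of cosine similarity over co-rating users are all hypothetical.

```python
from math import sqrt

# Hypothetical toy data: ratings[user][item] on a 1-5 scale; absent = unrated.
ratings = {
    "u1": {"A": 5, "B": 4, "C": 1},
    "u2": {"A": 4, "B": 5, "C": 2, "D": 5},
    "u3": {"A": 1, "B": 2, "C": 5, "D": 1},
    "u4": {"A": 2, "C": 4, "D": 2},
}

def item_vector(item):
    """Ratings an item received, keyed by user."""
    return {u: r[item] for u, r in ratings.items() if item in r}

def cosine(x, y):
    """Cosine similarity over the users who rated both items."""
    common = set(x) & set(y)
    if not common:
        return 0.0
    dot = sum(x[u] * y[u] for u in common)
    norm_x = sqrt(sum(x[u] ** 2 for u in common))
    norm_y = sqrt(sum(y[u] ** 2 for u in common))
    return dot / (norm_x * norm_y)

def predict_item_based(user, target, k=2):
    """Weighted average of the user's ratings on the k items most similar to target."""
    target_vec = item_vector(target)
    sims = [(cosine(item_vector(item), target_vec), rating)
            for item, rating in ratings[user].items() if item != target]
    top_k = sorted(sims, reverse=True)[:k]
    total = sum(s for s, _ in top_k)
    return sum(s * r for s, r in top_k) / total if total else None

prediction = predict_item_based("u1", "D")
```

Because the result is a weighted average of the user's own ratings on the selected items, it always falls between the lowest and highest of those ratings.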
1.1.2.2. Cosine-based (linear algebra approach)
Measures the similarity between two n-dimensional vectors based on
the angle between them (projection-based). Unlike the correlation-based
measure, cosine similarity is based on linear algebra rather
than a statistical approach.
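\[ s(a,b) = \frac{\sum_{i} r_{a,i}\, r_{b,i}}{\sqrt{\sum_{i} r_{a,i}^{2}}\;\sqrt{\sum_{i} r_{b,i}^{2}}} \]
\[ p(a,i) = \frac{\sum_{b=1}^{n} s(a,b)\, r_{b,i}}{\sum_{b=1}^{n} s(a,b)} \]
Here r_{a,i} is the rating of user a on item i, s(a,b) is the similarity between
users a and b, and p(a,i) is the predicted rating of user a on item i, taken
over the n neighbors b.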
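As a small illustration of the similarity and prediction formulas, the sketch below computes the cosine similarity of two rating vectors and a similarity-weighted prediction. The function names and sample numbers are assumptions made for this example.

```python
from math import sqrt

def cosine_similarity(ra, rb):
    """s(a, b): cosine of the angle between two equal-length rating vectors."""
    dot = sum(x * y for x, y in zip(ra, rb))
    norm_a = sqrt(sum(x * x for x in ra))
    norm_b = sqrt(sum(y * y for y in rb))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def predict(similarities, neighbor_ratings):
    """p(a, i): similarity-weighted average of the neighbors' ratings on item i."""
    total = sum(similarities)
    return sum(s * r for s, r in zip(similarities, neighbor_ratings)) / total

sim_parallel = cosine_similarity([1, 2, 3], [2, 4, 6])  # same direction
sim_orthogonal = cosine_similarity([1, 0], [0, 1])      # perpendicular vectors
p = predict([0.5, 0.5], [4, 2])  # two equally similar neighbors
```

Vectors pointing in the same direction give a similarity close to 1 regardless of their length, which is why a user who rates everything twice as high still counts as similar under this measure.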
How? Given a set of transactions, where each transaction is a set
of items, an association rule takes the form A → B: if A is in the
transaction, then B is likely to be in it as well.
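How likely B is given A is commonly quantified with the support and confidence of a rule. These two measures are standard in association rule mining but are not defined in these notes, so treat the sketch below as an assumption-laden illustration; the transactions are made up.

```python
# Hypothetical transactions; each one is a set of items.
transactions = [
    {"bread", "milk"},
    {"bread", "butter", "milk"},
    {"bread", "butter"},
    {"milk"},
]

def support(itemset):
    """Fraction of transactions that contain every item in itemset."""
    return sum(itemset <= t for t in transactions) / len(transactions)

def confidence(a, b):
    """Estimated probability that B is in a transaction given that A is."""
    return support(a | b) / support(a)

rule_confidence = confidence({"bread"}, {"milk"})  # rule: bread -> milk
```

Here bread appears in three of the four transactions and bread together with milk in two, so the rule bread → milk holds in two out of three cases where bread is present.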
1.2.2. Clustering
Tries to partition a set of data into a set of clusters to discover
meaningful groups within it.
Once clusters have been formed, the opinions of other users in a
cluster can be averaged and used to make recommendations for
individual users.
K-means and the Self-Organizing Map (SOM) are the most popular
clustering methods.
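A minimal K-means sketch of this idea, assuming fully rated toy vectors and fixed starting centroids (both hypothetical): each centroid is the average opinion of its cluster, and a new user is given the centroid value of the nearest cluster for the item not yet rated.

```python
def squared_distance(x, y):
    """Squared Euclidean distance over the shared leading dimensions."""
    return sum((a - b) ** 2 for a, b in zip(x, y))

def kmeans(points, centroids, iterations=10):
    """Plain K-means: assign points to the nearest centroid, then re-average."""
    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)),
                          key=lambda c: squared_distance(p, centroids[c]))
            clusters[nearest].append(p)
        centroids = [
            [sum(col) / len(cluster) for col in zip(*cluster)] if cluster else old
            for cluster, old in zip(clusters, centroids)
        ]
    return centroids

# Hypothetical users, each with ratings on three items.
users = [[5, 4, 5], [4, 5, 4], [1, 2, 1], [2, 1, 2]]
centroids = kmeans(users, centroids=[users[0], users[2]])

# A new user has rated only the first two items; compare on those two only.
new_user = [5, 5]
nearest = min(centroids, key=lambda c: squared_distance(new_user, c[:2]))
predicted_third_item = nearest[2]  # the cluster's average rating on item 3
```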
1.2.5. Regression
Regression analysis is used when two or more variables are
thought to be linearly related.
It is usually used for curve fitting, prediction, and hypothesis
testing about relationships between variables.
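For the simplest case of two linearly related variables, an ordinary least-squares line can be fitted in a few lines. The data points below are invented for illustration and lie exactly on y = 2x + 1.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical observations.
xs = [1, 2, 3, 4]
ys = [3, 5, 7, 9]
slope, intercept = fit_line(xs, ys)
prediction = slope * 5 + intercept  # predicted y for an unseen x = 5
```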
1.2.7. Matrix Completion Technique
The objective of MCT is to predict the unknown values within the
user-item matrix.
Correlation-based K-nearest neighbor is one of the major
techniques in collaborative filtering.
K-nearest neighbor depends on the historical rating data of users on
items.
The rating matrix is typically very large and sparse because users do
not rate most of the items in the matrix.
There is therefore a need to analyze the low-rank, partially observed
matrix through Alternating Least Squares (ALS).
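To make the ALS idea concrete, here is a deliberately small sketch restricted to a rank-1 model (one latent factor per user and per item), which is an assumption made to keep the example short. The matrix, the regularization value `lam`, and the iteration count are hypothetical. Each step fixes one set of factors and solves the regularized least-squares problem for the other, using observed entries only.

```python
def als_rank1(R, iterations=50, lam=0.01):
    """Rank-1 ALS on a partially observed matrix (None marks a missing rating)."""
    n_users, n_items = len(R), len(R[0])
    u = [1.0] * n_users  # user factors
    v = [1.0] * n_items  # item factors
    for _ in range(iterations):
        # Fix item factors, solve for each user factor in closed form.
        for i in range(n_users):
            obs = [j for j in range(n_items) if R[i][j] is not None]
            u[i] = (sum(v[j] * R[i][j] for j in obs)
                    / (sum(v[j] ** 2 for j in obs) + lam))
        # Fix user factors, solve for each item factor in closed form.
        for j in range(n_items):
            obs = [i for i in range(n_users) if R[i][j] is not None]
            v[j] = (sum(u[i] * R[i][j] for i in obs)
                    / (sum(u[i] ** 2 for i in obs) + lam))
    return u, v

# Hypothetical matrix whose second row is twice the first; one entry is hidden.
R = [
    [2.0, 1.0, None],
    [4.0, 2.0, 5.0],
]
u, v = als_rank1(R)
predicted_missing = u[0] * v[2]  # estimate for the hidden entry R[0][2]
```

Since the observed entries are consistent with a rank-1 matrix, the estimate for the hidden entry should land near 2.5, slightly shrunk by the regularization term.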
1.2.8. Latent Factor Models
1.2.9. Singular Value Decomposition
Case 1: With Ratings (apply algorithms for real ratings data; memory-based)
2. Content-based Filtering
- Recommendation is based on a user profile built from features of the content of the
items the user has evaluated in the past.
- Does not require the profiles of other users, since
they do not influence the recommendation.
- Items most closely related to the positively
rated items are recommended to the user.
- Most successful for web pages, publications, and
news recommendations.
- Advantage: It can recommend new items
even if no ratings have been provided by users.
If the user's preferences change, it can
adjust its recommendations
quickly.
- Disadvantage: Needs in-depth
knowledge and description of the features of the
items in the profile.
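A content-based recommender along these lines can be sketched by averaging the feature vectors of the items a user rated positively and ranking the remaining items by similarity to that profile. The item names, the three genre features, and the use of cosine similarity are assumptions made for this example.

```python
from math import sqrt

# Hypothetical item features: [action, comedy, drama].
items = {
    "Movie A": [1, 0, 0],
    "Movie B": [1, 1, 0],
    "Movie C": [0, 0, 1],
    "Movie D": [1, 0, 1],
}
liked = ["Movie A", "Movie B"]  # items the user rated positively

# User profile: the average feature vector of the liked items.
profile = [sum(items[m][f] for m in liked) / len(liked) for f in range(3)]

def cosine(x, y):
    """Cosine similarity between two feature vectors."""
    dot = sum(a * b for a, b in zip(x, y))
    norm_x = sqrt(sum(a * a for a in x))
    norm_y = sqrt(sum(b * b for b in y))
    return dot / (norm_x * norm_y) if norm_x and norm_y else 0.0

# Score only the items the user has not evaluated yet.
scores = {m: cosine(profile, v) for m, v in items.items() if m not in liked}
best = max(scores, key=scores.get)
```

No other user's ratings appear anywhere in the computation, which is why a new item can be recommended as soon as its features are known.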
II. Evaluation Metrics for Recommendation Algorithms
1. Mean Absolute Error (MAE): the most popular and commonly used metric. It measures the
deviation of a recommendation from the user's actual rating. The lower the MAE, the
more accurately the recommendation engine predicts user ratings.
2. Root Mean Square Error (RMSE): puts more emphasis on larger absolute errors. The lower
the RMSE, the better the recommendation accuracy.
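In their standard form, with p_i the predicted rating and r_i the actual rating:
\[ \mathrm{MAE} = \frac{1}{n}\sum_{i=1}^{n} \lvert p_i - r_i \rvert, \qquad \mathrm{RMSE} = \sqrt{\frac{1}{n}\sum_{i=1}^{n} (p_i - r_i)^{2}} \]
where n is the total number of ratings in the item set.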
3. Reversal rate
4. Weighted errors
5. Receiver Operating Characteristic (ROC) curve
6. Precision-Recall Curve (PRC)
7. Precision, recall, and F-measure.
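Several of the metrics listed above are short enough to compute directly. The sketch below uses the standard definitions of MAE, RMSE, precision, recall, and the F-measure (here the balanced F1 variant, which is an assumption); the sample ratings and item labels are invented.

```python
from math import sqrt

def mae(predicted, actual):
    """Mean absolute error between predicted and actual ratings."""
    return sum(abs(p - a) for p, a in zip(predicted, actual)) / len(actual)

def rmse(predicted, actual):
    """Root mean square error; squaring penalizes large errors more heavily."""
    return sqrt(sum((p - a) ** 2 for p, a in zip(predicted, actual)) / len(actual))

def precision_recall_f1(recommended, relevant):
    """Precision, recall, and F1 of a recommendation list against relevant items."""
    hits = len(set(recommended) & set(relevant))
    precision = hits / len(recommended)
    recall = hits / len(relevant)
    f1 = 2 * precision * recall / (precision + recall) if hits else 0.0
    return precision, recall, f1

# Hypothetical predictions versus the ratings users actually gave.
predicted = [4.0, 3.0, 5.0, 2.0]
actual = [5.0, 3.0, 4.0, 4.0]
mae_value = mae(predicted, actual)
rmse_value = rmse(predicted, actual)

# Hypothetical top-4 list versus the items the user actually found relevant.
precision, recall, f1 = precision_recall_f1(["A", "B", "C", "D"], ["A", "C", "E"])
```

On this sample the RMSE comes out larger than the MAE because the single error of 2 is squared before averaging.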
2. Time-Sensitive Recommender Systems. The recommendations for a movie may be very different at the time
of release from the recommendations received several years later. In such cases, it
is extremely important to incorporate temporal knowledge in the recommendation
process.
5. Other challenges
a. Scalability
b. Proactivity (when and how to push recommendations)
c. Privacy
d. Diversity: diversity of the items recommended
e. Integration: integration of the long-term and short-term preferences of
customers.
f. Device: the system should operate on any device
REFERENCES:
READ:
User-based Collaborative Filtering
http://www.dataperspective.info/2014/05/basic-recommendation-engine-using-r.html
http://bigdata-doctor.com/recommender-systems-101-practical-example-in-r/
https://ashokharnal.wordpress.com/2014/12/18/using-recommenderlab-for-predicting-ratings-for-movielens-data/
http://stackoverflow.com/questions/17610104/how-recommenderlab-of-r-culculate-the-ratings-of-each-item-in-ratingmatrix
FURTHER RESEARCH:
Choi, K., & Suh, Y. (2013). A new similarity function for selecting neighbors for each target
item in collaborative filtering. Knowledge-Based Systems, 37, 146-153.
Liu, H., Hu, Z., Mian, A., Tian, H., & Zhu, X. (2014). A new user similarity model to improve
the accuracy of collaborative filtering. Knowledge-Based Systems, 56, 156-166.
Bobadilla, J., Ortega, F., Hernando, A., & Glez-de-Rivera, G. (2013). A similarity metric
designed to speed up, using hardware, the recommender systems k-nearest neighbors
algorithm. Knowledge-Based Systems, 51, 27-34.
Lathia, N., Hailes, S., & Capra, L. (2008, March). The effect of correlation coefficients on
communities of recommenders. In Proceedings of the 2008 ACM symposium on Applied
computing (pp. 2000-2005). ACM.