
Recommender Systems and Personalization
Session 6
(Recommender System Modeling Methods: Content-based)

Noor Ifada
noor.ifada@trunojoyo.ac.id

Bachelor's Program (S1) in Informatics Engineering – Universitas Trunojoyo Madura (UTM), Even Semester 2019-2020


Subtopics
Content-based

Reference:
Aggarwal, C. C. (2016). Recommender Systems: The Textbook, Chapter 4. Springer International Publishing, Switzerland.



Recommender System Modeling Methods
Recommender system models:
• Collaborative Filtering
◦ Memory-based (Neighbourhood-based): User-based, Item-based
◦ Model-based: Decision and Regression Trees, Rule-based, Naive Bayes, Neural Network, Latent Factor, Integrating Factorization and Neighbourhood
• Content-based
• Knowledge-based
• Demographic
• Hybrid and Ensemble-based



Content-based
• Collaborative Filtering (CF) → does not require item information/descriptions
• Content-based (CB):
◦ exploits item information/descriptions
◦ recommends to the target user items that are similar to items the user has previously chosen/liked
• Required data:
1. Descriptions of item content → e.g., film synopsis, film genre
2. User profiles that describe user preferences (based on items the user has previously chosen/liked) → e.g., ratings (explicit feedback), click history (implicit feedback)
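
The idea of recommending items similar to what a user already liked can be sketched as follows. This is a minimal illustration, not the method from the reference: the film names, genres, and the scoring rule (summing genre counts from the user's liked items) are all assumptions made for the example.

```python
from collections import Counter

# Hypothetical catalogue: each film is described by its genres (item content).
films = {
    "film_a": {"action", "sci-fi"},
    "film_b": {"action", "thriller"},
    "film_c": {"romance", "comedy"},
    "film_d": {"sci-fi", "thriller"},
}

def build_profile(liked, catalogue):
    """User profile: how often each genre occurs among the items the user liked."""
    return Counter(genre for title in liked for genre in catalogue[title])

def recommend(liked, catalogue, top_n=1):
    """Score unseen items by summing the profile counts of their genres."""
    profile = build_profile(liked, catalogue)
    scores = {
        title: sum(profile[genre] for genre in genres)
        for title, genres in catalogue.items()
        if title not in liked
    }
    return sorted(scores, key=scores.get, reverse=True)[:top_n]

liked = ["film_a", "film_b"]  # hypothetical implicit feedback, e.g. click history
top = recommend(liked, films)
```

Note that no other user's data is involved: the profile is built only from the target user's own history and the item descriptions.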



Content-based [2]
1. Item content descriptions
• Spaghetti is a long, thin, cylindrical, solid pasta. It is a staple food of traditional Italian cuisine. Like other pasta, spaghetti is made of milled wheat and water. Italian spaghetti is made from durum wheat semolina, but elsewhere it may be made with other kinds of flour.
• Lasagna is a wide, flat-shaped pasta; the name commonly refers to a dish made with several layers of lasagne sheets alternated with sauces and other ingredients, such as meats and cheese.
• Ice cream cone is a dry, cone-shaped pastry, usually made of a wafer similar in texture to a waffle, which enables ice cream to be held in the hand and eaten without a bowl or spoon. Various types of ice cream cones include wafer (or cake) cones, waffle cones, and sugar cones.
• Coke is a drink that typically contains carbonated water (although orange soda and lemonade, among others, are usually not carbonated), a sweetener, and a natural or artificial flavoring. The sweetener may be sugar, high-fructose corn syrup, fruit juice, sugar substitutes (in the case of diet drinks), or some combination of these. Soft drinks may also contain caffeine, colorings, preservatives, and other ingredients.
2. User profiles (user preferences)
[Figure: table with columns user, rating, item. The ratings shown are 4, 4, 2, 3, 5, 4; the user and item pictures are not reproduced.]



Components of a Content-based System
1. Preprocessing & feature extraction
2. Learning user profiles & recommendation



Preprocessing & Feature Extraction
1. Feature Extraction
◦ Data: descriptions of item content
◦ Domain-specific
◦ Representation: keyword-based vector-space, multidimensional (structured)
◦ NOTE:
An item may have several attributes that describe different aspects of it. Example: a book item may be described by:
title
author
genre
keywords
Weights are usually applied to each attribute to facilitate their use in the classification process.
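
Attribute weighting can be sketched as below. This is only an illustration under stated assumptions: the weight values in `ATTRIBUTE_WEIGHTS` and the `item_vector` helper are hypothetical, and in practice the weights would be tuned or learned rather than fixed by hand.

```python
from collections import defaultdict

# Hypothetical attribute weights (assumption): genre counts most, title least.
ATTRIBUTE_WEIGHTS = {"title": 1.0, "author": 2.0, "genre": 3.0, "keywords": 1.5}

def item_vector(item, weights=ATTRIBUTE_WEIGHTS):
    """Map a structured item description to a sparse {"attribute:value": weight} vector."""
    vector = defaultdict(float)
    for attribute, values in item.items():
        for value in values:
            # Unknown attributes fall back to a neutral weight of 1.0.
            vector[f"{attribute}:{value.lower()}"] += weights.get(attribute, 1.0)
    return dict(vector)

# A book described by several attributes, each holding one or more values.
book = {
    "title": ["The Lace Reader"],
    "author": ["Brunonia Barry"],
    "genre": ["Fiction", "Mystery"],
    "keywords": ["American contemporary fiction", "detective", "historical"],
}
vector = item_vector(book)
```

Prefixing each feature with its attribute name keeps, for example, the genre "fiction" distinct from a keyword containing the same word.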



Preprocessing & Feature Extraction [2]
• Example of item content descriptions

Title | Genre | Author | Type | Price | Keywords
The Night of the Gun | Memoir | David Carr | Paperback | 29.90 | Press and journalism, drug addiction, personal memoirs, New York
The Lace Reader | Fiction, Mystery | Brunonia Barry | Hardcover | 49.90 | American contemporary fiction, detective, historical
Into the Fire | Romance, Suspense | Suzanne Brockmann | Hardcover | 45.90 | American fiction, murder, neo-Nazism



Preprocessing & Feature Extraction [3]
1. Feature Extraction
2. Feature Representation & Cleaning
• The feature extraction phase can determine the bags of words from the item descriptions
• Cleaning is needed to obtain a representation that can be used in the subsequent steps:
i. Stop-word removal: removes words that carry no item-specific meaning → articles (a, an, the, ...), prepositions (behind, below, beside, ...), conjunctions (and, or, ...), pronouns (I, me, mine, myself, she, her, ...)
ii. Stemming: consolidates variations of the same word → singular or plural forms, tenses
iii. Phrase extraction: detects words that occur together in documents (based on how frequently they co-occur) → hot dog, United Nations, ...
• Representation: vector-space model → bag of words & their frequencies (TF-IDF)
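
The cleaning and representation steps above can be sketched in plain Python. Everything here is a simplifying assumption for illustration: the tiny `STOP_WORDS` list, the `naive_stem` helper (it only strips a plural "s" and is not a real stemmer), and the TF-IDF variant `tf * log(N / df)`, which is one common choice among several. Phrase extraction is left out.

```python
import math
import re
from collections import Counter

# Hypothetical, deliberately tiny stop-word list (a real system would use a fuller one).
STOP_WORDS = {"a", "an", "the", "of", "is", "it", "and", "or", "with", "to", "in"}

def naive_stem(word):
    """Very rough stemmer (assumption): strips a trailing plural 's' only."""
    if len(word) > 3 and word.endswith("s") and not word.endswith("ss"):
        return word[:-1]
    return word

def tokenize(text):
    """Lowercase, keep alphabetic tokens, drop stop words, stem what remains."""
    words = re.findall(r"[a-z]+", text.lower())
    return [naive_stem(w) for w in words if w not in STOP_WORDS]

def tf_idf(docs):
    """Return one {term: weight} dict per document, using tf * log(N / df)."""
    bags = [Counter(tokenize(d)) for d in docs]
    n_docs = len(bags)
    # Iterating a Counter yields each distinct term once, so this counts documents.
    doc_freq = Counter(term for bag in bags for term in bag)
    return [
        {term: count * math.log(n_docs / doc_freq[term]) for term, count in bag.items()}
        for bag in bags
    ]

# Shortened, paraphrased item descriptions used only for this example.
descriptions = [
    "Spaghetti is a long, thin pasta of Italian cuisine",
    "Lasagna is a wide, flat pasta layered with sauces",
    "Coke is a carbonated drink with a sweetener",
]
vectors = tf_idf(descriptions)
```

A term that occurs in every description gets weight 0 under this variant, while a term unique to one description gets the highest weight.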



Preprocessing & Feature Extraction [4]
1. Feature Extraction
2. Feature Representation & Cleaning
3. Collecting User Likes & Dislikes
• Used to obtain user profiles (user preferences)
• Data forms: ratings (explicit feedback), implicit feedback, text opinions
4. Supervised Feature Selection & Weighting
• Ensures that only the most important words are used in the vector-space model representation
• Methods: Gini Index, Entropy, χ²-Statistic, Normalized Deviation, Feature Weighting
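
As an illustration of supervised feature selection, the sketch below computes a Gini index per word, using the common definition Gini(w) = 1 - Σ p_i(w)², where p_i(w) is the fraction of items containing w that received rating value i. The `gini_index` helper and the toy feedback are assumptions made for the example; a lower value means the word discriminates better between rating values.

```python
from collections import Counter

def gini_index(word, rated_items):
    """Gini(w) = 1 - sum_i p_i(w)^2 over the ratings of items whose words include w."""
    ratings = [rating for words, rating in rated_items if word in words]
    if not ratings:
        return None  # the word never occurs, so the index is undefined
    total = len(ratings)
    return 1.0 - sum((count / total) ** 2 for count in Counter(ratings).values())

# Hypothetical toy feedback: (set of words in the item description, rating).
rated_items = [
    ({"pasta", "italian", "wheat"}, "like"),
    ({"pasta", "cheese", "sauce"}, "like"),
    ({"cone", "wafer", "sugar"}, "dislike"),
    ({"drink", "sugar", "caffeine"}, "like"),
]

pasta_gini = gini_index("pasta", rated_items)  # always "like": fully discriminative
sugar_gini = gini_index("sugar", rated_items)  # split evenly: least discriminative
```

Words with a high index would be dropped from the vector-space representation, and the remaining words kept as features.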

Learning User Profiles & Recommendation
• Content-based models are user-specific
◦ they predict a user's preference for an item from user feedback → that user's preference history (e.g., purchase data or item ratings)
◦ User feedback is combined with the item descriptions to build the training data
• The learning model is constructed from the training data



Learning User Profiles & Recommendation [2]
• Example of training data (book recommender system):
◦ Item (content) descriptions

Title | Genre | Author | Type | Price | Keywords
The Night of the Gun | Memoir | David Carr | Paperback | 29.90 | Press and journalism, drug addiction, personal memoirs, New York
The Lace Reader | Fiction, Mystery | Brunonia Barry | Hardcover | 49.90 | American contemporary fiction, detective, historical
Into the Fire | Romance, Suspense | Suzanne Brockmann | Hardcover | 45.90 | American fiction, murder, neo-Nazism

◦ User profiles/User feedback/User preferences

Title | Genre | Author | Type | Price | Keywords
… | Fiction | Brunonia Barry, Ken Follett | Paperback | 25.65 | Detective, murder, New York
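
Combining item descriptions with user feedback into training data might look like the sketch below. The item features are taken from the genre and keyword columns of the example table; the `feedback` labels and both helper functions are hypothetical and not part of the slides.

```python
# Item descriptions from the example table, reduced to genre and keywords.
items = {
    "The Night of the Gun": {"memoir", "press and journalism", "drug addiction",
                             "personal memoirs", "new york"},
    "The Lace Reader": {"fiction", "mystery", "american contemporary fiction",
                        "detective", "historical"},
    "Into the Fire": {"romance", "suspense", "american fiction", "murder",
                      "neo-nazism"},
}

# Hypothetical feedback from one user (1 = liked, 0 = disliked); an assumption.
feedback = {"The Lace Reader": 1, "The Night of the Gun": 0}

def build_training_data(items, feedback):
    """Pair each rated item's feature set with the user's label for it."""
    return [(items[title], label) for title, label in feedback.items() if title in items]

def unrated_items(items, feedback):
    """Items without feedback are the candidates to score and recommend."""
    return [title for title in items if title not in feedback]

training_data = build_training_data(items, feedback)
candidates = unrated_items(items, feedback)
```

Because the model is user-specific, one such training set is built per user, and the learned model is then applied to that user's unrated items.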



Learning User Profiles & Recommendation [3]
• Learning methods:
◦ Nearest Neighbour Classification
◦ Bayes Classifier
◦ Rule-based Classifiers
◦ Regression-based Models
• The learned user profiles are used to generate the list of recommended items for the target user
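
Of the methods listed, nearest neighbour classification is the simplest to sketch. The example below is one possible variant under stated assumptions: cosine similarity over binary term vectors, an unweighted average of the k most similar rated items, and made-up item names and ratings for a single target user.

```python
import math

def cosine(u, v):
    """Cosine similarity between two sparse {term: weight} vectors."""
    dot = sum(weight * v.get(term, 0.0) for term, weight in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

def predict_rating(candidate, rated, k=2):
    """Average the ratings of the k rated items most similar to the candidate."""
    neighbours = sorted(rated, key=lambda pair: cosine(candidate, pair[0]), reverse=True)[:k]
    return sum(rating for _, rating in neighbours) / len(neighbours)

def recommend(candidates, rated, top_n=1, k=2):
    """Rank unrated items for one user by predicted rating, highest first."""
    scored = {name: predict_rating(vec, rated, k) for name, vec in candidates.items()}
    return sorted(scored, key=scored.get, reverse=True)[:top_n]

def binary(*terms):
    """Helper (assumption): binary term vector with weight 1.0 per term."""
    return {term: 1.0 for term in terms}

# Hypothetical training data for one user: (item vector, rating).
rated = [
    (binary("pasta", "italian", "wheat"), 4),
    (binary("pasta", "cheese", "sauce"), 4),
    (binary("cone", "wafer", "sugar"), 2),
    (binary("drink", "sugar", "caffeine"), 1),
]
# Hypothetical unrated items to score.
candidates = {
    "ravioli": binary("pasta", "cheese", "italian"),
    "lemonade": binary("drink", "sugar", "lemon"),
}
top = recommend(candidates, rated)
```

In a fuller pipeline the binary vectors would be replaced by TF-IDF weights, and the average could be weighted by similarity.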



Review of Content-based
• Strengths:
◦ user independence: does not depend on other users
◦ transparency: recommendations are easy to explain and understand
◦ solves the cold-start problem for items → new items can still be recommended
• Limitations:
◦ limited content analysis:
▪ Content descriptions cannot be extracted automatically
▪ No domain knowledge
▪ Insufficient keyword data
◦ overspecialization: recommended items are always of the same kind and too similar to one another
◦ cold-start problem for users → for a new user, there are no ratings or other information about that user



Recommendation Model

[Diagram: Content-based, Collaborative Filtering]



Independent Assignment
• Study a simple example of a content-based recommender system:
https://www.analyticsvidhya.com/blog/2015/08/beginners-guide-learn-content-based-recommender-systems/


