
Homework Due Lesson 6 – Association Rules

All homework will be team homework, but all team members must know the material.
Submit all homework to both the instructor and the graduate assistant, Md Ali (ma03901n@pace.edu).
---------------------------**************--------------------------
Complete Association Rules exercise 4 (end of chapter, page 159 in the textbook) manually.
A local retailer has a database that stores 10,000 transactions from last summer. After
analyzing the data, a data science team has identified the following statistics:
{battery} appears in 6,000 transactions.
{sunscreen} appears in 5,000 transactions.
{sandals} appears in 4,000 transactions.
{bowls} appears in 2,000 transactions.
{battery,sunscreen} appears in 1,500 transactions.
{battery,sandals} appears in 1,000 transactions.
{battery,bowls} appears in 250 transactions.
{battery,sunscreen,sandals} appears in 600 transactions.
Answer the following questions:
1. What are the support values of the preceding itemsets?

{battery} appears in 6,000 transactions. So, support (battery) = 6000/10000 = 0.6
{sunscreen} appears in 5,000 transactions. So, support (sunscreen)= 5000/10000 = 0.5
{sandals} appears in 4,000 transactions. So, support (sandals)= 4000/10000 = 0.4
{bowls} appears in 2,000 transactions. So, support (bowls)= 2000/10000 = 0.2
{battery,sunscreen} appears in 1,500 transactions. So, support (battery,sunscreen)= 1500/10000 =
0.15
{battery,sandals} appears in 1,000 transactions. So, support (battery,sandals)= 1000/10000 = 0.1
{battery,bowls} appears in 250 transactions. So, support (battery,bowls)= 250/10000 = 0.025
{battery,sunscreen,sandals} appears in 600 transactions. So, support (battery,sunscreen,sandals)=
600/10000 = 0.06
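
These values can be double-checked with a few lines of R (a minimal sketch; the vector names counts and supports are just illustrative):

# transaction counts for each itemset, out of 10,000 total transactions
counts <- c(battery = 6000, sunscreen = 5000, sandals = 4000, bowls = 2000,
            "battery,sunscreen" = 1500, "battery,sandals" = 1000,
            "battery,bowls" = 250, "battery,sunscreen,sandals" = 600)
supports <- counts / 10000   # support = count / total number of transactions
print(supports)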

2. Assuming the minimum support is 0.05, which item sets are considered frequent?

A frequent itemset must have support greater than or equal to the minimum support. With a
minimum support of 0.05, the itemsets {battery}, {sunscreen}, {sandals}, {bowls},
{battery,sunscreen}, {battery,sandals}, and {battery,sunscreen,sandals} are frequent.
Only {battery,bowls} (support 0.025) is not a frequent itemset.
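
The same check can be scripted (a small sketch reusing the support values from Question 1; the names are illustrative):

supports <- c(battery = 0.6, sunscreen = 0.5, sandals = 0.4, bowls = 0.2,
              "battery,sunscreen" = 0.15, "battery,sandals" = 0.1,
              "battery,bowls" = 0.025, "battery,sunscreen,sandals" = 0.06)
frequent <- supports[supports >= 0.05]   # keep itemsets at or above the 0.05 threshold
names(frequent)                          # everything except "battery,bowls"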

3. What are the confidence values of {battery}→{sunscreen} and {battery, sunscreen}→{sandals}? Which of the two rules is more interesting?

Confidence (battery→sunscreen) = support (battery, sunscreen) / support (battery) = 0.15/0.6 = 0.25,
which means that 25% of the time a customer buys battery, sunscreen is bought as well.
Confidence ({battery, sunscreen}→{sandals}) = support (battery, sunscreen, sandals) / support
(battery, sunscreen) = 0.06/0.15 = 0.4, which means that 40% of the time a customer buys battery
and sunscreen together, sandals are bought as well.
The second rule, {battery, sunscreen}→{sandals}, is more interesting because its confidence (0.4)
is higher than that of the first rule (0.25).
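
As a quick sanity check, the two confidence values can be reproduced in R (a minimal sketch using the support values computed above; the variable names are illustrative):

conf_battery_sunscreen <- 0.15 / 0.6            # support(battery,sunscreen) / support(battery) = 0.25
conf_battery_sunscreen_sandals <- 0.06 / 0.15   # support(battery,sunscreen,sandals) / support(battery,sunscreen) = 0.4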

4. List all the candidate rules that can be formed from the statistics. Which rules are considered
interesting at the minimum confidence 0.25? Out of these interesting rules, which rule is
considered the most useful (that is, least coincidental)?

Support (as computed in Question 1):
support(battery) = 0.6, support(sunscreen) = 0.5, support(sandals) = 0.4, support(bowls) = 0.2,
support(battery,sunscreen) = 0.15, support(battery,sandals) = 0.1, support(battery,bowls) = 0.025,
support(battery,sunscreen,sandals) = 0.06

Confidence:
Confidence(x→y)=support(x,y)/support(x)
Confidence (battery → sunscreen) = support (battery, sunscreen)/ support (battery) = 0.15/0.6= 0.25
Confidence (sunscreen → battery) = support (battery, sunscreen)/ support (sunscreen) = 0.15/0.5=
0.3
Confidence (battery → sandals) = support (battery, sandals) / support (battery) = 0.1/0.6 ≈ 0.17
Confidence (sandals → battery) = support (battery, sandals) / support (sandals) = 0.1/0.4=0.25
Confidence (battery → bowls) = support (battery, bowls)/ support (battery) = 0.025/0.6= 0.042
Confidence (bowls → battery) = support (battery, bowls)/ support (bowls) = 0.025/0.2= 0.125
Confidence ({battery}→{sunscreen, sandals }) = support (battery, sunscreen, sandals )/ support
(battery) = 0.06/0.6=0.1
Confidence ({sunscreen}→{battery, sandals}) = support (battery, sunscreen, sandals )/ support
(sunscreen) = 0.06/0.5=0.12
Confidence ({battery, sandals}→{sunscreen}) = support (battery, sunscreen, sandals )/ support
(battery, sandals) = 0.06/0.1=0.6
Confidence ({sandals}→{battery, sunscreen }) = support (battery, sunscreen, sandals )/ support
(sandals) = 0.06/ 0.4=0.15
Confidence ({battery, sunscreen}→{sandals}) = support (battery, sunscreen, sandals )/ support
(battery, sunscreen) = 0.06/0.15= 0.4
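
All of these confidence values can also be computed with a small helper function (a sketch; the function name conf is just for illustration):

conf <- function(s_xy, s_x) s_xy / s_x   # confidence(X -> Y) = support(X,Y) / support(X)
conf(0.15, 0.6)    # battery -> sunscreen             = 0.25
conf(0.15, 0.5)    # sunscreen -> battery             = 0.30
conf(0.10, 0.6)    # battery -> sandals               ≈ 0.17
conf(0.10, 0.4)    # sandals -> battery               = 0.25
conf(0.025, 0.6)   # battery -> bowls                 ≈ 0.042
conf(0.025, 0.2)   # bowls -> battery                 = 0.125
conf(0.06, 0.6)    # battery -> {sunscreen, sandals}  = 0.10
conf(0.06, 0.5)    # sunscreen -> {battery, sandals}  = 0.12
conf(0.06, 0.4)    # sandals -> {battery, sunscreen}  = 0.15
conf(0.06, 0.1)    # {battery, sandals} -> sunscreen  = 0.60
conf(0.06, 0.15)   # {battery, sunscreen} -> sandals  = 0.40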

So, considering the minimum confidence of 0.25, the interesting rules are:
• Confidence (battery → sunscreen) = support (battery, sunscreen) / support (battery) =
0.15/0.6 = 0.25 [there is a 25% chance that a customer will buy sunscreen, given that the
customer buys battery]
• Confidence (sunscreen → battery) = support (battery, sunscreen) / support (sunscreen) =
0.15/0.5 = 0.3 [there is a 30% chance that a customer will buy battery, given that the
customer buys sunscreen]
• Confidence (sandals → battery) = support (battery, sandals) / support (sandals) =
0.1/0.4 = 0.25 [there is a 25% chance that a customer will buy battery, given that the
customer buys sandals]
• Confidence ({battery, sandals}→{sunscreen}) = support (battery, sunscreen, sandals) /
support (battery, sandals) = 0.06/0.1 = 0.6 [there is a 60% chance that a customer will buy
sunscreen, given that the customer buys battery and sandals together]
• Confidence ({battery, sunscreen}→{sandals}) = support (battery, sunscreen, sandals) /
support (battery, sunscreen) = 0.06/0.15 = 0.4 [there is a 40% chance that a customer will buy
sandals, given that the customer buys battery and sunscreen together]

Lift:
Lift(x → y) = support(x, y) / (support(x) * support(y))
Lift (battery → sunscreen) = support (battery, sunscreen) / (support (battery) * support (sunscreen))
= 0.15/(0.6*0.5) = 0.5
Lift (sandals → battery) = support (battery, sandals) / (support (battery) * support (sandals)) =
0.1/(0.6*0.4) ≈ 0.42
Lift ({battery, sunscreen}→{sandals}) = support (battery, sunscreen, sandals) / (support (battery,
sunscreen) * support (sandals)) = 0.06/(0.15*0.4) = 1
Lift ({battery, sandals}→{sunscreen}) = support (battery, sunscreen, sandals) / (support (battery,
sandals) * support (sunscreen)) = 0.06/(0.1*0.5) = 1.2
Therefore, {battery, sandals}→{sunscreen} has a stronger association than the other candidate rules.
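
These lift values can be verified the same way (a minimal sketch; lift_xy is an illustrative name):

lift_xy <- function(s_xy, s_x, s_y) s_xy / (s_x * s_y)   # lift(X -> Y)
lift_xy(0.15, 0.6, 0.5)    # battery -> sunscreen            = 0.5
lift_xy(0.10, 0.6, 0.4)    # sandals -> battery              ≈ 0.42
lift_xy(0.06, 0.15, 0.4)   # {battery, sunscreen} -> sandals = 1.0
lift_xy(0.06, 0.1, 0.5)    # {battery, sandals} -> sunscreen = 1.2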
Leverage:
Leverage(x → y) = support(x, y) - (support(x) * support(y))
Leverage (battery → sunscreen) = support (battery, sunscreen) - (support (battery) * support
(sunscreen)) = 0.15 - (0.6*0.5) = -0.15
Leverage (sandals → battery) = support (battery, sandals) - (support (battery) * support (sandals)) =
0.1 - (0.6*0.4) = -0.14
Leverage (battery → bowls) = support (battery, bowls) - (support (battery) * support (bowls)) = 0.025
- (0.6*0.2) = -0.095
Leverage ({battery, sunscreen}→{sandals}) = support (battery, sunscreen, sandals) - (support
(battery, sunscreen) * support (sandals)) = 0.06 - (0.15*0.4) = 0
Leverage ({battery, sandals}→{sunscreen}) = support (battery, sunscreen, sandals) - (support
(battery, sandals) * support (sunscreen)) = 0.06 - (0.1*0.5) = 0.01
This again confirms that {battery, sandals}→{sunscreen} has a stronger association than the other
candidate rules.
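
The leverage values can be checked with the same style of helper (a minimal sketch; leverage_xy is an illustrative name):

leverage_xy <- function(s_xy, s_x, s_y) s_xy - (s_x * s_y)   # leverage(X -> Y)
leverage_xy(0.15, 0.6, 0.5)    # battery -> sunscreen            = -0.15
leverage_xy(0.10, 0.6, 0.4)    # sandals -> battery              = -0.14
leverage_xy(0.025, 0.6, 0.2)   # battery -> bowls                = -0.095
leverage_xy(0.06, 0.15, 0.4)   # {battery, sunscreen} -> sandals =  0
leverage_xy(0.06, 0.1, 0.5)    # {battery, sandals} -> sunscreen =  0.01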
Based on the lift and leverage of the candidate rules, we can conclude that
{battery, sandals}→{sunscreen} is the most useful (that is, least coincidental) rule.
Important Notes: Confidence is able to identify trustworthy rules, but it cannot tell whether a rule
is coincidental. A high-confidence rule can sometimes be misleading because confidence does not
consider support of the itemset in the rule consequent. Measures such as lift and leverage not only
ensure interesting rules are identified but also filter out the coincidental rules.
-----------------------*************------------------------------
Given the following 10 grocery store transactions, use appropriate association rule thresholds to
find a few interesting rules both by hand and by using R.

1. beer, diapers
2. soda, potato chips, hamburger meat, milk, eggs
3. coffee, eggs
4. beer, bread, cheese, ham
5. diapers, beer, potato chips
6. cheese, ham, beer
7. ham, cheese, bread, coffee, milk
8. soda, cheese, bread, ham
9. coffee, hamburger meat
10. eggs, diapers, beer
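
By hand, counts can be read directly off the ten baskets. For example, {diapers} appears in transactions 1, 5, and 10, and each of those baskets also contains beer, so support({diapers, beer}) = 3/10 = 0.3, confidence({diapers}→{beer}) = 3/3 = 1.0, and lift = 1.0 / support({beer}) = 1.0 / 0.5 = 2.0. A tiny R check of that arithmetic (a sketch; the variable names are illustrative):

support_diapers_beer <- 3 / 10                       # {diapers, beer} occurs in baskets 1, 5, 10
conf_diapers_beer <- 3 / 3                           # every diapers basket also contains beer
lift_diapers_beer <- conf_diapers_beer / (5 / 10)    # beer occurs in 5 of the 10 baskets; lift = 2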

R Code:
library('arules')
library('arulesViz')
purchases <- c("beer,diapers",
"soda,potato,chips,hamburger,meat,milk,eggs",
"coffee,eggs",
"beer,bread,cheese,ham",
"diapers,beer,potato,chips",
"cheese,ham,beer",
"ham,cheese,bread,coffee,milk",
"soda,cheese,bread,ham",
"coffee,hamburger,meat",
"eggs,diapers,beer")

# write to a basket file


data <- paste(purchases, collapse="\n")
write(data, file = "purchases")
# read transactions from the "purchases" basket file
trans <- read.transactions("purchases", format = "basket", sep=",")
inspect(trans)
summary(trans)
# frequent 2-itemsets
items2 <- apriori(trans, parameter=list(minlen=2, maxlen=2, support=0.3))
summary(items2)
inspect(sort(items2, by ="support"))
# frequent 3-itemsets
items3 <- apriori(trans, parameter=list(minlen=3, maxlen=3, support=0.3))
summary(items3)
inspect(sort(items3, by ="support"))
# frequent 4-itemsets (none at support 0.3)
items4 <- apriori(trans, parameter=list(minlen=4, maxlen=4, support=0.3))
summary(items4)
# generate rules with the default minimum confidence (0.8)
rules <- apriori(trans, parameter=list(minlen=2, support=0.3))
summary(rules)
inspect(rules)
# regenerate rules with a lower minimum confidence (0.3)
rules <- apriori(trans, parameter=list(minlen=2, support=0.3, confidence=0.3,
                 target = "rules"))
summary(rules)
inspect(rules)
# scatter plot of the rules and of their quality measures
plot(rules)
plot(rules@quality)

# keep rules with confidence above 0.3 and show them in a matrix plot
confidentRules <- rules[quality(rules)$confidence > 0.3]
inspect(confidentRules)
plot(confidentRules, method="matrix", control=list(reorder=TRUE))
# display the rules with the top lift scores
inspect(head(sort(rules, by="lift"), 10))
# graph the 5 rules with the highest confidence
highConfidenceRules <- head(sort(rules, by="confidence"), 5)
plot(highConfidenceRules, method="graph", control=list(type="items"))
# graph the 5 rules with the highest lift
highLiftRules <- head(sort(rules, by="lift"), 5)
plot(highLiftRules, method="graph", control=list(type="items"))
# plot parallel coordinates of the candidate rules
plot(rules, method="paracoord", control=list(reorder=TRUE))

Console Output

# HW6: Extra Exercise


# CS816 Big Data Analytics

#################
# Extra Exercise
#################

library('arules')

## Loading required package: Matrix


##
## Attaching package: 'arules'
##
## The following objects are masked from 'package:base':
##
## %in%, abbreviate, write
library('arulesViz')

## Loading required package: grid


##
## Attaching package: 'arulesViz'
##
## The following object is masked from 'package:arules':
##
## abbreviate
##
## The following object is masked from 'package:base':
##
## abbreviate

## create the dataset file using basket format


purchases <- c("beer,diapers",
"soda,potato,chips,hamburger,meat,milk,eggs",
"coffee,eggs",
"beer,bread,cheese,ham",
"diapers,beer,potato,chips",
"cheese,ham,beer",
"ham,cheese,bread,coffee,milk",
"soda,cheese,bread,ham",
"coffee,hamburger,meat",
"eggs,diapers,beer")

# write to a basket file


data <- paste(purchases, collapse="\n")
write(data, file = "purchases")

# read transactions from the "purchases" basket file


trans <- read.transactions("purchases", format = "basket", sep=",")
inspect(trans)

## items
## 1 {beer,diapers}
## 2 {chips,eggs,hamburger,meat,milk,potato,soda}
## 3 {coffee,eggs}
## 4 {beer,bread,cheese,ham}
## 5 {beer,chips,diapers,potato}
## 6 {beer,cheese,ham}
## 7 {bread,cheese,coffee,ham,milk}
## 8 {bread,cheese,ham,soda}
## 9 {coffee,hamburger,meat}
## 10 {beer,diapers,eggs}

summary(trans)

## transactions as itemMatrix in sparse format with


## 10 rows (elements/itemsets/transactions) and
## 13 columns (items) and a density of 0.2846154
##
## most frequent items:
## beer cheese ham bread coffee (Other)
## 5 4 4 3 3 18
##
## element (itemset/transaction) length distribution:
## sizes
## 2 3 4 5 7
## 2 3 3 1 1
##
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 2.0 3.0 3.5 3.7 4.0 7.0
##
## includes extended item information - examples:
## labels
## 1 beer
## 2 bread
## 3 cheese

# apply apriori on the itemsets in the transactions

# frequent 2-itemsets
items2 <- apriori(trans, parameter=list(minlen=2, maxlen=2, support=0.3))

##
## Parameter specification:
## confidence minval smax arem aval originalSupport support minlen maxlen
## 0.8 0.1 1 none FALSE TRUE 0.3 2 2
## target ext
## rules FALSE
##
## Algorithmic control:
## filter tree heap memopt load sort verbose
## 0.1 TRUE TRUE FALSE TRUE 2 TRUE
##
## apriori - find association rules with the apriori algorithm
## version 4.21 (2004.05.09) (c) 1996-2004 Christian Borgelt
## set item appearances ...[0 item(s)] done [0.00s].
## set transactions ...[13 item(s), 10 transaction(s)] done [0.00s].
## sorting and recoding items ... [7 item(s)] done [0.00s].
## creating transaction tree ... done [0.00s].
## checking subsets of size 1 2 done [0.00s].
## writing ... [5 rule(s)] done [0.00s].
## creating S4 object ... done [0.00s].

summary(items2)

## set of 5 rules
##
## rule length distribution (lhs + rhs):sizes
## 2
## 5
##
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 2 2 2 2 2 2
##
## summary of quality measures:
## support confidence lift
## Min. :0.30 Min. :1 Min. :2.0
## 1st Qu.:0.30 1st Qu.:1 1st Qu.:2.5
## Median :0.30 Median :1 Median :2.5
## Mean :0.34 Mean :1 Mean :2.4
## 3rd Qu.:0.40 3rd Qu.:1 3rd Qu.:2.5
## Max. :0.40 Max. :1 Max. :2.5
##
## mining info:
## data ntransactions support confidence
## trans 10 0.3 0.8

inspect(sort(items2, by ="support"))

## lhs rhs support confidence lift


## 4 {cheese} => {ham} 0.4 1 2.5
## 5 {ham} => {cheese} 0.4 1 2.5
## 1 {diapers} => {beer} 0.3 1 2.0
## 2 {bread} => {cheese} 0.3 1 2.5
## 3 {bread} => {ham} 0.3 1 2.5

# frequent 3-itemsets
items3 <- apriori(trans, parameter=list(minlen=3, maxlen=3, support=0.3))

##
## Parameter specification:
## confidence minval smax arem aval originalSupport support minlen maxlen
## 0.8 0.1 1 none FALSE TRUE 0.3 3 3
## target ext
## rules FALSE
##
## Algorithmic control:
## filter tree heap memopt load sort verbose
## 0.1 TRUE TRUE FALSE TRUE 2 TRUE
##
## apriori - find association rules with the apriori algorithm
## version 4.21 (2004.05.09) (c) 1996-2004 Christian Borgelt
## set item appearances ...[0 item(s)] done [0.00s].
## set transactions ...[13 item(s), 10 transaction(s)] done [0.00s].
## sorting and recoding items ... [7 item(s)] done [0.00s].
## creating transaction tree ... done [0.00s].
## checking subsets of size 1 2 3 done [0.00s].
## writing ... [2 rule(s)] done [0.00s].
## creating S4 object ... done [0.00s].
summary(items3)

## set of 2 rules
##
## rule length distribution (lhs + rhs):sizes
## 3
## 2
##
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 3 3 3 3 3 3
##
## summary of quality measures:
## support confidence lift
## Min. :0.3 Min. :1 Min. :2.5
## 1st Qu.:0.3 1st Qu.:1 1st Qu.:2.5
## Median :0.3 Median :1 Median :2.5
## Mean :0.3 Mean :1 Mean :2.5
## 3rd Qu.:0.3 3rd Qu.:1 3rd Qu.:2.5
## Max. :0.3 Max. :1 Max. :2.5
##
## mining info:
## data ntransactions support confidence
## trans 10 0.3 0.8

inspect(sort(items3, by ="support"))

## lhs rhs support confidence lift


## 1 {bread,cheese} => {ham} 0.3 1 2.5
## 2 {bread,ham} => {cheese} 0.3 1 2.5

# frequent 4-itemsets
items4 <- apriori(trans, parameter=list(minlen=4, maxlen=4, support=0.3))

##
## Parameter specification:
## confidence minval smax arem aval originalSupport support minlen maxlen
## 0.8 0.1 1 none FALSE TRUE 0.3 4 4
## target ext
## rules FALSE
##
## Algorithmic control:
## filter tree heap memopt load sort verbose
## 0.1 TRUE TRUE FALSE TRUE 2 TRUE
##
## apriori - find association rules with the apriori algorithm
## version 4.21 (2004.05.09) (c) 1996-2004 Christian Borgelt
## set item appearances ...[0 item(s)] done [0.00s].
## set transactions ...[13 item(s), 10 transaction(s)] done [0.00s].
## sorting and recoding items ... [7 item(s)] done [0.00s].
## creating transaction tree ... done [0.00s].
## checking subsets of size 1 2 3 done [0.00s].
## writing ... [0 rule(s)] done [0.00s].
## creating S4 object ... done [0.00s].

summary(items4)

## set of 0 rules

##############################
# Generate and Visualize Rules
##############################

# run Apriori without max (7 rules 100% confidence)


rules <- apriori(trans, parameter=list(minlen=2, support=0.3))

##
## Parameter specification:
## confidence minval smax arem aval originalSupport support minlen maxlen
## 0.8 0.1 1 none FALSE TRUE 0.3 2 10
## target ext
## rules FALSE
##
## Algorithmic control:
## filter tree heap memopt load sort verbose
## 0.1 TRUE TRUE FALSE TRUE 2 TRUE
##
## apriori - find association rules with the apriori algorithm
## version 4.21 (2004.05.09) (c) 1996-2004 Christian Borgelt
## set item appearances ...[0 item(s)] done [0.00s].
## set transactions ...[13 item(s), 10 transaction(s)] done [0.00s].
## sorting and recoding items ... [7 item(s)] done [0.00s].
## creating transaction tree ... done [0.00s].
## checking subsets of size 1 2 3 done [0.00s].
## writing ... [7 rule(s)] done [0.00s].
## creating S4 object ... done [0.00s].

summary(rules)

## set of 7 rules
##
## rule length distribution (lhs + rhs):sizes
## 2 3
## 5 2
##
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 2.000 2.000 2.000 2.286 2.500 3.000
##
## summary of quality measures:
## support confidence lift
## Min. :0.3000 Min. :1 Min. :2.000
## 1st Qu.:0.3000 1st Qu.:1 1st Qu.:2.500
## Median :0.3000 Median :1 Median :2.500
## Mean :0.3286 Mean :1 Mean :2.429
## 3rd Qu.:0.3500 3rd Qu.:1 3rd Qu.:2.500
## Max. :0.4000 Max. :1 Max. :2.500
##
## mining info:
## data ntransactions support confidence
## trans 10 0.3 0.8

inspect(rules)

## lhs rhs support confidence lift


## 1 {diapers} => {beer} 0.3 1 2.0
## 2 {bread} => {cheese} 0.3 1 2.5
## 3 {bread} => {ham} 0.3 1 2.5
## 4 {cheese} => {ham} 0.4 1 2.5
## 5 {ham} => {cheese} 0.4 1 2.5
## 6 {bread,cheese} => {ham} 0.3 1 2.5
## 7 {bread,ham} => {cheese} 0.3 1 2.5

# (11 rules with 30% confidence)


rules <- apriori(trans, parameter=list(minlen=2, support=0.3, confidence=0.3,
target = "rules"))

##
## Parameter specification:
## confidence minval smax arem aval originalSupport support minlen maxlen
## 0.3 0.1 1 none FALSE TRUE 0.3 2 10
## target ext
## rules FALSE
##
## Algorithmic control:
## filter tree heap memopt load sort verbose
## 0.1 TRUE TRUE FALSE TRUE 2 TRUE
##
## apriori - find association rules with the apriori algorithm
## version 4.21 (2004.05.09) (c) 1996-2004 Christian Borgelt
## set item appearances ...[0 item(s)] done [0.00s].
## set transactions ...[13 item(s), 10 transaction(s)] done [0.00s].
## sorting and recoding items ... [7 item(s)] done [0.00s].
## creating transaction tree ... done [0.00s].
## checking subsets of size 1 2 3 done [0.00s].
## writing ... [11 rule(s)] done [0.00s].
## creating S4 object ... done [0.00s].

summary(rules)

## set of 11 rules
##
## rule length distribution (lhs + rhs):sizes
## 2 3
## 8 3
##
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 2.000 2.000 2.000 2.273 2.500 3.000
##
## summary of quality measures:
## support confidence lift
## Min. :0.3000 Min. :0.6000 Min. :2.000
## 1st Qu.:0.3000 1st Qu.:0.7500 1st Qu.:2.500
## Median :0.3000 Median :1.0000 Median :2.500
## Mean :0.3182 Mean :0.8955 Mean :2.409
## 3rd Qu.:0.3000 3rd Qu.:1.0000 3rd Qu.:2.500
## Max. :0.4000 Max. :1.0000 Max. :2.500
##
## mining info:
## data ntransactions support confidence
## trans 10 0.3 0.3

inspect(rules)

## lhs rhs support confidence lift


## 1 {diapers} => {beer} 0.3 1.00 2.0
## 2 {beer} => {diapers} 0.3 0.60 2.0
## 3 {bread} => {cheese} 0.3 1.00 2.5
## 4 {cheese} => {bread} 0.3 0.75 2.5
## 5 {bread} => {ham} 0.3 1.00 2.5
## 6 {ham} => {bread} 0.3 0.75 2.5
## 7 {cheese} => {ham} 0.4 1.00 2.5
## 8 {ham} => {cheese} 0.4 1.00 2.5
## 9 {bread,cheese} => {ham} 0.3 1.00 2.5
## 10 {bread,ham} => {cheese} 0.3 1.00 2.5
## 11 {cheese,ham} => {bread} 0.3 0.75 2.5

# visualization of the selected rules


plot(rules)
plot(rules@quality)

# 11 rules matrix
confidentRules <- rules[quality(rules)$confidence > 0.3]
inspect(confidentRules)

## lhs rhs support confidence lift


## 1 {diapers} => {beer} 0.3 1.00 2.0
## 2 {beer} => {diapers} 0.3 0.60 2.0
## 3 {bread} => {cheese} 0.3 1.00 2.5
## 4 {cheese} => {bread} 0.3 0.75 2.5
## 5 {bread} => {ham} 0.3 1.00 2.5
## 6 {ham} => {bread} 0.3 0.75 2.5
## 7 {cheese} => {ham} 0.4 1.00 2.5
## 8 {ham} => {cheese} 0.4 1.00 2.5
## 9 {bread,cheese} => {ham} 0.3 1.00 2.5
## 10 {bread,ham} => {cheese} 0.3 1.00 2.5
## 11 {cheese,ham} => {bread} 0.3 0.75 2.5

plot(confidentRules, method="matrix", control=list(reorder=TRUE))

## Itemsets in Antecedent (LHS)


## [1] "{cheese}" "{cheese,ham}" "{ham}" "{bread,ham}"
## [5] "{bread}" "{bread,cheese}" "{diapers}" "{beer}"
## Itemsets in Consequent (RHS)
## [1] "{cheese}" "{beer}" "{diapers}" "{bread}" "{ham}"
# displays rules with top lift scores
inspect(head(sort(rules, by="lift"), 10))

## lhs rhs support confidence lift


## 3 {bread} => {cheese} 0.3 1.00 2.5
## 4 {cheese} => {bread} 0.3 0.75 2.5
## 5 {bread} => {ham} 0.3 1.00 2.5
## 6 {ham} => {bread} 0.3 0.75 2.5
## 7 {cheese} => {ham} 0.4 1.00 2.5
## 8 {ham} => {cheese} 0.4 1.00 2.5
## 9 {bread,cheese} => {ham} 0.3 1.00 2.5
## 10 {bread,ham} => {cheese} 0.3 1.00 2.5
## 11 {cheese,ham} => {bread} 0.3 0.75 2.5
## 1 {diapers} => {beer} 0.3 1.00 2.0

# graph the 5 rules with the highest CONFIDENCE


highConfidenceRules <- head(sort(rules, by="confidence"), 5)
plot(highConfidenceRules, method="graph", control=list(type="items"))
# graph the 5 rules with the highest LIFT
highLiftRules <- head(sort(rules, by="lift"), 5)
plot(highLiftRules, method="graph", control=list(type="items"))

# plot parallel coordinates of the candidate rules


plot(rules, method="paracoord", control=list(reorder=TRUE))

# references
# http://www.rdatamining.com/examples/association-rules
# http://statistical-research.com/data-frames-and-transactions/
