
A good clustering method will produce high-quality clusters with

Select one:
high inter-class similarity
high intra-class similarity
no inter-class similarity
low intra-class similarity

If an item set ‘XYZ’ is a frequent item set, then all subsets of that frequent item set are

Select one:
Not frequent
Frequent
Undefined
Cannot say

In _________ clustering, points may belong to multiple clusters

Select one:
Partial
Fuzzy
Exclusive
Non-exclusive

This clustering approach initially assumes that each data instance represents a single cluster.

Select one:
conceptual clustering
agglomerative clustering
K-Means clustering
expectation maximization

Which statement about outliers is true?

Select one:
The nature of the problem determines how outliers are used
Outliers should be part of the training dataset but should not be present in the test data.
Outliers should be part of the test dataset but should not be present in the training data.
Outliers should be identified and removed from a dataset.

The apriori property means

Select one:
To decrease the efficiency, do level-wise generation of frequent item sets
If a set cannot pass a test, its supersets will also fail the same test
If a set can pass a test, its supersets will fail the same test
To improve the efficiency, do level-wise generation of frequent item sets
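The property rests on support being anti-monotone: a superset can never appear in more transactions than any of its subsets, so once a set fails the minimum-support test every superset fails it too. A minimal Python sketch (the transactions are invented for illustration):

```python
# Apriori property illustration: support of a superset never exceeds
# the support of any of its subsets.
transactions = [{"A", "B"}, {"A", "C"}, {"B", "C"}, {"A", "B", "C"}]

def support(itemset):
    """Fraction of transactions containing every item in itemset."""
    return sum(itemset <= t for t in transactions) / len(transactions)

print(support({"A", "B"}))       # -> 0.5
print(support({"A", "B", "C"}))  # -> 0.25, never higher than any subset
```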

The average absolute difference between computed and actual outcome values.

Select one:
mean squared error
root mean squared error
mean absolute error
mean positive error

Use the three-class confusion matrix below to answer: what percent of the instances were correctly classified?

        Computed Decision
        C1    C2    C3
C1      10     5     3
C2       5    15     3
C3       2     2     5

Select one:
60
30
50
40
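As a worked check (a minimal Python sketch; the matrix values are copied from the question), overall accuracy is the diagonal sum, the correctly classified instances, divided by the total instance count:

```python
# Confusion matrix from the question: rows are true classes,
# columns are computed decisions.
matrix = [
    [10, 5, 3],   # C1
    [5, 15, 3],   # C2
    [2, 2, 5],    # C3
]
correct = sum(matrix[i][i] for i in range(3))  # diagonal: 10 + 15 + 5 = 30
total = sum(sum(row) for row in matrix)        # all instances: 50
print(round(100 * correct / total))            # -> 60
```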

The correlation coefficient for two real-valued attributes is –0.85. What does this value tell you?

Select one:
The attributes are not linearly related.
The attributes show a linear relationship
As the value of one attribute decreases the value of the second attribute increases.
As the value of one attribute increases the value of the second attribute also increases.
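As a quick illustration (a hand-rolled Pearson formula; the sample values are invented), two attributes that move in opposite directions yield a negative coefficient, which is the situation r = -0.85 describes:

```python
# Pearson correlation coefficient computed from its definition:
# covariance divided by the product of the standard deviations.
def pearson(xs, ys):
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy)

# One attribute rises while the other falls -> strongly negative r.
print(round(pearson([1, 2, 3, 4], [8, 6, 4, 2]), 2))  # -> -1.0
```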

How is hierarchical agglomerative clustering typically visualized?

Select one:
Binary trees
Graph
Dendrogram
Block diagram

The most general form of distance is

Select one:
Minkowski
Euclidean
Manhattan
Mean
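Minkowski distance is the most general of these options because it subsumes the others: order p = 1 gives Manhattan distance and p = 2 gives Euclidean distance. A minimal sketch:

```python
# Minkowski distance of order p between two points.
def minkowski(x, y, p):
    return sum(abs(a - b) ** p for a, b in zip(x, y)) ** (1 / p)

print(minkowski([0, 0], [3, 4], 1))  # p=1, Manhattan  -> 7.0
print(minkowski([0, 0], [3, 4], 2))  # p=2, Euclidean  -> 5.0
```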

This data transformation technique works well when minimum and maximum values for a real-valued
attribute are known.

Select one:
z-score normalization
logarithmic normalization
min-max normalization
decimal scaling
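Min-max normalization maps a value into [0, 1] via (x - min) / (max - min), which is exactly why it requires the attribute's minimum and maximum to be known. A minimal sketch (the sample values are invented):

```python
# Min-max normalization: rescale x into [0, 1] given the attribute's
# known minimum (lo) and maximum (hi).
def min_max(x, lo, hi):
    return (x - lo) / (hi - lo)

print(min_max(70, 20, 120))  # -> 0.5, the midpoint of the range
```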

Which of the following algorithms comes under classification?

Select one:
DBSCAN
K-nearest neighbor
Apriori
Brute force

Which of the following is cluster analysis?

Select one:
Query results grouping
Grouping similar objects
Labeled classification
Simple segmentation

The time complexity of k-means is given by

Select one:
O(tkn)
O(mn)
O(t2kn)
O(kn)

Clustering is ___________ and is an example of ____________ learning

Select one:
Predictive and supervised
Descriptive and supervised
Predictive and unsupervised
Descriptive and unsupervised

Which Association Rule would you prefer?

Select one:
Low support and high confidence
High support and medium confidence
Low support and low confidence
High support and low confidence

_________ is an example of case-based learning

Select one:
K-nearest neighbor
Neural networks
Decision trees
Genetic algorithm

Arbitrary shaped clusters can be found by using

Select one:
Hierarchical methods
Agglomerative
Density methods
Partitional methods

Assume that we have a dataset containing information about 200 individuals. A supervised data mining session has discovered the following rule:

IF age < 30 & credit card insurance = yes THEN life insurance = yes
Rule Accuracy: 70% and Rule Coverage: 63%

How many individuals in the class life insurance = no have credit card insurance and are less than 30
years old?
Select one:
70
38
30
63
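The arithmetic behind this question (a sketch using the figures from the question): coverage gives the number of individuals matching the rule's antecedent, accuracy gives the fraction of those whose class really is yes, and the remainder are the covered individuals with life insurance = no:

```python
# Rule coverage and accuracy arithmetic for the question above.
total = 200
covered = 0.63 * total      # 63% coverage: 126 individuals match the IF part
correct = 0.70 * covered    # 70% accuracy: ~88 of them have life insurance = yes
wrong = covered - correct   # covered but life insurance = no
print(round(wrong))         # -> 38
```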
