Introduction to Machine Learning
Third Edition
Ethem Alpaydın
All rights reserved. No part of this book may be reproduced in any form by any
electronic or mechanical means (including photocopying, recording, or informa-
tion storage and retrieval) without permission in writing from the publisher.
Alpaydin, Ethem.
Introduction to machine learning / Ethem Alpaydin—3rd ed.
p. cm.
Includes bibliographical references and index.
ISBN 978-0-262-02818-9 (hardcover : alk. paper)
1. Machine learning. I. Title
Q325.5.A46 2014
006.3’1—dc23 2014007214
CIP
10 9 8 7 6 5 4 3 2 1
Brief Contents
1 Introduction 1
2 Supervised Learning 21
3 Bayesian Decision Theory 49
4 Parametric Methods 65
5 Multivariate Methods 93
6 Dimensionality Reduction 115
7 Clustering 161
8 Nonparametric Methods 185
9 Decision Trees 213
10 Linear Discrimination 239
11 Multilayer Perceptrons 267
12 Local Models 317
13 Kernel Machines 349
14 Graphical Models 387
15 Hidden Markov Models 417
16 Bayesian Estimation 445
17 Combining Multiple Learners 487
18 Reinforcement Learning 517
19 Design and Analysis of Machine Learning Experiments 547
A Probability 593
Contents
Preface xvii
Notations xxi
1 Introduction 1
1.1 What Is Machine Learning? 1
1.2 Examples of Machine Learning Applications 4
1.2.1 Learning Associations 4
1.2.2 Classification 5
1.2.3 Regression 9
1.2.4 Unsupervised Learning 11
1.2.5 Reinforcement Learning 13
1.3 Notes 14
1.4 Relevant Resources 17
1.5 Exercises 18
1.6 References 20
2 Supervised Learning 21
2.1 Learning a Class from Examples 21
2.2 Vapnik-Chervonenkis Dimension 27
2.3 Probably Approximately Correct Learning 29
2.4 Noise 30
2.5 Learning Multiple Classes 32
2.6 Regression 34
2.7 Model Selection and Generalization 37
2.8 Dimensions of a Supervised Machine Learning Algorithm 41
2.9 Notes 42
2.10 Exercises 43
2.11 References 47
4 Parametric Methods 65
4.1 Introduction 65
4.2 Maximum Likelihood Estimation 66
4.2.1 Bernoulli Density 67
4.2.2 Multinomial Density 68
4.2.3 Gaussian (Normal) Density 68
4.3 Evaluating an Estimator: Bias and Variance 69
4.4 The Bayes’ Estimator 70
4.5 Parametric Classification 73
4.6 Regression 77
4.7 Tuning Model Complexity: Bias/Variance Dilemma 80
4.8 Model Selection Procedures 83
4.9 Notes 87
4.10 Exercises 88
4.11 References 90
5 Multivariate Methods 93
5.1 Multivariate Data 93
5.2 Parameter Estimation 94
5.3 Estimation of Missing Values 95
5.4 Multivariate Normal Distribution 96
5.5 Multivariate Classification 100
5.6 Tuning Complexity 106
5.7 Discrete Features 108
5.8 Multivariate Regression 109
5.9 Notes 111
5.10 Exercises 112
7 Clustering 161
7.1 Introduction 161
7.2 Mixture Densities 162
7.3 k-Means Clustering 163
7.4 Expectation-Maximization Algorithm 167
7.5 Mixtures of Latent Variable Models 172
7.6 Supervised Learning after Clustering 173
7.7 Spectral Clustering 175
7.8 Hierarchical Clustering 176
7.9 Choosing the Number of Clusters 178
7.10 Notes 179
7.11 Exercises 180
7.12 References 182
A Probability 593
A.1 Elements of Probability 593
A.1.1 Axioms of Probability 594
A.1.2 Conditional Probability 594
A.2 Random Variables 595
A.2.1 Probability Distribution and Density Functions 595
A.2.2 Joint Distribution and Density Functions 596
A.2.3 Conditional Distributions 596
A.2.4 Bayes’ Rule 597
A.2.5 Expectation 597
A.2.6 Variance 598
A.2.7 Weak Law of Large Numbers 599
A.3 Special Random Variables 599
A.3.1 Bernoulli Distribution 599
A.3.2 Binomial Distribution 600
A.3.3 Multinomial Distribution 600
A.3.4 Uniform Distribution 600
A.3.5 Normal (Gaussian) Distribution 601
A.3.6 Chi-Square Distribution 602
A.3.7 t Distribution 603
A.3.8 F Distribution 603
A.4 References 603
Index 605