MachineLearning Ch1 Introd v1

Machine Learning
Chapter 1: Introduction
Prof. Antonio Muñoz
Prof. José Portela
2018
Contents
1. Course syllabus and schedule
2. Data Mining & Machine Learning
3. The learning process
4. Types of learning
5. Resources
1. Course syllabus and schedule
Organization
• Time: 2 sessions per week, 3pm to 5pm
• Place: Room 401
• Dates: 7th Nov ’17 – 1st Feb ’18
• Instructors:
  • Antonio Muñoz, amunoz@comillas.edu, https://www.iit.upcomillas.es/people/amunoz
  • José Portela, jportela@comillas.edu, https://www.iit.upcomillas.es/people/jportela
  • Guillermo Mestre, gmestre@comillas.edu, https://www.iit.upcomillas.es/people/gmestre
Course description and grading
• Course description: The objective of this introductory course on Machine Learning is to give students a fundamental understanding of, and extensive practical experience in, extracting knowledge from an apparently unstructured set of data.
• Course goals:
  • Understand the basic principles behind Machine Learning.
  • Gain practical experience with the most relevant Machine Learning algorithms.
  • Develop sound criteria for choosing the most appropriate technique for a given application.
• Grading:
  • 15% Mid-term exams
  • 35% Final exam (≥4)
  • 50% Lab
Schedule
Session | Date | Hours | Topic | Theory | Lab | Assessment
1 | 09-jan | 2 | Introduction I | Introduction to Machine Learning | Lab Practice 1.1: Introduction to R for Machine Learning | -
2 | 11-jan | 2 | Introduction II | - | Lab Practice 1.2: Introduction to R for Machine Learning | -
3 | 16-jan | 2 | Classification I | The classification problem. Logistic regression. | - | -
4 | 18-jan | 2 | Classification II | - | Lab Practice 2.1 | -
5 | 23-jan | 2 | Classification III | Discriminant analysis. KNN | Lab Practice 2.2 | -
6 | 25-jan | 2 | Classification IV | Decision trees | Lab Practice 2.3 | -
7 | 30-jan | 2 | Classification V | SVM | Lab Practice 2.4 | Assignment 1
8 | 01-feb | 2 | Classification VI | MLP | Lab Practice 2.5 | -
9 | 06-feb | 2 | Classification VII | MLP | Lab Practice 2.6 (hackathon) | -
10 | 08-feb | 2 | Regression I | The regression problem. Linear Regression. | Lab Practice 3.1 | Assignment 2
11 | 13-feb | 2 | Regression II | Model selection and Regularization | Lab Practice 3.2 | -
12 | 15-feb | 2 | Regression III | Polynomial Regression, Splines, GAMs | Lab Practice 3.3 | -
13 | 20-feb | 2 | Regression IV | MLP, SVM. Synthetic example generated from market data to show the effect of nonlinearities | Lab Practice 3.4 | -
14 | 22-feb | 2 | Regression V | - | Lab Practice 3.5 (assignment) | Assignment 3
15 | 27-feb | 2 | Mid-term exam I | - | - | Mid-term exam
16 | 01-mar | 2 | Forecasting I | Stochastic Processes. Decomposition methods | Lab Practice 4.1 | -
17 | 06-mar | 2 | Forecasting II | Exponential Smoothing | Lab Practice 4.2 | -
18 | 08-mar | 2 | Forecasting III | ARMA | Lab Practice 4.3 | -
19 | 13-mar | 2 | Forecasting IV | ARIMA | Lab Practice 4.4 | -
20 | 15-mar | 2 | Forecasting V | SARIMA | Lab Practice 4.5 | -
21 | 20-mar | 2 | Forecasting VI | Dynamic Regression models I | Lab Practice 4.6 | Assignment 4
22 | 22-mar | 2 | Forecasting VII | Dynamic Regression models II | Lab Practice 4.7 | -
23 | 03-apr | 2 | Density estimation I | Parametric & Non-parametric methods | Lab Practice 5.1 | Assignment 5
24 | 05-apr | 2 | Density estimation II | NN for density estimation | Lab Practice 5.2 | -
25 | 10-apr | 2 | Dimensionality reduction | PCA. ICA. | Lab Practice 5.3 | -
26 | 12-apr | 2 | Clustering I | Hierarchical & partitional clustering | Lab Practice 5.4 | -
27 | 17-apr | 2 | Clustering II | Vector Quantization. Neural Gas | Lab Practice 5.5 | -
28 | 19-apr | 2 | Self Organising Maps | SOM | Lab Practice 5.6 | -
29 | 24-apr | 2 | Course summary | - | - | Assignment 6
30 | 26-apr | 2 | Final exam | - | - | Final exam
2. Data Mining & Machine Learning
Motivation for DM & ML
• We are in a data-rich but information-poor situation.
• In general, decision makers do not have the tools to extract the valuable knowledge embedded in vast amounts of data.
Harvard article 1 Harvard article 2 TED Talk

The evolution of database system technology
From (Han et al., 2012)
Definitions
• Data Mining:
  • Knowledge discovery from data (KDD).
  • Discovering patterns and associations in large data sets.
  • Turning data into information.
  • Uncovering valuable information from vast amounts of data and transforming that data into organized knowledge.
• Machine Learning:
  • The field of study that gives computers the ability to learn without being explicitly programmed.
  • Making computers modify or adapt their actions so that these actions become more accurate.
  • A machine is said to learn if it can take experience and use it so that its performance improves on similar experiences in the future.
Machine learning is sometimes conflated with data mining; the latter focuses more on exploratory data analysis and is closely related to unsupervised learning.
3. The learning process
The process
Data → Abstraction → Generalization
Data
• The input data is the main source of knowledge.
• Its quality determines the quality of the final system.
• Learning from data requires observation, memory storage and recall.
Abstraction
• Abstraction is the translation of data into broader representations.
• During the abstraction process, we assign meaning to data by representing knowledge using some kind of model:
  • Equations
  • Diagrams such as trees and graphs
  • Logical if/else rules
  • Groupings of data known as clusters
• The process of fitting a particular model to a dataset is known as training.
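As a minimal sketch of what training means in practice, the Python snippet below fits the simplest kind of "equation" model, a straight line, to a small dataset by ordinary least squares. The data values and the function name fit_line are made up for illustration and are not part of the course material.

```python
# Toy dataset (hypothetical values): y is roughly 2*x + 1 plus some noise.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.1, 2.9, 5.2, 7.1, 8.8]

def fit_line(xs, ys):
    """Train the model y = slope * x + intercept by ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Covariance-like and variance-like sums give the closed-form solution.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# "Training" turns the raw data into two numbers that summarize it.
slope, intercept = fit_line(xs, ys)
print(f"y = {slope:.2f} * x + {intercept:.2f}")
```

The fitted slope and intercept are the abstraction: five data points are replaced by a two-parameter representation that can be applied to new inputs.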
Generalization
• Generalization uses abstracted data to form a basis for action.
• A model is said to generalize if it produces correct outputs for cases not included in the training dataset.
• Measuring the generalization capabilities of a model is an essential task.
• Our final objective is to be able to generalize from a finite set of data.
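One common way to measure generalization is to hold out part of the data and compute the error on cases the model never saw during training. The sketch below assumes a synthetic dataset and a simple line model; the names fit_line and mean_squared_error are illustrative helpers, not functions from any library.

```python
import random

random.seed(0)  # fixed seed so the sketch is reproducible

# Synthetic data (an assumption for illustration): y = 3x + 2 plus Gaussian noise.
data = [(float(x), 3.0 * x + 2.0 + random.gauss(0.0, 1.0)) for x in range(40)]
random.shuffle(data)
train, test = data[:30], data[30:]  # hold out 10 cases the model never sees

def fit_line(points):
    """Least-squares fit of y = slope * x + intercept."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def mean_squared_error(points, slope, intercept):
    """Average squared difference between observed and predicted outputs."""
    return sum((y - (slope * x + intercept)) ** 2 for x, y in points) / len(points)

slope, intercept = fit_line(train)            # training uses only the training set
train_mse = mean_squared_error(train, slope, intercept)
test_mse = mean_squared_error(test, slope, intercept)  # estimate of generalization error
print(f"train MSE = {train_mse:.3f}, test MSE = {test_mse:.3f}")
```

The test error, not the training error, is the figure that estimates how the model will behave on new data.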
Steps to apply ML to your data
• Collecting data: several sources of information have to be integrated. Data warehouses have become a very common data repository architecture.
• Cleaning, exploring and preparing the data: the quality of any ML project depends largely on the quality of the data it uses. As a rule of thumb, about 80% of the effort in ML is devoted to data.
• Identifying and training a model.
• Evaluating model performance: the generalization capabilities of the model have to be estimated.
• Improving model performance.
From (Han et al., 2012)
4. Types of learning
Types of Machine Learning
Introduction
• We can distinguish 3 types of learning:
  • Supervised learning
  • Unsupervised learning
  • Reinforcement learning
• In addition, Machine Learning II will cover a special type of learning that is treated separately:
  • Evolutionary learning
Supervised learning
• The aim of Supervised Learning is to learn an input-output mapping from a “labelled” dataset.
• Applications:
‐ Classification
‐ Regression
‐ Forecasting
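To make the idea of learning an input-output mapping from labelled data concrete, here is a minimal k-nearest-neighbours classifier in Python. The training points, the labels "A" and "B", and the function name predict are hypothetical; KNN is used only because it is one of the simplest classifiers to write down.

```python
import math

# Labelled training set (hypothetical): each item is (input features, output label).
train = [
    ((1.0, 1.0), "A"),
    ((1.5, 2.0), "A"),
    ((5.0, 5.0), "B"),
    ((6.0, 5.5), "B"),
]

def predict(x, train, k=3):
    """Return the majority label among the k training points closest to x."""
    neighbours = sorted(train, key=lambda item: math.dist(item[0], x))[:k]
    labels = [label for _, label in neighbours]
    return max(set(labels), key=labels.count)

# The learned mapping can now be applied to inputs that carry no label.
print(predict((1.2, 1.5), train))
print(predict((5.5, 5.0), train))
```

With an odd k and two classes there are no ties, which keeps the majority vote well defined in this small example.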
Supervised learning: Classification
Supervised learning: Regression
Supervised learning: Forecasting
[Figure: Electricity Price in the Spanish Day-ahead Market, Price (EUR/MWh) versus date. Two panels: 17/12/14 to 30/04/16 and 16/03/16 to 30/04/16.]
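Forecasting fits the supervised setting once a time series is recast as input-output pairs: the last few observations are the input and the next observation is the output. The sketch below shows that transformation; the price values and the helper name make_lagged_pairs are invented for illustration and are not taken from the market data in the figure.

```python
# Hypothetical hourly prices in EUR/MWh (made-up values, not real market data).
prices = [42.0, 45.5, 47.1, 44.3, 40.8, 43.2, 46.0]

def make_lagged_pairs(series, n_lags):
    """Turn a series into (previous n_lags values, next value) training pairs."""
    return [
        (series[i - n_lags:i], series[i])
        for i in range(n_lags, len(series))
    ]

pairs = make_lagged_pairs(prices, n_lags=2)
for inputs, target in pairs:
    print(inputs, "->", target)
```

Any regression model can then be trained on these pairs, which is why forecasting is listed among the supervised applications.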
Unsupervised learning
• The aim is to find the regularities in the input data by discovering patterns.
• We want to characterize what generally happens and what does not.
• Applications:
‐ Density estimation
‐ Clustering
‐ Vector Quantization
‐ Dimensionality reduction
Unsupervised learning: Density estimation
[Figure: Estimation of pdf(X) with the PRBFNII model. Panels show X2 versus X1 and log10(p(X1,X2)) over X1 and X2.]
Unsupervised learning: Clustering
[Figure: Clustering example, X2 versus X1.]
Source: http://www.slideshare.net/kasunrangawijeweera/k‐means‐clustering‐algorithm
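The k-means algorithm referenced above alternates two steps: assign each point to its nearest centroid, then move each centroid to the mean of its assigned points. The following is a minimal sketch under stated assumptions: the points are invented, the initial centroids are picked by hand, and k_means is an illustrative function rather than a library call.

```python
import math

# Hypothetical 2-D points forming two visible groups.
points = [(1.0, 1.0), (1.5, 2.0), (1.0, 1.5), (8.0, 8.0), (9.0, 8.5), (8.5, 9.0)]

def k_means(points, centroids, n_iter=10):
    """Plain k-means: alternate assignment and centroid update for n_iter rounds."""
    for _ in range(n_iter):
        # Assignment step: each point joins the cluster of its nearest centroid.
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)),
                          key=lambda j: math.dist(p, centroids[j]))
            clusters[nearest].append(p)
        # Update step: each centroid moves to the mean of its cluster
        # (an empty cluster keeps its previous centroid).
        centroids = [
            tuple(sum(coord) / len(cluster) for coord in zip(*cluster))
            if cluster else centroids[j]
            for j, cluster in enumerate(clusters)
        ]
    return centroids, clusters

# Initial centroids chosen by hand for reproducibility (an assumption).
centroids, clusters = k_means(points, centroids=[points[0], points[3]])
print(centroids)
```

No labels are used anywhere: the groups emerge from the distances between inputs alone, which is what makes this an unsupervised method.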
Unsupervised learning: Vector Quantization
[Figure: Vector Quantization example, X2 versus X1. The plot carries the title "Estimation of pdf(X) with the PRBFNII model".]
Reinforcement learning
• In Reinforcement Learning the learner is a decision-making agent that takes actions in an environment and receives rewards (or penalties) for its actions while trying to solve a problem.
• After a set of trial-and-error runs, it should learn the best policy, which is the sequence of actions that maximizes the total reward.
• “Learning with a critic” vs “Learning with a teacher”.
• Applications:
  • Robot control
  • Games: chess, backgammon, checkers, …
  • Other activities that a software agent can learn
Video 1, Video 2
Figure caption: The agent interacts with an environment. At any state of the environment, the agent takes an action that changes the state and returns a reward.
From (Alpaydin, 2014)
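The trial-and-error loop can be sketched with tabular Q-learning, one standard reinforcement learning algorithm (named here as an illustration; the slides do not prescribe a specific method). The environment is a hypothetical five-state corridor where only reaching the last state gives a reward, and all constants are arbitrary choices.

```python
import random

random.seed(0)  # fixed seed so the sketch is reproducible

N_STATES = 5            # hypothetical corridor: states 0..4, state 4 is the goal
ACTIONS = (-1, +1)      # move left or move right
ALPHA, GAMMA, EPSILON = 0.5, 0.9, 0.2  # learning rate, discount, exploration rate

def step(state, action):
    """Environment: apply the action, return the new state and the reward."""
    next_state = min(max(state + action, 0), N_STATES - 1)
    reward = 1.0 if next_state == N_STATES - 1 else 0.0
    return next_state, reward

q = {(s, a): 0.0 for s in range(N_STATES) for a in ACTIONS}

def greedy(state):
    """Best known action in this state, breaking ties at random."""
    best = max(q[(state, a)] for a in ACTIONS)
    return random.choice([a for a in ACTIONS if q[(state, a)] == best])

for _ in range(200):                      # trial-and-error runs (episodes)
    state = 0
    while state != N_STATES - 1:
        explore = random.random() < EPSILON
        action = random.choice(ACTIONS) if explore else greedy(state)
        next_state, reward = step(state, action)
        # Q-learning update: move the estimate towards reward + discounted future value.
        best_next = max(q[(next_state, a)] for a in ACTIONS)
        q[(state, action)] += ALPHA * (reward + GAMMA * best_next - q[(state, action)])
        state = next_state

# The learned policy maps each non-goal state to its highest-valued action.
policy = {s: max(ACTIONS, key=lambda a: q[(s, a)]) for s in range(N_STATES - 1)}
print(policy)
```

The agent is never told which action is correct; it only receives a reward signal, which is the sense in which it learns "with a critic" rather than "with a teacher".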
Evolutionary learning
• It is the study of computational systems that draw ideas and inspiration from natural evolution, such as reproduction, mutation, recombination, and selection.
• One of the principles borrowed is survival of the fittest.
• Evolutionary computation (EC) techniques can be used in optimization, learning and design.
Video
Evolutionary Programming
[Diagram blocks: Initialization, Duplication, Mutation, Evaluation, Selection, Output]
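The blocks of the diagram can be read as a loop, sketched below for a one-dimensional optimization problem. This is an illustrative assumption about how the blocks connect, not the slide's exact procedure: the fitness function, population size, mutation scale and number of generations are all arbitrary choices.

```python
import random

random.seed(1)  # fixed seed so the sketch is reproducible

def fitness(x):
    """Hypothetical objective to maximize; its optimum is at x = 3."""
    return -(x - 3.0) ** 2

POP_SIZE = 20

# Initialization: a random population of candidate solutions.
population = [random.uniform(-10.0, 10.0) for _ in range(POP_SIZE)]

for _ in range(100):
    # Duplication and mutation: each parent yields one perturbed copy.
    offspring = [x + random.gauss(0.0, 0.5) for x in population]
    # Evaluation: rank parents and offspring together by fitness.
    candidates = sorted(population + offspring, key=fitness, reverse=True)
    # Selection: only the fittest individuals survive to the next generation.
    population = candidates[:POP_SIZE]

# Output: the best individual found.
best = population[0]
print(f"best x = {best:.3f}, fitness = {fitness(best):.5f}")
```

Because parents compete with their offspring, the best fitness in the population can never get worse from one generation to the next.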
5. Resources
Bibliography
• G. James, D. Witten, T. Hastie & R. Tibshirani (2013). An Introduction to Statistical Learning with Applications in R. Springer (see http://www‐bcf.usc.edu/~gareth/ISL/ ).
• M. Kuhn & K. Johnson (2013). Applied Predictive Modeling. Springer.
• T. Hastie, R. Tibshirani & J. Friedman (2009). The Elements of Statistical Learning. Data Mining, Inference and Prediction. 2nd Ed. Springer.
• E. Alpaydin (2014). Introduction to Machine Learning. 3rd Ed. MIT Press.
• S. Marsland (2015). Machine Learning: An Algorithmic Perspective. 2nd Ed. Chapman & Hall/CRC Machine Learning & Pattern Recognition.
• T. Mitchell (1997). Machine Learning. McGraw-Hill.
• R. Duda, P. Hart & D. Stork (2000). Pattern Classification. 2nd Ed. Wiley-Interscience.
• C. Bishop (2007). Pattern Recognition and Machine Learning. Springer.
• S. Haykin (1999). Neural Networks. A comprehensive foundation. 2nd Ed. Pearson.
• W. Wei (2006). Time Series Analysis. Univariate and Multivariate Methods. 2nd Ed. Addison-Wesley.
Links
• Courses:
• https://www.youtube.com/watch?v=UzxYlbK2c7E&list=PLJ_CMbwA6bT‐n1W0mgOlYwccZ‐j6gBXqE
• https://www.youtube.com/watch?v=mbyG85GZ0PI&index=1&list=PLD63A284B7615313A
• Repositories:
• https://www.kaggle.com/
• http://archive.ics.uci.edu/ml/
• http://work.caltech.edu/library/index.html
Alberto Aguilera 23, E-28015 Madrid - Tel: +34 91 542 2800 - http://www.iit.comillas.edu