
Machine Learning

Chapter 1: Introduction

Prof. Antonio Muñoz
Prof. José Portela
2018

Contents
1. Course syllabus and schedule
2. Data Mining & Machine Learning
3. The learning process
4. Types of learning
5. Resources

1. Course syllabus and schedule

Organization
• Time: 2 sessions per week, 3pm to 5pm
• Place: Room 401
• Dates: 7th Nov ’17 – 1st Feb ’18
• Instructors:
• Antonio Muñoz, amunoz@comillas.edu, https://www.iit.upcomillas.es/people/amunoz
• José Portela, jportela@comillas.edu, https://www.iit.upcomillas.es/people/jportela
• Guillermo Mestre, gmestre@comillas.edu, https://www.iit.upcomillas.es/people/gmestre

Course description and grading
• Course description:
The objective of this introductory course to Machine Learning is to provide
students with a fundamental understanding and an extensive practical
experience of how to extract knowledge from an apparently unstructured set
of data.
• Course goals:
• Understand the basic principles behind Machine Learning.
• Gain practical experience with the most relevant Machine Learning 
algorithms.
• Develop sound criteria for choosing the most appropriate technique for a given application.
• Grading:
• 15% Mid‐term exams
• 35% Final exam (≥4)
• 50% Lab

Schedule
No. | Date | Hours | Session | Theory | Lab | Assessment
1 | 09-jan | 2 | Introduction I | Introduction to Machine Learning | Lab Practice 1.1: Introduction to R for Machine Learning |
2 | 11-jan | 2 | Introduction II | | Lab Practice 1.2: Introduction to R for Machine Learning |
3 | 16-jan | 2 | Classification I | The classification problem. Logistic regression. | |
4 | 18-jan | 2 | Classification II | | Lab Practice 2.1 |
5 | 23-jan | 2 | Classification III | Discriminant analysis. KNN | Lab Practice 2.2 |
6 | 25-jan | 2 | Classification IV | Decision trees | Lab Practice 2.3 |
7 | 30-jan | 2 | Classification V | SVM | Lab Practice 2.4 | Assignment 1
8 | 01-feb | 2 | Classification VI | MLP | Lab Practice 2.5 |
9 | 06-feb | 2 | Classification VII | MLP | Lab Practice 2.6 (hackathon) |
10 | 08-feb | 2 | Regression I | The regression problem. Linear Regression. | Lab Practice 3.1 | Assignment 2
11 | 13-feb | 2 | Regression II | Model selection and Regularization | Lab Practice 3.2 |
12 | 15-feb | 2 | Regression III | Polynomial Regression, Splines, GAMs | Lab Practice 3.3 |
13 | 20-feb | 2 | Regression IV | MLP, SVM. Synthetic example generated from market data so that students can see the effect of nonlinearities | Lab Practice 3.4 |
14 | 22-feb | 2 | Regression V | | Lab Practice 3.5 (assignment) | Assignment 3
15 | 27-feb | 2 | Mid-term exam I | | | Mid-term exam
16 | 01-mar | 2 | Forecasting I | Stochastic Processes. Decomposition methods | Lab Practice 4.1 |
17 | 06-mar | 2 | Forecasting II | Exponential Smoothing | Lab Practice 4.2 |
18 | 08-mar | 2 | Forecasting III | ARMA | Lab Practice 4.3 |
19 | 13-mar | 2 | Forecasting IV | ARIMA | Lab Practice 4.4 |
20 | 15-mar | 2 | Forecasting V | SARIMA | Lab Practice 4.5 |
21 | 20-mar | 2 | Forecasting VI | Dynamic Regression models I | Lab Practice 4.6 | Assignment 4
22 | 22-mar | 2 | Forecasting VII | Dynamic Regression models II | Lab Practice 4.7 |
23 | 03-apr | 2 | Density estimation I | Parametric & Non-parametric methods | Lab Practice 5.1 | Assignment 5
24 | 05-apr | 2 | Density estimation II | NN for density estimation | Lab Practice 5.2 |
25 | 10-apr | 2 | Dimensionality reduction | PCA. ICA. | Lab Practice 5.3 |
26 | 12-apr | 2 | Clustering I | Hierarchical & partitional clustering | Lab Practice 5.4 |
27 | 17-apr | 2 | Clustering II | Vector Quantization. Neural Gas | Lab Practice 5.5 |
28 | 19-apr | 2 | Self Organising Maps | SOM | Lab Practice 5.6 |
29 | 24-apr | 2 | Course summary | | | Assignment 6
30 | 26-apr | 2 | Final exam | | | Final exam

2. Data Mining & Machine Learning

Motivation for DM & ML
• We are in a data-rich but information-poor situation.
• In general, decision makers lack the tools to extract the valuable knowledge embedded in vast amounts of data.

Links: Harvard article 1, Harvard article 2, TED Talk


The evolution of database system technology

From (Han et al., 2012)
Definitions
• Data Mining:
• Knowledge discovery from data (KDD)
• Discovering patterns and associations in large data sets
• Turning data into information
• Uncovering valuable information from tremendous amounts of data and transforming such data into organized knowledge

• Machine Learning:
• Field of study that gives computers the ability to learn without being explicitly programmed.
• Making computers modify or adapt their actions so that these actions become more accurate.
• A machine is said to learn if it is able to take experience and use it such that its performance improves on similar experiences in the future.

Machine learning is sometimes conflated with data mining; the latter focuses more on exploratory data analysis and is associated mainly with unsupervised learning.

3. The learning process

The process

[Diagram: Data → Abstraction → Generalization]

Data


• The input data is the main source of knowledge.


• Its quality determines the quality of the final system.
• Requires observation, memory storage and recall.

Abstraction


• Abstraction is the translation of data into broader representations.


• During the abstraction process, we assign meaning to data by
representing knowledge using some kind of model:
• Equations
• Diagrams such as trees and graphs
• Logical if/else rules
• Groupings of data known as clusters

• The process of fitting a particular model to a dataset is known as training.
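
As a small illustration of training, the sketch below fits the simplest kind of "equation" model, a straight line, to a handful of points by ordinary least squares. The data values and the function name `fit_line` are made up for illustration and do not come from the course material.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Covariance-like and variance-like sums (unnormalized)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical observations (the "data" stage)
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.1, 4.9, 7.2, 8.8]

# Training: the data are abstracted into two numbers, slope and intercept
slope, intercept = fit_line(xs, ys)
print(f"y = {slope:.2f} * x + {intercept:.2f}")
```

After training, the five observations are summarized by two parameters, which is the sense in which a model is a broader representation of the data.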
Generalization


• Generalization uses abstracted data to form a basis for action.


• A model is said to generalize if it produces correct outputs for cases 
not included in the training dataset.
• Measuring the generalization capabilities of a model is an essential
task.
• Our final objective is being able to generalize from a finite set of 
data.
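
One common way to measure generalization is to hold out part of the data and evaluate only on cases the model never saw. The sketch below, on made-up data, compares a fitted line with a "memorizer" that simply stores the training set. All names and values are assumptions for illustration.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def mse(model, xs, ys):
    """Mean squared error of a model (a callable) on a dataset."""
    return sum((model(x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


# Hypothetical data: y = 2x + 1 plus small fixed perturbations
noise = [0.1, -0.2, 0.05, 0.15, -0.1, 0.0, 0.2, -0.05, -0.15, 0.1]
xs = [float(i) for i in range(10)]
ys = [2.0 * x + 1.0 + e for x, e in zip(xs, noise)]

# Hold-out split: even positions for training, odd positions for testing
train_x, train_y = xs[0::2], ys[0::2]
test_x, test_y = xs[1::2], ys[1::2]

slope, intercept = fit_line(train_x, train_y)


def line_model(x):
    return slope * x + intercept


# A model that memorizes the training set and falls back to the mean
table = dict(zip(train_x, train_y))
fallback = sum(train_y) / len(train_y)


def memorizer(x):
    return table.get(x, fallback)


for name, model in [("line", line_model), ("memorizer", memorizer)]:
    print(name, "train MSE:", mse(model, train_x, train_y),
          "test MSE:", mse(model, test_x, test_y))
```

The memorizer is perfect on the training data yet poor on the held-out cases, which is why performance has to be estimated on data not used for training.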

Steps to apply ML to your data
• Collecting data: several sources of information have to be integrated. Data warehouses have become a very common data repository architecture.
• Cleaning, exploring and preparing the data: the quality of any ML project depends largely on the quality of the data it uses. A commonly quoted rule of thumb is that about 80% of the effort in ML is devoted to data.
• Identifying and training a model.
• Evaluating model performance: the generalization capabilities of the model have to be estimated.
• Improving model performance.
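
The cleaning and preparation step can be sketched as follows, assuming a tiny made-up table in which `None` marks a missing value. Dropping incomplete records and standardizing each column are only two of many possible preparation choices; the function names are hypothetical.

```python
from statistics import mean, pstdev


def drop_incomplete(rows):
    """Keep only the records that have no missing (None) values."""
    return [row for row in rows if all(v is not None for v in row)]


def standardize(rows):
    """Rescale every column to zero mean and unit standard deviation."""
    columns = list(zip(*rows))
    means = [mean(c) for c in columns]
    # Guard against constant columns, whose standard deviation is zero
    stds = [pstdev(c) or 1.0 for c in columns]
    return [[(v - m) / s for v, m, s in zip(row, means, stds)] for row in rows]


# Hypothetical raw records: two numeric features, one record incomplete
raw = [
    [1.0, 200.0],
    [2.0, None],
    [3.0, 400.0],
    [5.0, 600.0],
]

clean = drop_incomplete(raw)
scaled = standardize(clean)
print(len(raw), "raw records ->", len(clean), "clean records")
```

In the course labs this kind of work would be done in R; the Python here is only meant to make the step concrete.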

From (Han et al., 2012)

4. Types of learning

Introduction
• We can distinguish 3 types of learning:
• Supervised learning
• Unsupervised learning
• Reinforcement learning

• In addition, a very special type of learning, treated separately, will be covered in Machine Learning II:
• Evolutionary learning

Supervised learning
• The aim of Supervised Learning is to learn an input-output mapping from a “labelled” dataset.

• Applications:
‐ Classification
‐ Regression
‐ Forecasting
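
As a minimal sketch of learning an input-output mapping from labelled data, the example below classifies new points with a k-nearest-neighbours rule (KNN appears later in the course schedule). The points, labels and function name are invented for illustration.

```python
import math
from collections import Counter


def knn_predict(train_points, train_labels, query, k=3):
    """Predict the label of `query` by majority vote among its k nearest neighbours."""
    distances = sorted(
        (math.dist(point, query), label)
        for point, label in zip(train_points, train_labels)
    )
    nearest_labels = [label for _, label in distances[:k]]
    return Counter(nearest_labels).most_common(1)[0][0]


# Hypothetical labelled dataset: inputs are 2-D points, outputs are class labels
points = [(1.0, 1.0), (1.5, 2.0), (2.0, 1.0), (6.0, 6.0), (7.0, 6.5), (6.5, 7.0)]
labels = ["A", "A", "A", "B", "B", "B"]

print(knn_predict(points, labels, (1.2, 1.4)))
print(knn_predict(points, labels, (6.4, 6.2)))
```

If the labels were numeric values instead of classes, averaging the neighbours' values rather than voting would turn the same idea into a regression method.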

Supervised learning: Classification

Supervised learning: Regression

Supervised learning: Forecasting
[Figure: Electricity Price in the Spanish Day-ahead Market. Two time-series panels of Price (EUR/MWh): 17/12/14 to 30/04/16, and 16/03/16 to 30/04/16.]
Unsupervised learning
• The aim is to find the regularities in the input data by discovering patterns.
• We want to characterize what generally happens and what does not.

• Applications: 
‐ Density estimation
‐ Clustering
‐ Vector Quantization
‐ Dimensionality reduction
Unsupervised learning: Density estimation

[Figure: Estimation of pdf(X) with the PRBFNII model. Panels over X1 and X2, showing log10(p(X1,X2)).]

Unsupervised learning: Clustering
[Figure: Clustering example; axes X1 and X2.]

Unsupervised learning: Clustering

Source: http://www.slideshare.net/kasunrangawijeweera/k-means-clustering-algorithm
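
The source cited for this slide is a presentation on k-means, so a minimal sketch of that algorithm may help. It alternates an assignment step and a centroid update step until the centroids stop changing. The data points and the choice of initial centroids are assumptions made for illustration.

```python
import math


def kmeans(points, centroids, max_iter=100):
    """Basic k-means (Lloyd's algorithm) starting from the given centroids."""
    centroids = list(centroids)
    for _ in range(max_iter):
        # Assignment step: attach each point to its nearest centroid
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)),
                          key=lambda i: math.dist(p, centroids[i]))
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its cluster
        new_centroids = [
            tuple(sum(coord) / len(cluster) for coord in zip(*cluster))
            if cluster else old
            for cluster, old in zip(clusters, centroids)
        ]
        if new_centroids == centroids:
            break  # converged: assignments can no longer change
        centroids = new_centroids
    return centroids, clusters


# Hypothetical unlabelled data with two visible groups
points = [(1.0, 1.0), (1.5, 2.0), (2.0, 1.0), (8.0, 8.0), (9.0, 8.5), (8.5, 9.0)]
centroids, clusters = kmeans(points, centroids=[points[0], points[3]])
print(centroids)
```

The result depends on the initial centroids; practical implementations usually try several random initializations and keep the best one.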
Unsupervised learning: Vector Quantization
[Figure: plot titled "Estimation of pdf(X) with the PRBFNII model"; axes X1 and X2.]

Reinforcement learning
• In Reinforcement Learning, the learner is a decision-making agent that takes actions in an environment and receives rewards (or penalties) for its actions while trying to solve a problem.
• After a set of trial-and-error runs, it should learn the best policy, which is the sequence of actions that maximizes the total reward.
• Reinforcement learning is “learning with a critic”, as opposed to the “learning with a teacher” of supervised learning.
• Applications:
• Robot control
• Games: chess, backgammon, checkers, …
• Other activities that a software agent can learn

[Figure: The agent interacts with an environment. At any state of the environment, the agent takes an action that changes the state and returns a reward. From (Alpaydin, 2014)]
Links: Video 1, Video 2
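
The agent-environment loop can be sketched with tabular Q-learning, one standard reinforcement learning algorithm, on a toy corridor of five states. The environment, the reward of 1 at the goal and all parameter values below are assumptions chosen for illustration, not material from the slides.

```python
import random

N_STATES = 5                       # states 0..4; state 4 is the goal
ACTIONS = {"left": -1, "right": +1}


def step(state, action):
    """Environment: move along the corridor; reward 1 only on reaching the goal."""
    next_state = min(max(state + ACTIONS[action], 0), N_STATES - 1)
    reward = 1.0 if next_state == N_STATES - 1 else 0.0
    return next_state, reward


def train(episodes=500, alpha=0.5, gamma=0.9, epsilon=0.2, seed=0):
    """Learn action values by trial and error with epsilon-greedy exploration."""
    rng = random.Random(seed)
    q = {(s, a): 0.0 for s in range(N_STATES) for a in ACTIONS}
    for _ in range(episodes):
        state = 0
        while state != N_STATES - 1:
            if rng.random() < epsilon:
                action = rng.choice(list(ACTIONS))        # explore
            else:
                best = max(q[(state, a)] for a in ACTIONS)
                action = rng.choice(                      # exploit, random tie-break
                    [a for a in ACTIONS if q[(state, a)] == best])
            next_state, reward = step(state, action)
            # Q-learning update toward reward plus discounted best future value
            target = reward + gamma * max(q[(next_state, a)] for a in ACTIONS)
            q[(state, action)] += alpha * (target - q[(state, action)])
            state = next_state
    return q


q = train()
# Greedy policy: the best-valued action in every non-terminal state
policy = [max(ACTIONS, key=lambda a: q[(s, a)]) for s in range(N_STATES - 1)]
print(policy)
```

No one tells the agent which action is correct; it only receives the reward signal, which is the "critic" as opposed to a "teacher".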
Evolutionary learning
• It is the study of computational systems that draw ideas and inspiration from natural evolution, such as reproduction, mutation, recombination, and selection.
• One of the principles borrowed is survival of the fittest.
• Evolutionary computation (EC) techniques can be used in optimization, learning and design.

Video

Evolutionary Programming

[Diagram: Initialization → Duplication → Mutation → Evaluation → Selection (loop), then Output]
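
The stages named on this slide can be sketched as a short loop. The version below is a simplification: it minimizes a toy objective and uses plain truncation selection over parents plus offspring, whereas classical evolutionary programming typically uses a tournament-style selection. The objective and all parameter values are assumptions for illustration.

```python
import random


def fitness(x):
    """Toy objective to minimize (assumed for illustration): sum of squares."""
    return sum(v * v for v in x)


def evolve(pop_size=20, dim=3, generations=100, sigma=0.3, seed=0):
    rng = random.Random(seed)
    # Initialization: random candidate solutions
    population = [[rng.uniform(-5, 5) for _ in range(dim)]
                  for _ in range(pop_size)]
    history = [min(fitness(ind) for ind in population)]
    for _ in range(generations):
        # Duplication and mutation: each parent yields one perturbed copy
        offspring = [[v + rng.gauss(0, sigma) for v in parent]
                     for parent in population]
        # Evaluation and selection: keep the fittest among parents and offspring
        population = sorted(population + offspring, key=fitness)[:pop_size]
        history.append(fitness(population[0]))
    # Output: best individual found and the best fitness per generation
    return population[0], history


best, history = evolve()
print("initial best:", history[0], "final best:", history[-1])
```

Because parents compete with their offspring, the best fitness can never get worse from one generation to the next.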

5. Resources

Bibliography
• G. James, D. Witten, T. Hastie & R. Tibshirani (2013). An Introduction to Statistical Learning with Applications in R. Springer. (See http://www-bcf.usc.edu/~gareth/ISL/)
• M. Kuhn & K. Johnson (2013). Applied Predictive Modeling. Springer.
• T. Hastie, R. Tibshirani & J. Friedman (2009). The Elements of Statistical Learning: Data Mining, Inference and Prediction. 2nd Ed. Springer.
• E. Alpaydin (2014). Introduction to Machine Learning. 3rd Ed. MIT Press.
• S. Marsland (2015). Machine Learning: An Algorithmic Perspective. 2nd Ed. Chapman & Hall/CRC Machine Learning & Pattern Recognition.
• T. Mitchell (1997). Machine Learning. McGraw-Hill.
• R. Duda, P. Hart & D. Stork (2000). Pattern Classification. 2nd Ed. Wiley-Interscience.
• C. Bishop (2007). Pattern Recognition and Machine Learning. Springer.
• S. Haykin (1999). Neural Networks: A Comprehensive Foundation. 2nd Ed. Pearson.
• W. Wei (2006). Time Series Analysis: Univariate and Multivariate Methods. 2nd Ed. Addison-Wesley.
Links
• Courses:
• https://www.youtube.com/watch?v=UzxYlbK2c7E&list=PLJ_CMbwA6bT-n1W0mgOlYwccZ-j6gBXqE
• https://www.youtube.com/watch?v=mbyG85GZ0PI&index=1&list=PLD63A284B7615313A

• Repositories:
• https://www.kaggle.com/
• http://archive.ics.uci.edu/ml/
• http://work.caltech.edu/library/index.html

Alberto Aguilera 23, E-28015 Madrid - Tel: +34 91 542 2800 - http://www.iit.comillas.edu