Machine Learning

Introduction
MR. U. A. NULI
COMPUTER SCIENCE AND ENGINEERING DEPARTMENT
TEXTILE AND ENGINEERING INSTITUTE, ICHALKARANJI
Mr. U.A. Nuli, Asst.Professor, CSE dept, Textile and Engineering Institute, Ichalkaranji
History

Alan Turing, in his 1950 paper “Computing Machinery and Intelligence,” asked, “Can machines think?”

In 1959, Arthur Samuel defined machine learning as the “field of study
that gives computers the ability to learn without being explicitly
programmed.”
Samuel is credited with creating one of the first self-learning computer
programs during his work at IBM.
He focused on games as a way of getting the computer to learn
things.
Definitions

"Machine learning is a scientific discipline that is concerned with the
design and development of algorithms that allow computers to evolve
behaviours based on empirical data, such as from sensor data or
databases."
- Wikipedia

"Machine learning is the training of a model from data that
generalizes a decision against a performance measure."
- Jason Brownlee
Definitions

A computer program is said to learn from experience E with respect to
some class of tasks T and performance measure P, if its performance at
tasks in T, as measured by P, improves with experience E.
-- Tom Mitchell

For example, a computer program that learns to play checkers might
improve its performance as measured by its ability to win at the class of
tasks involving playing checkers games, through experience obtained by
playing games against itself.

Learning Problems

1. Checkers learning problem:
Task T: playing checkers
Performance measure P: percent of games won against opponents
Training experience E: playing practice games against itself

2. Handwriting recognition learning problem:
Task T: recognizing and classifying handwritten words within images
Performance measure P: percent of words correctly classified
Training experience E: a database of handwritten words with given
classifications
When Should You Use Machine Learning?

1. Hand-written rules and equations are too complex—as in face recognition and
speech recognition

2. The rules of a task are constantly changing—as in fraud detection from transaction
records.

3. The nature of the data keeps changing, and the program needs to adapt—as in
automated trading, energy demand forecasting, and predicting shopping trends.

Machine Learning Applications

• Identification of unwanted spam messages in e-mail


• Segmentation of customer behaviour for targeted advertising
• Forecasts of weather behaviour and long-term climate changes
• Reduction of fraudulent credit card transactions
• Actuarial estimates of financial damage of storms and natural disasters
• Prediction of popular election outcomes
• Development of algorithms for auto-piloting drones and self-driving cars
• Optimization of energy use in homes and office buildings
• Projection of areas where criminal activity is most likely
• Discovery of genetic sequences linked to diseases

What is learning?

Learning is the process of acquiring new, or modifying existing,
knowledge, behaviour, skills, values, or preferences.

Machine Learning - Architecture

Data -> Model (algorithms, parameters) -> Output

[Figure: detailed diagram of the machine learning architecture]
What is learning?

The following are some considerations to define a learning problem:

1. Provide a definition of what the learner should learn and the need
for learning.
2. Define the data requirements and the sources of the data.
3. Define if the learner should operate on the dataset in entirety or a
subset will do.

As a first step, the given data is segregated into three datasets: training, validation,
and testing. There is no one hard rule on what percentage of data should be training,
validation, and testing datasets. It can be 70-10-20, 60-30-10, 50-25-25, or any other
values.
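
The split described above can be sketched in a few lines of Python. This is only a minimal illustration: the function name `split_dataset`, the 70-10-20 default, and the fixed random seed are assumptions made for the example, not part of the original slides.

```python
import random

def split_dataset(rows, train_pct=70, validation_pct=10, seed=0):
    """Shuffle rows, then cut them into training, validation, and testing lists.

    The remaining percentage (100 - train_pct - validation_pct) goes to testing.
    """
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)  # fixed seed so the split is repeatable
    n_train = len(shuffled) * train_pct // 100
    n_validation = len(shuffled) * validation_pct // 100
    training = shuffled[:n_train]
    validation = shuffled[n_train:n_train + n_validation]
    testing = shuffled[n_train + n_validation:]
    return training, validation, testing

# Hypothetical dataset of 100 row identifiers, split 70-10-20.
training, validation, testing = split_dataset(range(100))
print(len(training), len(validation), len(testing))  # 70 10 20
```

Shuffling before cutting avoids a split that simply follows the order in which the data was collected.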
Components of Machine Learning System - Data

• Data forms the main source of learning in machine learning.

• Data is a representation of human experience in a machine learning system.

• Data can be in any format: structured, semi-structured, or unstructured.

• Data can be received at any frequency and can be static or dynamic.
• Data can be of any size.
• Data can have any number of dimensions (number of features or attributes).

Terms related to data

Term: Purpose or meaning in the context of Machine Learning

Feature, attribute, field, or variable: A single column of data referenced by the learning algorithms. Some features can be inputs to the learning algorithm, and some can be the outputs.
Instance: A single row of data in the dataset.
Feature vector or tuple: A list of features.
Dimension: A subset of attributes used to describe a property of data. For example, a date dimension consists of three attributes: day, month, and year.
Dataset: A collection of rows or instances.
Machine learning has different types of datasets that are meant to be used for
different purposes. These are: Training, Testing and evaluation datasets
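
To make these terms concrete, here is a small Python sketch. The two rows and the feature names (`color`, `height_inch`, `legs`, `label`) are made up for illustration and are not taken from any real dataset.

```python
# A tiny, made-up dataset: each dict is one instance (one row).
dataset = [
    {"color": "brown", "height_inch": 24, "legs": 4, "label": "dog"},
    {"color": "white", "height_inch": 10, "legs": 4, "label": "cat"},
]

# Features (columns): some are inputs to the learner, one is the output.
input_features = ["color", "height_inch", "legs"]
output_feature = "label"

# An instance is a single row; its feature vector is the tuple of input values.
instance = dataset[0]
feature_vector = tuple(instance[name] for name in input_features)

print(feature_vector)            # ('brown', 24, 4)
print(instance[output_feature])  # dog
print(len(dataset))              # 2 instances in the dataset
```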
Terms related to data

Term: Purpose or meaning in the context of Machine Learning

Training dataset: The base dataset against which the model is built or trained.
Testing dataset: The dataset that is used to validate the model built. This dataset is also referred to as a validating dataset.
Evaluation dataset: The dataset that is used for final verification of the model (and can be treated more as user acceptance testing).

[Figure: Dataset]
Model

A simplified description, especially a mathematical one, of a system
or process, to assist calculations and predictions.
-- Oxford Dictionary

Example: ‘a statistical model used for predicting the survival rates of
endangered species’

Mathematical model: a representation in mathematical terms of the
behaviour of real devices and objects.

Machine Learning Model:

Input Data -> Feature Extraction -> Features -> Model (algorithms, parameters, etc.; knowledge representation) -> Output (prediction, classification)

Categories of models:

• Logical models

• Geometric models

• Probabilistic models

[Figure: House price]
[Figure: Logical Model]
[Figure: Customer feedback for Shoes]
[Figure: Car classification]
[Figure: Types of learning problems]
Machine Learning Techniques:

A technique is a way of solving a problem.

For example, classification is a technique for grouping things that are similar.

To actually do classification on some data, a data scientist would have to employ a
specific algorithm like Decision Trees (though there are many other classification
algorithms to choose from).

Machine Learning Techniques:

• Supervised Learning

• Unsupervised Learning

• Reinforcement Learning

Supervised Learning:

Supervised learning is similar to human learning in the presence of a supervisor or teacher.

The supervisor's or teacher’s role is to provide correct feedback to the learner.

Example: A teacher shows a set of images of dogs and tells the student that these are dogs.

The student learns from the images what the animal called a dog looks like.

What the student learns are the properties that identify a dog as a dog, such as
its face, color, and voice.

Similarly, parents show a child animals such as dogs and cats and help the child recognize them.
Supervised Learning:

In machine learning, a model learns from given examples presented
in the form of data.

The input to a machine learning model is data, and its attributes are the properties through
which the model learns.

Just as a teacher does in human learning, the correct output is provided along with the data,
which helps the model learn.

Supervised Learning: Examples

Example 1

Attribute (property)    Value
Color                   Brown
Height                  24 inch
Legs                    4

Output: Dog
Model learns as: Dog

Example 2

Attribute (property)    Value
No. of wheels           4
Gears                   6
Max speed               200
Weight                  800 kg

Output: Vehicle - Car
Model learns as: Vehicle - Car
Supervised Learning

Supervised learning occurs when an algorithm learns from example data and
associated target responses that can consist of numeric values or string labels, such as
classes or tags, in order to later predict the correct response when posed with new
examples.

The aim of supervised machine learning is to build a model that makes predictions based on
evidence in the presence of uncertainty.

A supervised learning algorithm takes a known set of input data and known responses to the
data (output) and trains a model to generate reasonable predictions for the response to new
data.

Supervised Learning - Algorithms

List of Common Algorithms in Supervised Learning:


• Nearest Neighbour Classifier
• Naive Bayes
• Decision Trees
• Linear Regression
• Support Vector Machines (SVM)
• Neural Networks
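
As an illustration of learning from labeled examples, the sketch below implements a 1-nearest-neighbour classifier, the simplest form of the first algorithm in the list. The training examples (height in inches, weight in kg) and the function name `predict` are made up for this example.

```python
import math

# Made-up labeled examples: (height in inches, weight in kg) -> known output.
training_data = [
    ((24.0, 30.0), "dog"),
    ((22.0, 25.0), "dog"),
    ((10.0, 4.0), "cat"),
    ((9.0, 5.0), "cat"),
]

def predict(features):
    """Return the label of the closest training example (1-nearest neighbour)."""
    nearest = min(training_data, key=lambda pair: math.dist(pair[0], features))
    return nearest[1]

# New, unlabeled examples: the model answers with the label of the nearest known example.
print(predict((23.0, 28.0)))  # dog
print(predict((11.0, 4.5)))   # cat
```

The "teacher" here is the label stored with each training example; without those labels the function would have nothing to return.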

Unsupervised Learning:

This is learning without a teacher.

It is learning a new concept by comparing it with another concept.

This is basically the human ability to group similar elements.

Examples:
Humans group banana, apple, orange, etc. as fruits because they grow on trees
and are eaten without cooking or any other processing. (Hence the common attributes
among these are “grown on a tree” and “eaten without cooking”.)

Humans group notebooks, pens, books, and pencils as school stationery because these
are useful in school. (Hence a common attribute among these is “useful in school”.)
Unsupervised Learning:

It resembles the methods humans use to figure out that certain objects or events are
from the same class, such as by observing the degree of similarity between objects.

An important characteristic of unsupervised learning is finding similarity between two
events or objects.

Unsupervised Learning:

Unsupervised learning finds hidden patterns or intrinsic structures in data.

It is used to draw inferences from datasets consisting of input data without labeled
responses.

Unsupervised learning occurs when an algorithm learns from plain examples without
any associated response, leaving it to the algorithm to determine the data patterns on
its own.

Unsupervised Learning:

Grouping of six fruits given below:

Fruit     Common attribute (color)    Category (group)
Mango     Yellow                      Ripe fruit
Banana    Yellow                      Ripe fruit
Guava     Yellow                      Ripe fruit
Mango     Green                       Raw fruit
Banana    Green                       Raw fruit
Guava     Green                       Raw fruit
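
The grouping in this table can be sketched as follows. The code is a minimal illustration that groups the six fruits by their shared attribute (color) without being told any category; the names "Ripe fruit" and "Raw fruit" are interpretations a person adds afterwards.

```python
from collections import defaultdict

# The six fruits from the table, as (fruit, color) pairs. No category is given.
fruits = [
    ("mango", "yellow"), ("banana", "yellow"), ("guava", "yellow"),
    ("mango", "green"), ("banana", "green"), ("guava", "green"),
]

# Group items that share the same value of the common attribute.
groups = defaultdict(list)
for name, color in fruits:
    groups[color].append(name)

for color, members in groups.items():
    print(color, members)
# yellow ['mango', 'banana', 'guava']
# green ['mango', 'banana', 'guava']
```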

Unsupervised Learning Examples:

Find high-crime areas for positioning patrol vans

Find the most accident-prone areas for setting up a hospital's emergency care wards

Group documents into different categories/topics based on the words used in the documents

Unsupervised Learning

Unsupervised learning is where you only have input data (X) and no corresponding
output variables.

The goal for unsupervised learning is to model the underlying structure or distribution in
the data in order to learn more about the data.

This is called unsupervised learning because, unlike supervised learning, there
are no correct answers and there is no teacher. Algorithms are left to their own devices to
discover and present the interesting structure in the data.
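
One way to see an algorithm discover structure on its own is k-means clustering. The slides do not name this algorithm, so it is used here purely as an illustrative assumption, with made-up one-dimensional data and hand-picked starting centroids.

```python
def k_means_1d(points, centroids, max_iterations=100):
    """Cluster 1-D points around the given starting centroids (k-means)."""
    clusters = [[] for _ in centroids]
    for _ in range(max_iterations):
        # Assignment step: put each point in the cluster of its nearest centroid.
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)), key=lambda i: abs(p - centroids[i]))
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its cluster.
        new_centroids = [
            sum(c) / len(c) if c else centroids[i] for i, c in enumerate(clusters)
        ]
        if new_centroids == centroids:  # nothing moved, so we have converged
            break
        centroids = new_centroids
    return centroids, clusters

# Made-up input data X with no output labels.
points = [1.0, 1.5, 2.0, 8.0, 9.0, 10.0]
centroids, clusters = k_means_1d(points, centroids=[1.0, 10.0])
print(centroids)  # [1.5, 9.0]
print(clusters)   # [[1.0, 1.5, 2.0], [8.0, 9.0, 10.0]]
```

No correct answers are supplied; the two groups emerge only from how close the values are to each other.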

Model Performance:

Is the solution created good?

- The model may or may not give accurate results.

- If the model is not giving accurate results, how do we measure the error?

Measuring Prediction Performance

If a machine learning model is predicting house prices in a city, how
accurately is it predicting them?

Measuring error (error metrics):
For prediction type models

1. Mean Square Error (MSE)

MSE = (1/n) * Σ (Pi – Ai)^2, summed over i = 1 to n

Pi - Predicted value of the ith record
Ai - Actual value of the ith record
n - Total records

If MSE is zero or close to zero, the model is predicting the value accurately.
A larger MSE value indicates poor model performance and the need for further training.

It is also common to use the square root of this quantity, called root mean square error (RMSE).
Measuring error(error metrics):
For prediction type models

2. Mean Absolute Error (MAE):

MAE = (1/n) * Σ |Pi – Ai|, summed over i = 1 to n

Pi - Predicted value of the ith record

Ai - Actual value of the ith record

n - Total records
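
The error metrics for prediction models can be computed as in the sketch below. It assumes the standard definitions, MSE = (1/n) Σ (Pi – Ai)^2 and MAE = (1/n) Σ |Pi – Ai|, and uses made-up house prices; the function names are chosen for this example.

```python
import math

def mean_squared_error(predicted, actual):
    """Average of the squared differences between predicted and actual values."""
    return sum((p - a) ** 2 for p, a in zip(predicted, actual)) / len(actual)

def mean_absolute_error(predicted, actual):
    """Average of the absolute differences between predicted and actual values."""
    return sum(abs(p - a) for p, a in zip(predicted, actual)) / len(actual)

# Made-up house prices (in thousands): Pi are predictions, Ai are actual values.
predicted = [200.0, 310.0, 150.0, 420.0]
actual = [210.0, 300.0, 150.0, 400.0]

mse = mean_squared_error(predicted, actual)
rmse = math.sqrt(mse)  # root mean square error, in the same units as the prices
mae = mean_absolute_error(predicted, actual)

print(mse)   # 150.0
print(rmse)  # about 12.25
print(mae)   # 10.0
```

Because MSE squares each difference, the single error of 20 contributes more to it than the two errors of 10 combined, whereas MAE weights all errors in proportion to their size.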
Measuring error(error metrics):
For classification type models

Confusion Matrix (Error Matrix) in Machine Learning:

• A confusion matrix is a table that is often used to describe the
performance of a classification model (or “classifier”) on a set of test
data for which the true values are known.

• It allows the visualization of the performance of an algorithm.

• It allows easy identification of confusion between classes, e.g. one class is
commonly mislabelled as the other.

• Most performance measures are computed from the confusion matrix.



A confusion matrix is a summary of prediction results on a classification problem.

The number of correct and incorrect predictions are summarized with count values and broken
down by each class. This is the key to the confusion matrix.

The confusion matrix shows the ways in which your classification model is confused when it
makes predictions.

It gives us insight not only into the errors being made by a classifier but more importantly the
types of errors that are being made.


                       Actual class
                       Cat                   Non-cat
Predicted   Cat        5 True Positives      2 False Positives
class       Non-cat    3 False Negatives     17 True Negatives

[Figure: Predicting Behavior of 10000 customers]

Assuming a sample of 27 animals (8 cats, 6 dogs, and 13 rabbits), the
resulting confusion matrix could look like the table below:

                       Actual class
                       Cat    Dog    Rabbit
Predicted   Cat        5      2      0
class       Dog        3      3      2
            Rabbit     0      1      11
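
The three-class matrix can be read programmatically as sketched below. The layout `matrix[predicted][actual]` mirrors the table (rows are predicted classes, columns are actual classes), and the per-class recall and precision use their usual definitions, which is an assumption since the table itself only gives counts.

```python
labels = ["cat", "dog", "rabbit"]

# matrix[predicted][actual], copied from the table: rows = predicted, columns = actual.
matrix = [
    [5, 2, 0],
    [3, 3, 2],
    [0, 1, 11],
]

total = sum(sum(row) for row in matrix)                   # 27 animals
correct = sum(matrix[i][i] for i in range(len(labels)))   # diagonal: 5 + 3 + 11
accuracy = correct / total

recall = {}
precision = {}
for i, label in enumerate(labels):
    actual_count = sum(matrix[r][i] for r in range(len(labels)))  # column sum
    predicted_count = sum(matrix[i])                               # row sum
    recall[label] = matrix[i][i] / actual_count
    precision[label] = matrix[i][i] / predicted_count

print(total, correct)     # 27 19
print(recall["cat"])      # 0.625 (5 of the 8 actual cats were found)
print(precision["dog"])   # 0.375 (3 of the 8 "dog" predictions were right)
```

The column sums reproduce the sample sizes in the text (8 cats, 6 dogs, 13 rabbits), which is a quick check that the matrix was read in the right orientation.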

Terms in error measurement

Definition of the Terms:


• Positive (P) : Observation is positive (for example: is an apple).
• Negative (N) : Observation is not positive (for example: is not an apple).
• True Positive (TP) : Observation is positive, and is predicted to be positive.
• False Negative (FN) : Observation is positive, but is predicted negative.
• True Negative (TN) : Observation is negative, and is predicted to be negative.
• False Positive (FP) : Observation is negative, but is predicted positive.

Error measurement metrics

Classification Rate/Accuracy:

Classification Rate or Accuracy is given by the relation:

Accuracy = (TP + TN) / (TP + TN + FP + FN)

Error measurement metrics

Recall:
Recall can be defined as the ratio of the total number of correctly classified positive
examples to the total number of positive examples.

High Recall indicates the class is correctly recognized (small number of FN).
Recall is given by the relation:

Recall = TP / (TP + FN)

Error measurement metrics

Precision:
To get the value of precision we divide the total number of correctly classified
positive examples by the total number of predicted positive examples.

High Precision indicates an example labelled as positive is indeed positive (small
number of FP).
Precision is given by the relation:

Precision = TP / (TP + FP)
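
As a worked example, the three metrics can be computed from the cat / non-cat confusion matrix shown earlier (TP = 5, FP = 2, FN = 3, TN = 17). The sketch assumes the standard relations: accuracy = (TP + TN) / total, recall = TP / (TP + FN), precision = TP / (TP + FP).

```python
# Counts from the cat / non-cat confusion matrix.
tp, fp, fn, tn = 5, 2, 3, 17

accuracy = (tp + tn) / (tp + tn + fp + fn)  # correct predictions over all predictions
recall = tp / (tp + fn)                     # found positives over all actual positives
precision = tp / (tp + fp)                  # found positives over all predicted positives

print(round(accuracy, 3))   # 0.815
print(recall)               # 0.625
print(round(precision, 3))  # 0.714
```

Accuracy looks fairly high here largely because of the 17 true negatives; recall shows that 3 of the 8 actual cats were still missed.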

[Figure: Metrics used with confusion matrix]
Feature engineering:

Feature: A feature is an attribute of data that is meaningful to the machine learning process.

Feature engineering is the process of using domain knowledge of the data to create features that
make machine learning algorithms work.

Performance of Machine Learning algorithms depends on quality of input data and hence
features.


Data scientists spend 50% of their time on features.

Why Feature Engineering?

- Features may come with many problems, such as missing values, outliers, differing types, and errors
in data collection.

Before using features to train a machine learning model, it is necessary to clean, transform,
and select the right set of features.

Feature engineering:

Feature engineering is the process of transforming data into features that better
represent the underlying problem, resulting in improved machine learning
performance.

Benefits of spending time on feature engineering:

1. The model becomes simpler because it uses a selected, limited set of features.
2. It runs faster than a complex model with a large number of features.
3. It reduces model selection time, since limited features give better insight into the
relationships in the data.
4. It reduces training time.

Feature engineering: - Feature Extraction

Feature extraction is a process of deriving new features from existing features or raw
data by carrying out some transformation or using some extraction procedure in order to
reduce redundancy in the features.

The primary goals of feature extraction are to reduce redundancy in the features and the dimensions of the
feature vector.

Feature engineering: - Preparing Data

Preparing data takes into account capturing data, storing data, cleaning
data, organizing data, and so on.

Cleaning refers to the process of transforming data into a format that can
be easily interpreted by databases.

Organizing generally refers to a more radical transformation.

Organizing tends to involve changing the entire format of the dataset
into a much neater format, such as transforming raw chat logs into a
tabular row/column structure.
Data Cleaning

Does the data match the column label?

** Are there names in the names column, addresses in the addresses column, phone
numbers in the phone numbers column? Or is there different data in the columns?

Does the data abide by the appropriate rules for its field?

** Are the characters in a name only alphabetical (Brendan) or are there numbers in it
(B4rendan)?
** Is the numerical portion of a phone number 10 digits (5558675309) or not (675309)?

How many values are nulls? Is the number of nulls acceptable? Is there a pattern as to where
there are null values?
Are there duplicates, and is that okay?
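
A few of these checks can be automated as in the sketch below. The rules are deliberately simplified to match the examples on this slide (letters-only names, exactly 10 digits in a phone number); real data usually needs looser rules, and the helper names and sample records are assumptions for illustration.

```python
import re

def is_valid_name(name):
    """True if the name is a string made only of alphabetical characters."""
    return isinstance(name, str) and re.fullmatch(r"[A-Za-z]+", name) is not None

def is_valid_phone(phone):
    """True if the phone number is a string of exactly 10 digits."""
    return isinstance(phone, str) and re.fullmatch(r"\d{10}", phone) is not None

# Made-up records using the examples from the slide, plus one null name.
records = [
    {"name": "Brendan", "phone": "5558675309"},
    {"name": "B4rendan", "phone": "675309"},
    {"name": None, "phone": "5558675309"},
]

bad_names = [r["name"] for r in records if r["name"] is not None and not is_valid_name(r["name"])]
bad_phones = [r["phone"] for r in records if not is_valid_phone(r["phone"])]
null_names = sum(1 for r in records if r["name"] is None)

print(bad_names)   # ['B4rendan']
print(bad_phones)  # ['675309']
print(null_names)  # 1
```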
Data cleaning – removing unwanted observations

Unwanted observations: duplicate or irrelevant observations

Duplicate observations/records:

Duplicate observations most frequently arise during data collection, such
as when you:
• Combine datasets from multiple places
• Scrape data
• Receive data from clients/other departments
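
Removing exact duplicates after combining sources might look like the sketch below. The two source lists and the helper name `drop_duplicates` are hypothetical, and the function treats two records as duplicates only when every field matches.

```python
def drop_duplicates(rows):
    """Keep the first occurrence of each row and discard exact repeats."""
    seen = set()
    unique = []
    for row in rows:
        key = tuple(sorted(row.items()))  # hashable fingerprint of the whole row
        if key not in seen:
            seen.add(key)
            unique.append(row)
    return unique

# Made-up records combined from two sources; one city appears in both.
source_a = [{"city": "Kolhapur", "temperature": 29}, {"city": "Pune", "temperature": 23}]
source_b = [{"city": "Pune", "temperature": 23}, {"city": "Mumbai", "temperature": 26}]

combined = source_a + source_b
cleaned = drop_duplicates(combined)
print(len(combined), len(cleaned))  # 4 3
```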

Data cleaning - Handling Missing data

City            Temperature    Humidity
Kolhapur        29             80
Ichalkaranji    28             -
Sangli          -              -
Miraj           -              -
Karad           27             -
Pune            23             -
Mumbai          26             90
Satara          -              -
Bangalore       23             -

If most of the values of a feature are missing, the feature can be removed from the dataset.
Example: Humidity

If a row has missing values, it can be removed completely.
Example: Sangli, Miraj

A missing value can be replaced by the mean/median value of that feature, or by zero.
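
The three options for missing data can be sketched with the city table, using `None` for a missing value. The 50% threshold for "most values missing" and the variable names are assumptions made for this example.

```python
# The table from the slide; None marks a missing value.
rows = [
    {"city": "Kolhapur", "temperature": 29, "humidity": 80},
    {"city": "Ichalkaranji", "temperature": 28, "humidity": None},
    {"city": "Sangli", "temperature": None, "humidity": None},
    {"city": "Miraj", "temperature": None, "humidity": None},
    {"city": "Karad", "temperature": 27, "humidity": None},
    {"city": "Pune", "temperature": 23, "humidity": None},
    {"city": "Mumbai", "temperature": 26, "humidity": 90},
    {"city": "Satara", "temperature": None, "humidity": None},
    {"city": "Bangalore", "temperature": 23, "humidity": None},
]

def missing_fraction(rows, feature):
    """Fraction of rows in which the given feature is missing."""
    return sum(1 for r in rows if r[feature] is None) / len(rows)

# Option 1: drop features that are mostly missing (assumed threshold: more than 50%).
features = ["temperature", "humidity"]
kept_features = [f for f in features if missing_fraction(rows, f) <= 0.5]

# Option 2: drop rows that still have a missing value in a kept feature.
complete_rows = [r for r in rows if all(r[f] is not None for f in kept_features)]

# Option 3: replace each missing temperature with the mean of the known ones.
known = [r["temperature"] for r in rows if r["temperature"] is not None]
mean_temperature = sum(known) / len(known)
imputed = [r["temperature"] if r["temperature"] is not None else mean_temperature for r in rows]

print(kept_features)       # ['temperature']
print(len(complete_rows))  # 6
print(mean_temperature)    # 26.0
```

Options 2 and 3 are alternatives: dropping rows loses three cities, while imputing keeps all nine at the cost of inventing plausible values.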

Data organization – Representing Features

Handling Numerical Features:

A numerical feature holds a numerical value.

Example: cost of a product, temperature, size of a house, distance, etc.

Certain numerical features can be accepted without any transformation, whereas
a few need transformation.

Rounding – Features can be rounded to integers or to a few decimal places.

Example: 0.234561 can be rounded to 0.235

Handling Numerical Features

Min-Max Normalization: this process brings the feature values into the range 0 to 1.

It first calculates the min and max values of the feature and then transforms each value
using the formula

Xi_new = (Xi – min(X)) / (max(X) – min(X))

It is not suitable when the data contains outliers.

Z-Score Normalization:

Z = (X – mean(X)) / StdDev(X)
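
Both formulas translate directly into code. The sketch below uses made-up values; it assumes the population standard deviation for the z-score (the slide does not say which one is meant) and does not handle the case where all values are equal, which would divide by zero.

```python
import statistics

def min_max_normalize(values):
    """Rescale values to the range 0 to 1: (x - min) / (max - min)."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]

def z_score_normalize(values):
    """Center values on the mean and scale by the standard deviation."""
    mean = statistics.mean(values)
    std = statistics.pstdev(values)  # population standard deviation (assumption)
    return [(v - mean) / std for v in values]

values = [10.0, 20.0, 30.0, 40.0, 50.0]

print(min_max_normalize(values))  # [0.0, 0.25, 0.5, 0.75, 1.0]
print([round(z, 3) for z in z_score_normalize(values)])  # [-1.414, -0.707, 0.0, 0.707, 1.414]
```

Adding one extreme value such as 1000.0 to `values` would squeeze all the other min-max results close to 0, which is why min-max normalization suits data without outliers.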

Handling Numerical Features

Binning – a process of converting numerical values into categories or bins

Example: age – 5,12,7,9,25,32,21,35,37,45,53,67,74,83,99,123,125

These can be categorized as

Age      Category
0-10     1
11-20    2
21-30    3
...
> 90     10

This method reduces the effect of outliers.
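
The age example can be binned as sketched below. The function assumes bins of width 10 (0-10 is category 1, 11-20 is category 2, and so on) with everything above 90 falling into category 10; the intermediate bins are inferred from the pattern in the table rather than listed on the slide.

```python
def age_category(age):
    """Map an age to a bin number from 1 to 10, using bins of width 10."""
    category = (age - 1) // 10 + 1     # 1-10 -> 1, 11-20 -> 2, 21-30 -> 3, ...
    return max(1, min(10, category))   # age 0 joins bin 1; ages above 90 join bin 10

ages = [5, 12, 7, 9, 25, 32, 21, 35, 37, 45, 53, 67, 74, 83, 99, 123, 125]
categories = [age_category(a) for a in ages]
print(categories)
# [1, 2, 1, 1, 3, 4, 3, 4, 4, 5, 6, 7, 8, 9, 10, 10, 10]
```

The unusual ages 123 and 125 end up in the same bin as 99, so they no longer stand apart from the rest of the data.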


Machine Learning Process:

1. Define Problem
2. Collect data
3. Prepare Data
4. Split data into training, validation, and testing sets
5. Algorithm Selection
6. Training the algorithm
7. Evaluate Test Data
8. Parameter Tuning
9. Start Using the model


End
