

PROBABILISTIC REASONING AND DECISION MAKING
ECE 493 T25
Instructor: Mark Crowley
Date: May 6, 2019
Room: E7 4043

ABOUT ME
Name: Mark Crowley (crohw-lee)
Pronouns: He/Him
Degree: PhD University of British Columbia 2011 in Computer Science
What do you call me?:
Whatever you feel comfortable with
Prof. Crowley, Prof. Mark, Prof, Mark, sir? -- all fine


Research on Artificial Intelligence and Machine Learning


Anomaly Detection
Reinforcement Learning
Probabilistic Graphical Models
Data Reduction
Domains
Autonomous Driving/Driver Behaviour
Computational Sustainability (Forest Fires, Invasive Species,...)
Medical Imaging (Diagnosis and Search)
Embedded Systems Quality Monitoring
Physics and Chemistry

HOW DO YOU FIND ME?


on Learn discussion forum
by email mcrowley@uwaterloo.ca
in my office: E5 4114
office hours: Thursdays 4-5pm
by appointment
also twitter @compthink (especially if you find something cool and want
to share with the class)


COURSE LOGISTICS
Lecture time : Mondays/Fridays 2:30pm-3:50pm in E7 4043
Tutorials : Wednesdays 5:30pm-6:30pm in E7 4043
Teaching Assistant(s):
Sriram Ganapathi Subramanian s2ganapa@uwaterloo.ca
...someone else (maybe more than one)
Office Hours:
Crowley (E5 4114) : Thursdays 4-5pm or by appointment
Sriram (TBD) : Tuesdays 4-5pm


WHAT IS THIS COURSE ABOUT?


So, is this course about AI or probability or what? yes

In this course we focus, from the ground up, on the
concepts and skills needed to build systems that can
reason, learn, and make decisions in the presence of
uncertainty, using probabilities.


BUT WHAT IS AI?

...it stands for Artificial Intelligence.


Sure, but what does it mean to you?

WHAT IS ARTIFICIAL INTELLIGENCE?


AI is about agents experiencing, learning, interacting and living in the
world
It's the pursuit of fundamentally hard problems,
even seemingly impossible problems ...
but problems for which we have at least one demonstration case... us


HOW THIS COURSE FITS IN


ECE 457A - Cooperative and Adaptive Algorithms
meta-heuristic optimization, search, genetic algorithms, ant colony,
swarm intelligence, constraint solvers
ECE 457B - Fundamentals of Computational Intelligence
soft computing, fuzzy logic, neural networks, deep learning
ECE 493 T21 - Autonomous Vehicles
deep learning, SLAM, path planning, some RL
ECE 493 T25 - Probabilistic Reasoning and Decision Making


THE MOTIVATION FOR PROBABILISTIC GRAPHICAL MODELS


The inherent complexity of probabilities
Simplification using conditional probability
Graphical Representations
Telling a story
Applications


MORE MOTIVATION
Interpretability
Small Data
Decision Making
Exciting AI Advances : Alpha Go


INTERPRETABILITY
You may want to build probabilistic models with meaning and bounds.
You may want to model causality.
You may want to learn the structure of conditional and causal relationships
from data.
If you want to do any of these things, you need to know how to model
those relationships. There is a rich set of tools for doing this that can't be
replaced by DNNs.


SMALL DATA
A lot of modern machine learning methods require massive amounts of
training data to work well
What if that's not available for your problem? What do you do?


DECISION MAKING
humans need good models to make decisions
well machines also need good models to make good decisions
so if you can't just trust a Neural Network in your domain, then you
need the ability to build a probabilistic reasoning system
we are going to learn how to do that
You also want systems that learn how to make decisions
robotics
autonomous driving
advertisement targeting
game playing

ALPHA GO
AI has a history of Human vs. Machine competitions: Backgammon,
Chess (IBM Deep Blue), Poker
In March 2016, Google's DeepMind carried out a bold challenge: defeat a
world master at the ancient game of Go, Lee Sedol.
Go was always believed to be the hardest game... not anymore
(Video: AlphaGo Documentary (with English subtitles), 2017)
The high-level goal of this course is to give you some of the tools to
understand this event:
what happened?
why was it a big deal?
how can you build something like that?


PART 1 : PROBABILISTIC MODELLING AND INFERENCE


Bayesian probability theory (review?)
Probabilistic Modelling (Bayesian vs Frequentist approaches,
conditional probability rules, Bayes rule, expectation, variance, etc.)
Methods of approximate inference:
marginal inference
Maximum a posteriori (MAP)
Markov Chain Monte Carlo (MCMC) estimation
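
To make these ideas concrete, here is a minimal sketch in plain Python (not from the course materials) that applies the chain rule, Bayes rule, marginal inference, and MAP inference to a toy disease/test model; the variable names and every probability value are invented purely for illustration.

    # Toy joint model: disease D in {0, 1}, test result T in {0, 1}.
    # All numbers are illustrative assumptions, not course data.
    p_d = {0: 0.99, 1: 0.01}                      # prior P(D)
    p_t_given_d = {0: {0: 0.95, 1: 0.05},         # likelihood P(T | D)
                   1: {0: 0.10, 1: 0.90}}

    # Joint P(D, T) via the chain rule, then condition on observing T = 1.
    joint = {(d, t): p_d[d] * p_t_given_d[d][t] for d in (0, 1) for t in (0, 1)}
    p_t1 = sum(joint[(d, 1)] for d in (0, 1))              # marginal P(T = 1)
    posterior = {d: joint[(d, 1)] / p_t1 for d in (0, 1)}  # Bayes rule: P(D | T = 1)
    map_d = max(posterior, key=posterior.get)              # MAP estimate of D given T = 1
    print(posterior, map_d)

Running it shows the prior and likelihood combining into a posterior that still favours D = 0, which is the kind of calculation Part 1 builds up to.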


PART 2 : PROBABILISTIC REASONING TOOLS


We will explore two practical approaches to using these concepts for
reasoning:
Probabilistic Graphical Models (PGMs)
Bayesian Networks (directed, conditional probabilities)
Markov Random Fields (undirected, correlations)
others (if time)
Probabilistic Programming
Probabilistic Inference
exact methods - variable elimination, junction trees
approximate methods - Gibbs sampling, MCMC
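
As a hedged preview of the approximate inference bullet above, the following sketch (not from the course materials) runs Gibbs sampling on a hypothetical three-node chain Bayesian network A -> B -> C with C observed; the conditional probability tables, iteration counts, and the function name gibbs_posterior_a are all assumptions made for illustration.

    import random

    # Hypothetical chain Bayesian network A -> B -> C over binary variables.
    # The CPT numbers below are invented for illustration only.
    p_a = {0: 0.6, 1: 0.4}
    p_b_given_a = {0: {0: 0.7, 1: 0.3}, 1: {0: 0.2, 1: 0.8}}
    p_c_given_b = {0: {0: 0.9, 1: 0.1}, 1: {0: 0.3, 1: 0.7}}

    def gibbs_posterior_a(c_obs=1, iters=20000, burn_in=2000, seed=0):
        """Estimate P(A = 1 | C = c_obs) by Gibbs sampling over A and B."""
        rng = random.Random(seed)
        a, b = 0, 0
        count_a1 = kept = 0
        for t in range(iters):
            # Resample A from P(A | B = b), proportional to P(A) * P(B = b | A).
            w = {av: p_a[av] * p_b_given_a[av][b] for av in (0, 1)}
            a = 1 if rng.random() < w[1] / (w[0] + w[1]) else 0
            # Resample B from P(B | A = a, C = c_obs), proportional to P(B | A) * P(C = c_obs | B).
            w = {bv: p_b_given_a[a][bv] * p_c_given_b[bv][c_obs] for bv in (0, 1)}
            b = 1 if rng.random() < w[1] / (w[0] + w[1]) else 0
            if t >= burn_in:
                kept += 1
                count_a1 += a
        return count_a1 / kept

    print(gibbs_posterior_a())   # Monte Carlo estimate of P(A = 1 | C = 1)

On a model this small you could get the exact answer with variable elimination; the sketch only shows the resample-one-variable-at-a-time pattern that scales to models where exact inference is intractable.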

PART 3 : DECISION MAKING UNDER UNCERTAINTY


How can we use probabilistic models to allow an agent to optimize its
decisions based on data collected through experimentation or interaction
with its environment?


PART 4 : REINFORCEMENT LEARNING


A general framework for decision making where
agents learn how to act from their environment without
any prior knowledge of how the world works or the
value of possible outcomes.


PART 4 : REINFORCEMENT LEARNING


CLASSIC RL:
Classic Reinforcement Learning (RL) (the Bellman equation, Value/Policy
Iteration, TD methods, Q-learning, SARSA, policy gradients, actor-critic
methods)
DEEP RL:

Basics of Neural Networks (training, back-propagation, gradient
descent, regularization methods)


Deep Learning (training methods, relevant architectures for
Reinforcement Learning, fully connected feed forward networks)
Function approximation for RL (classic methods, Deep Learning)
Deep RL : Deep Q-Networks (DQN), A3C, A2C, etc.
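
To ground the Classic RL list above, here is a minimal, self-contained sketch (not from the course materials) of tabular Q-learning with epsilon-greedy exploration on a made-up five-state corridor; the environment, reward, and hyperparameters are illustrative assumptions, not anything the course specifies.

    import random

    # Hypothetical 1-D corridor: states 0..4, start in state 0, reward +1 for
    # reaching state 4, 0 otherwise. Actions: 0 = left, 1 = right.
    N_STATES, GOAL = 5, 4
    ALPHA, GAMMA, EPSILON = 0.1, 0.95, 0.1   # made-up hyperparameters
    rng = random.Random(0)

    def step(s, a):
        """One environment transition; returns (next_state, reward, done)."""
        s2 = max(0, s - 1) if a == 0 else min(N_STATES - 1, s + 1)
        return s2, (1.0 if s2 == GOAL else 0.0), s2 == GOAL

    Q = [[0.0, 0.0] for _ in range(N_STATES)]   # tabular action-value estimates

    def choose_action(s):
        """Epsilon-greedy action selection with random tie-breaking."""
        if rng.random() < EPSILON or Q[s][0] == Q[s][1]:
            return rng.randrange(2)
        return 0 if Q[s][0] > Q[s][1] else 1

    for episode in range(500):
        s, done = 0, False
        while not done:
            a = choose_action(s)
            s2, r, done = step(s, a)
            # Q-learning update: move Q(s, a) toward r + gamma * max_a' Q(s', a').
            Q[s][a] += ALPHA * (r + GAMMA * max(Q[s2]) - Q[s][a])
            s = s2

    print([0 if Q[s][0] > Q[s][1] else 1 for s in range(N_STATES)])   # learned greedy policy

Deep RL methods such as DQN keep this same update but replace the Q table with a neural network function approximator.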


MATH ALL THE THINGS!


Bayes rule : $P(A \mid B) = \dfrac{P(B \mid A)\,P(A)}{P(B)}$
Marginal inference : $P(X = x) = \sum_{y} P(X = x, Y = y)$
MAP inference : $\hat{x} = \arg\max_{x} P(x \mid e)$
Chain rule of probability : $P(X_1, \dots, X_n) = \prod_{i=1}^{n} P(X_i \mid X_1, \dots, X_{i-1})$
Expectation : $\mathbb{E}[X] = \sum_{x} x\,P(X = x)$
Bellman equations : $V^{\pi}(s) = \sum_{a} \pi(a \mid s) \sum_{s'} P(s' \mid s, a)\,\big[ R(s, a, s') + \gamma\, V^{\pi}(s') \big]$
Policy gradient : $\nabla_{\theta} J(\theta) = \mathbb{E}_{\pi_{\theta}}\big[ \nabla_{\theta} \log \pi_{\theta}(a \mid s)\, Q^{\pi_{\theta}}(s, a) \big]$
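
As a hedged illustration of the Bellman equation above, with numbers invented for this example only: suppose a fixed policy takes a single action $a$ in state $s$, which leads to $s_1$ with probability $0.8$ and reward $1$, or to $s_2$ with probability $0.2$ and reward $0$, with $\gamma = 0.9$ and current estimates $V^{\pi}(s_1) = 2$, $V^{\pi}(s_2) = 0$. Then

$$V^{\pi}(s) = 0.8\,(1 + 0.9 \cdot 2) + 0.2\,(0 + 0.9 \cdot 0) = 0.8 \cdot 2.8 = 2.24.$$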

LEARNING OBJECTIVES
By the end of this course I hope you will be able to:
1. Explain and apply the basic methods of Bayesian modelling, including
inference, to a given problem of moderate complexity.
2. Explain and apply the basic methods of Bayesian Optimization to
specific problems.
3. Explain, design, and implement Reinforcement Learning algorithms for
given problem descriptions.


TEACHING STYLE
I will try to enable your learning.
I will not transmit truth at you where your job is to absorb it.
You need to work at it; you need to find your own way to understand the
concepts and how they are useful to you.
This course provides the space for you to spend time on these topics:
but you should go beyond the material I present
there are lots of materials online about these topics, devour them


COURSE REFERENCES
Course Notes : on the web, updated throughout the term
[Ermon2019] - First half of the notes are based on Stanford CS 228
(https://ermongroup.github.io/cs228-notes/), which goes into even more
detail on PGMs.
Other course references (specific sections will be listed as needed):
[Cam Davidson 2018] - Bayesian Methods for Hackers - Probabilistic
Programming textbook as a set of Python notebooks.
https://camdavidsonpilon.github.io/Probabilistic-Programming-and-Bayesian-Methods-for-Hackers/#contents
[Dimitrakakis2019] - Decision Making Under Uncertainty and
Reinforcement Learning.
http://www.cse.chalmers.se/~chrdimi/downloads/book.pdf
[Ghavamzadeh2016] - Bayesian Reinforcement Learning: A Survey.
Ghavamzadeh et al. 2016.
[Sutton and Barto 2018] - Reinforcement Learning: An Introduction.
http://incompleteideas.net/book/the-book-2nd.html


EVALUATION


You will be evaluated only on the content of:


what is presented in class,
the course notes and specific sections of free online texts I point you
to if course notes are not available,
and what you do in assignments.
If you do the work and answer everything correctly you will get the
grades.
If you go beyond these in your assignments it will be impressive and can
positively affect your evaluations.
Read the course guidelines about academic honesty; we will be vigilant
about cheating. If an assignment is individual, the work you hand in must
be entirely your own.

GRADE BREAKDOWN
PROBABILISTIC MODELLING
Assignment 1: 7.5% - Probability Theory, Bayesian estimation - writing a
document answering questions, some calculations - alone - probably a
Python notebook or MATLAB
Assignment 2: 7.5% - Graphical Models and Probabilistic Programming -
alone - probably a Python notebook or MATLAB
Midterm: 30% (June 21?)
REINFORCEMENT LEARNING
Assignment 3: 15% - larger implementations on standard RL problems,
in a team of ~2. Might set up a Kaggle in-class competition.


FINAL EXAM:
40% - all topics, leaning towards later topics


WRAP-UP
Tutorial : Wednesday 5:30pm (same room) - Sriram will do a Python and
Jupyter notebook introduction
Next Class : ...Friday... probability review
