
Deep Learning and Robotics

Pieter Abbeel
UC Berkeley: AI research and teaching (professor)
covariant.ai: AI for robotic automation (founder)
gradescope.com: AI for grading homework and exams (founder)
OpenAI: AI research (advisor)
Preferred Networks: AI commercialization across many industries (advisor)
Also on advisory board of many other AI and robotics companies
A bit about myself
• 2008: Stanford PhD (advisor: Andrew Ng)
• 2008 – now: professor at Berkeley
  • Research: Director Robot Learning Lab
    • 26 PhD students, 2 post-docs, 35 undergraduate researchers
  • Teaching
    • CS188 Intro to Artificial Intelligence
    • CS287 Advanced Robotics
  • Various
    • Robot lab tours for kids (-> STEM)
    • Deep Learning tutorials / bootcamp(s)
    • Exec / C-level lectures on recent trends and advances in AI (2x / month)
    • Verizon TV commercial
• 2014 – now: founder Gradescope
  • AI for grading homework, projects, exams
  • Used at over 500 schools
• 2016 – 2017: research scientist at OpenAI
• 2017 – now: founder covariant.ai
  • AI for robotic automation of manufacturing / warehousing / e-commerce / logistics
• 2017 – now: advisor Preferred Networks, OpenAI, OffWorld, Dishcraft, TensorFlight, Traptic, onai, inzone.ai, livongo, …

Pieter Abbeel -- UC Berkeley | Gradescope | Covariant.AI


Recent Headlines

[Macron (France), March 2018]


Some Very High Valuations


High Valuations in China



High Valuations in Japan



Life of Professor in AI
Before 2015 After 2015

MARS (Bezos) NeurIPS 2017: 8000 attendees



NIPS Conference, September 4, 2018

NIPS sold out faster than most concerts


10/25/2018: Sold at Christie's for $432,500

Hype?
• Yes
• But also fundamental advances being made


Number of arxiv papers submitted in AI categories

[source: https://medium.com/@karpathy/a-peek-at-trends-in-machine-learning-ab8a1085a106]


Outline
• Deep learning successes
  • Supervised learning
  • Reinforcement learning
  • Unsupervised learning
• Example applications
• Parting thoughts


Why is AI hard to build? E.g. vision?

“Coffee Mug”

Pixel Intensity

Pixel intensity is a very poor representation.

[slide from Adam Coates]


Object Detection in Computer Vision
• State-of-the-art object detection until 2012:
  Input Image -> Hand-engineered features (SIFT, HOG, DAISY, …) -> Support Vector Machine (SVM) -> “cat” / “dog” / “car” / …

• Deep Supervised Learning (Krizhevsky, Sutskever, Hinton 2012; also LeCun, Bengio, Ng, Darrell, …):
  Input Image -> 8-layer neural network with 60 million parameters to learn -> “cat” / “dog” / “car”

• ~1.2 million training images from ImageNet [Deng, Dong, Socher, Li, Li, Fei-Fei, 2009]
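
As a rough check on the "60 million parameters" figure, the count can be reproduced from layer shapes. The shapes below are taken from the Krizhevsky, Sutskever, Hinton (2012) paper rather than from this slide, and the helper names are made up for illustration, so treat this as a sketch under those assumptions.

```python
# Layer shapes follow the AlexNet paper (including its two-GPU grouped
# convolutions); they are an assumption here, the slide only gives the total.

def conv_params(out_channels, in_channels_per_filter, kernel):
    """Weights plus one bias per output channel."""
    return out_channels * in_channels_per_filter * kernel * kernel + out_channels

def fc_params(n_in, n_out):
    """Weights plus one bias per output unit."""
    return n_in * n_out + n_out

layers = {
    "conv1": conv_params(96, 3, 11),
    "conv2": conv_params(256, 48, 5),    # grouped: each filter sees 48 of 96 channels
    "conv3": conv_params(384, 256, 3),
    "conv4": conv_params(384, 192, 3),   # grouped
    "conv5": conv_params(256, 192, 3),   # grouped
    "fc6": fc_params(6 * 6 * 256, 4096),
    "fc7": fc_params(4096, 4096),
    "fc8": fc_params(4096, 1000),        # 1000 ImageNet classes
}
total_params = sum(layers.values())
fc_share = (layers["fc6"] + layers["fc7"] + layers["fc8"]) / total_params
print(f"{total_params:,} parameters, {fc_share:.0%} in fully connected layers")
```

Under these assumed shapes the eight learned layers sum to roughly 61 million parameters, and the large majority sit in the three fully connected layers.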



Many Layer Neural Network

cat
car
dog
nothing

different weights -> different computation


Neural Net Training: Find the weights that minimize the difference between labels and activation.
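
A minimal sketch of what "find the weights that minimize the difference between labels and activation" can look like in code. It uses a single sigmoid unit, cross-entropy as the measure of difference, and plain gradient descent; all three are illustrative choices, not something the slide specifies, and the toy data is made up.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def activation(weights, bias, x):
    """Single-unit 'network': weighted sum of the inputs passed through a sigmoid."""
    return sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)

def loss(weights, bias, data):
    """Mean cross-entropy between labels and activations (lower = closer match)."""
    total = 0.0
    for x, y in data:
        a = activation(weights, bias, x)
        total -= y * math.log(a) + (1 - y) * math.log(1 - a)
    return total / len(data)

def train(data, steps=2000, lr=1.0):
    """Gradient descent: repeatedly nudge the weights to reduce the loss."""
    weights, bias = [0.0, 0.0], 0.0
    for _ in range(steps):
        grad_w, grad_b = [0.0, 0.0], 0.0
        for x, y in data:
            err = activation(weights, bias, x) - y  # loss gradient w.r.t. the pre-activation
            grad_w = [g + err * xi for g, xi in zip(grad_w, x)]
            grad_b += err
        weights = [w - lr * g / len(data) for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / len(data)
    return weights, bias

# Made-up labeled data: the label is 1 only when both inputs are 1.
data = [((0.0, 0.0), 0), ((0.0, 1.0), 0), ((1.0, 0.0), 0), ((1.0, 1.0), 1)]
initial_loss = loss([0.0, 0.0], 0.0, data)
weights, bias = train(data)
final_loss = loss(weights, bias, data)
predictions = [round(activation(weights, bias, x)) for x, _ in data]
print(initial_loss, final_loss, predictions)
```

A deep network differs in having many layers and millions of weights, with gradients computed by backpropagation, but the loop has the same shape: compute activations, compare with labels, adjust weights.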
Neural Net Learning Image Recognition
• Training / Learning:
  • Cars:
  • Cats:
  • Dogs:
  labeled data -> Machine Learning

• Test time: [image] -> [neural net] -> Label = Cat? Dog? Car?

Performance

AlexNet

[graph credit: Matt Zeiler, Clarifai]

MS COCO Image Captioning Challenge

[Karpathy & Fei-Fei, 2015; Donahue et al., 2015; Xu et al., 2015; many more]

Visual QA Challenge

[Stanislaw Antol, Aishwarya Agrawal, Jiasen Lu, Margaret Mitchell, Dhruv Batra, C. Lawrence Zitnick, Devi Parikh]


Change in Programming Paradigm!
• Traditional Programming: program by writing lines of code -> poor performance
• Deep Learning (“Software 2.0”): program by providing data -> success!


Speech Recognition

graph credit Matt Zeiler, Clarifai


Machine Translation
Google Neural Machine Translation (in production)



History

[timeline figure, based on history by K. Cho; recoverable labels: Rosenblatt’s Perceptron; (Olshausen, 1996); 2000s Sparse, Probabilistic, and Energy models (Hinton, Bengio, LeCun, Ng)]

Is deep learning 3, 30, or 60 years old?


Why Now?
• Data
• Compute
• AI innovation


Outline
• Deep learning successes
  • Supervised learning = pattern recognition: with enough data (input -> output pairs), a neural net can learn the pattern
  • Reinforcement learning
  • Unsupervised learning
• Example applications
• Parting thoughts


Deep Supervised Learning – How to get started?
• fast.ai
• deeplearning.ai (Andrew Ng)
• cs231n.stanford.edu (Andrej Karpathy et al)
• fullstackdeeplearning.com (Pieter Abbeel [Berkeley, Covariant], Sergey Karayev [Gradescope], Josh Tobin [OpenAI], Andrej Karpathy [Tesla], Yangqing Jia [Facebook], Jairam Ranganathan [Uber], Lukas Biewald [W&B])


Outline
• Deep learning successes
  • Supervised learning = pattern recognition: with enough data (input -> output pairs), a neural net can learn the pattern
  • Reinforcement learning / AI with goals
  • Unsupervised learning
• Example applications
• Parting thoughts


PR-1

[Wyrobek, Berger, van der Loos, Salisbury, ICRA 2008]

AI with Objectives/Goals
[diagram: perceive / act loop]
• Robotics
• Marketing / Advertising
• Dialogue
• Optimizing operations / logistics
• Queue management
• …


From Pixels to Actions?

Pong Enduro Beamrider Q*bert



Deep Q-Network (DQN): From Pixels to Joystick Commands

32 8x8 filters with stride 4 + ReLU


64 4x4 filters with stride 2 + ReLU
64 3x3 filters with stride 1 + ReLU
fully connected 512 units + ReLU
fully connected output units, one per action
[Source: Mnih et al., Nature 2015 (DeepMind)]
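
To make the layer list concrete, here is a small sketch that traces feature-map sizes and counts weights for the architecture listed above. The 84x84 input with 4 stacked frames comes from the Mnih et al. paper, not from this slide, and the convolutions are assumed to be unpadded; `n_actions=6` is an arbitrary example value since the number of actions varies per game.

```python
def conv_out(size, kernel, stride):
    """Output width of an unpadded ('valid') convolution."""
    return (size - kernel) // stride + 1

def dqn_shapes(n_actions, in_size=84, in_channels=4):
    # Assumed input: 84x84 images, 4 stacked frames (from the paper, not the slide).
    conv_specs = [(32, 8, 4), (64, 4, 2), (64, 3, 1)]  # (filters, kernel, stride)
    shapes, params = [], 0
    size, channels = in_size, in_channels
    for filters, kernel, stride in conv_specs:
        params += filters * channels * kernel * kernel + filters  # weights + biases
        size, channels = conv_out(size, kernel, stride), filters
        shapes.append((size, size, channels))
    flat = size * size * channels
    params += flat * 512 + 512              # fully connected, 512 units
    params += 512 * n_actions + n_actions   # one output (Q-value) per action
    return shapes, flat, params

shapes, flat, params = dqn_shapes(n_actions=6)
print(shapes, flat, params)
```

Under these assumptions the feature maps shrink from 84x84 to 20x20, 9x9 and 7x7, and most of the weights sit in the first fully connected layer.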



Deep RL Success Stories

DQN: Mnih et al, NIPS 2013 / Nature 2015

MCTS: Guo et al, NIPS 2014; TRPO: Schulman, Levine, Moritz, Jordan, Abbeel, ICML 2015; A3C: Mnih et al, ICML 2016; Dueling DQN: Wang et al, ICML 2016; Double DQN: van Hasselt et al, AAAI 2016; Prioritized Experience Replay: Schaul et al, ICLR 2016; Bootstrapped DQN: Osband et al, 2016; Q-Ensembles: Chen et al, 2017; Rainbow: Hessel et al, 2017; Accelerated: Stooke and Abbeel, 2018; …

Deep RL Success Stories

AlphaGo: Silver et al, Nature 2016
AlphaGoZero: Silver et al, Nature 2017
AlphaZero: Silver et al, 2017
Tian et al, 2016; Maddison et al, 2014; Clark et al, 2015

OpenAI’s 1v1 Dota [2017] and 5v5 [2018]
• Super-human agent on a competitive game, enabled by
  • Reinforcement learning
  • Self-play
  • Enough computation
• Cooperation emerges


Learning Locomotion

[Schulman, Moritz, Levine, Jordan, Abbeel, ICLR 2016]

Deep RL: Learn to Pass/Protect

[Bansal et al, 2017]

Deep RL: Learn Soccer

[Bansal et al, 2017]

Deep RL: Virtual Stuntman

[Peng, Abbeel, Levine, van de Panne, 2018]

Deep RL: Dynamic Animation for Motion Picture

[Peng, Abbeel, Levine, van de Panne, 2018]

BRETT: Berkeley Robot for the Elimination of Tedious Tasks

[Levine*, Finn*, Darrell, Abbeel, JMLR 2016]

Unsupervised Learning for Interaction?

[Levine et al, 2016]

Application: Google Datacenter Cooling

40% reduction in cooling cost

https://deepmind.com/blog/deepmind-ai-reduces-google-data-centre-cooling-bill-40/

Deep Reinforcement Learning -- NASA SUPERball

[Geng*, Zhang*, Bruce*, Caluwaerts, Vespignani, Sunspiral, Abbeel, Levine, ICRA 2017]

How About a Hand?

[OpenAI Robotics Team]

Speed Up Deep RL through Imitation

Human Demonstrations

[Zhang, McCarthy, Jow, Lee, Chen, Goldberg, Abbeel, ICRA 2018]

Outline
• Deep learning successes
  • Supervised learning = pattern recognition: with enough data (input -> output pairs), a neural net can learn the pattern
  • Reinforcement learning = learning goal-oriented behaviors from trial and error
  • Unsupervised learning
• Example applications
• Parting thoughts


Want to use Deep RL yourself?
• OpenAI Gym: https://gym.openai.com
• OpenAI baselines: https://github.com/openai/baselines
• rllab: https://github.com/rll/rllab
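
To give a feel for the trial-and-error loop these toolkits are built around, here is a self-contained sketch of tabular Q-learning. It does not use Gym, baselines or rllab: `ChainEnv` is a hypothetical toy environment whose `reset` / `step` methods only loosely mimic the usual environment interface (simplified return values), and the hyperparameters are arbitrary.

```python
import random

class ChainEnv:
    """Hypothetical 5-state corridor: start at state 0, reward 1.0 on reaching state 4."""
    n_states = 5
    n_actions = 2  # 0 = step left, 1 = step right

    def reset(self):
        self.state = 0
        return self.state

    def step(self, action):
        move = 1 if action == 1 else -1
        self.state = min(max(self.state + move, 0), self.n_states - 1)
        done = self.state == self.n_states - 1
        reward = 1.0 if done else 0.0
        return self.state, reward, done

def q_learning(env, episodes=500, max_steps=100, alpha=0.5, gamma=0.9, epsilon=0.2, seed=0):
    rng = random.Random(seed)
    q = [[0.0] * env.n_actions for _ in range(env.n_states)]
    for _ in range(episodes):
        state = env.reset()
        for _ in range(max_steps):
            if rng.random() < epsilon:
                action = rng.randrange(env.n_actions)  # explore: try something
            else:
                best = max(q[state])                   # exploit, breaking ties at random
                action = rng.choice([a for a, v in enumerate(q[state]) if v == best])
            next_state, reward, done = env.step(action)
            # Move the estimate toward reward + discounted best future value.
            target = reward + (0.0 if done else gamma * max(q[next_state]))
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
            if done:
                break
    return q

def greedy_policy(q):
    """Best action per non-terminal state according to the learned values."""
    return [max(range(len(row)), key=lambda a: row[a]) for row in q[:-1]]

q = q_learning(ChainEnv())
policy = greedy_policy(q)
print(policy)
```

Deep RL methods such as DQN replace the table `q` with a neural network so that the same idea scales to inputs like raw pixels.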



Want to build Deep RL yourself?
• Deep RL Bootcamp [intense long weekend]
  • 12 lectures by: Pieter Abbeel, Rocky Duan, Peter Chen, John Schulman, Vlad Mnih, Chelsea Finn, Sergey Levine
  • 4 hands-on labs
  • https://sites.google.com/view/deep-rl-bootcamp/
• Deep RL Course [full semester]
  • Originated by John Schulman, Sergey Levine, Chelsea Finn
  • Latest offering (by Sergey Levine): http://rll.berkeley.edu/deeprlcourse/

Outline
• Deep learning successes
  • Supervised learning = pattern recognition: with enough data (input -> output pairs), a neural net can learn the pattern
  • Reinforcement learning = learning goal-oriented behaviors from trial and error
  • Unsupervised learning
• Example applications
• Parting thoughts


Generative Models
• “What I cannot create, I do not understand.” -- Richard Feynman
• Ability to generate data that look real entails some form of understanding

[Radford, Metz & Chintala, ICLR 2016]

Initial “Images”

[Salimans, Goodfellow, Zaremba, Cheung, Radford & Chen, NIPS 2016]

Learning to Generate Images

[Salimans, Goodfellow, Zaremba, Cheung, Radford & Chen, NIPS 2016]

Progressive GAN: HighRes Images

[Karras, Aila, Laine & Lehtinen, 2017]

Unsupervised Image to Image

[CycleGAN: Zhu, Park, Isola & Efros, 2017]

Everybody Dance Now

[Everybody Dance Now: Caroline Chan, Shiry Ginosar, Tinghui Zhou, Alyosha Efros (UC Berkeley) 2018]

Neural “Photoshop”

[Brock, Lim, Ritchie & Weston, 2016]

Neural Compression 2

JPEG | JPEG2000 | WaveOne

[Rippel & Bourdev, 2017]

Beyond Images: Amazon Review

“This product does what it is supposed to. I always keep three of these in my kitchen just in case ever I need a replacement cord.”

“Great little item. Hard to put on the crib without some kind of embellishment. My guess is just like the screw kind of attachment I had.”

[Radford et al, 2017]


Outline
• Deep learning successes
  • Supervised learning = pattern recognition: with enough data (input -> output pairs), a neural net can learn the pattern
  • Reinforcement learning = learning goal-oriented behaviors from trial and error
  • Unsupervised learning = learning structure of the world w/o explicit supervision
• Example applications
• Parting thoughts


Deep Supervised Learning Really Works!
• What does this give us?
  • Automated prediction

Importance of Automated Prediction?


Yelp – Select Best Photos for Each Venue

BEFORE:

AFTER:



Yahoo: emoji prediction
• Emoji use is highly dynamic / context specific
• Ideally, the desired emoji is within the top-5 predictions
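
The "within top-5" criterion corresponds to top-k accuracy. Below is a minimal sketch of how it could be measured; the function name, the scores and the target indices are all made up for illustration and say nothing about how Yahoo evaluates its system.

```python
def top_k_accuracy(scores, targets, k=5):
    """Fraction of examples whose true label is among the k highest-scoring labels."""
    hits = 0
    for row, target in zip(scores, targets):
        ranked = sorted(range(len(row)), key=lambda i: row[i], reverse=True)
        hits += target in ranked[:k]
    return hits / len(targets)

# Made-up model scores over 6 candidate emoji for 3 messages.
scores = [
    [0.05, 0.40, 0.12, 0.20, 0.15, 0.08],
    [0.30, 0.06, 0.04, 0.25, 0.20, 0.15],
    [0.02, 0.03, 0.05, 0.10, 0.30, 0.50],
]
targets = [1, 1, 0]  # index of the emoji the user actually chose (hypothetical)

top1 = top_k_accuracy(scores, targets, k=1)
top5 = top_k_accuracy(scores, targets, k=5)
print(top1, top5)
```

On this toy data the top-5 score is higher than the top-1 score, which is why a suggestion list of several emoji is a more forgiving target than a single guess.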



Marketing
• Eric Stahl, SVP at Salesforce Marketing Cloud
• With Salesforce Einstein, marketers can gauge how likely it is a customer will engage with an email, unsubscribe from an email list, or make a web purchase, and determine what is driving true engagement to better anticipate the needs of every customer.


Customer Support
• With fast-growing sales, it is often difficult to keep up with customer support hiring, and especially training
• AI can help managers detect which support conversations are going the wrong way (e.g. negative sentiment) so they can intervene and train on the job


https://www.fda.gov/NewsEvents/Newsroom/PressAnnouncements/ucm604357.htm

Self-Driving Cars

Self-Driving Cars -- Stats


Energy-Inference-Accuracy Landscape on the Squeezelator

[figure: ImageNet energy-accuracy for different NNs; labeled points include MobileNet v1]

SqueezeNext vs SqueezeNet/AlexNet
• 8% more accurate
• 2.25x better than SqueezeNet
• 7.5x better than AlexNet

[slide credit: Kurt Keutzer]


Existing Robotic Automation
• Robots perform programmed simple motions


Two New Waves of Automation
• Wave 1: Robots with “eyes”
  • Starting to happen now
• Wave 2: Teachable Robots (“get help anywhere, anytime”)
  • Anticipated 5 years from now
  • (not prime-time ready, but already starting to happen in research labs…)


covariant.ai



Outline
• Deep learning successes
  • Supervised learning = pattern recognition: with enough data (input -> output pairs), a neural net can learn the pattern
  • Reinforcement learning = learning goal-oriented behaviors from trial and error
  • Unsupervised learning = learning structure of the world w/o explicit supervision
• Example applications
• Parting thoughts


Why Now? And is an AI winter coming?
• Data
• Compute
• AI expertise


A (Short) History of AI
• 1940-1950: Early days
  • 1943: McCulloch & Pitts: Boolean circuit model of brain
  • 1950: Turing's “Computing Machinery and Intelligence”
• 1950-70: Excitement: Look, Ma, no hands!
  • 1950s: Early AI programs, including Samuel's checkers program, Newell & Simon's Logic Theorist, Gelernter's Geometry Engine
  • 1956: Dartmouth meeting: “Artificial Intelligence” adopted
  • 1965: Robinson's complete algorithm for logical reasoning
• 1970-90: Knowledge-based approaches
  • 1969-79: Early development of knowledge-based systems
  • 1980-88: Expert systems industry booms
  • 1988-93: Expert systems industry busts: “AI Winter”
• 1990-2012: Statistical approaches + subfield expertise
  • Resurgence of probability, focus on uncertainty
  • General increase in technical depth
  • Agents and learning systems… “AI Spring”?
• 2012-___: Excitement: Look, Ma, no hands again?
  • Big data, big compute, neural networks
  • Some re-unification of sub-fields
  • AI used in many industries

Data

[Source: domo.com]


Compute: Moore’s Law

~2x every 3 years

[20-April-2018]

Compute: Neural Net Chip Development

100-1000x?

[20-April-2018]

Sidenote: training vs. inference

Compute: Ability to Compute over Many Machines

[Source: OpenAI]


Also, from an industry perspective
• Companies making (lots of) money from AI this time around…


Now let’s take a look at scale
Architecture    Num neurons      Num synapses
Fly             100K = 10^5      10M = 10^7
AlexNet         650K ≈ 10^6      60M ≈ 10^8
Mouse           100M = 10^8      100B = 10^11
Human           100B = 10^11     10^14 - 10^15

If each synapse is 1 FLOP (i.e., can fire / not fire once per second),
then the human brain requires 10^15 flops = 1 petaflop.
• 10,000 current CPUs, costs $1000 / hr on Amazon’s EC2
• Or 10 latest GPUs (Nvidia V100), costs $30 / hr on EC2 (2017) -- $25 / hr on GCP (2018)
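
The arithmetic behind this estimate can be spelled out as follows. The sketch only rearranges the slide's own figures (10^15 synapses, 1 FLOP per synapse per second, 10,000 CPUs at $1000 / hr, 10 GPUs at $30 / hr); the per-device numbers it derives are implied by those figures, not measured hardware specifications, and the variable names are made up.

```python
synapses = 10**15             # upper end of the slide's estimate for a human brain
flops_needed = synapses * 1   # slide's assumption: 1 FLOP per synapse per second

cpu = {"count": 10_000, "usd_per_hour": 1000}
gpu = {"count": 10, "usd_per_hour": 30}

def implied(option):
    """Per-device throughput and price implied by the slide's totals."""
    return {
        "flops_per_device": flops_needed / option["count"],
        "usd_per_device_hour": option["usd_per_hour"] / option["count"],
    }

cpu_implied = implied(cpu)  # 1e11 FLOPS (100 gigaflops) and $0.10 per CPU-hour
gpu_implied = implied(gpu)  # 1e14 FLOPS (100 teraflops) and $3.00 per GPU-hour
cost_ratio = cpu["usd_per_hour"] / gpu["usd_per_hour"]
print(cpu_implied, gpu_implied, round(cost_ratio, 1))
```

Read this way, the slide's figures put the GPU option at roughly one thirtieth of the hourly cost of the CPU option for the same nominal petaflop.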
AI and the World
Living with Superintelligence(s)
• Value Alignment
  • “We had better be quite sure that the purpose put into the machine is the purpose which we really desire” -- Norbert Wiener, 1960
  • King Midas, circa 540 BCE
  (from: humancompatible.ai at Berkeley)
• Bionic AI brain add-ons
  • High bandwidth brain-machine interfaces
  (Glenn Northcutt, Understanding Vertebrate Brain Evolution (2002))


Want to stay updated on AI? (1/2)

Want to stay updated on AI? (2/2)

• https://jack-clark.net/import-ai/

Opportunities for Further Engagement
• Berkeley Robot Learning Lab + Berkeley AI Research Lab:
  • Outreach -- rll_outreach@lists.berkeley.edu
  • Industrial affiliates program -- pabbeel@cs.berkeley.edu
• Covariant.AI:
  • AI solutions for your factory / warehouse / etc. -- pabbeel@covariant.ai
• Advising / Consulting / Training:
  • AI input for your organization -- pabbeel@cs.berkeley.edu
• AI to simplify grading homework/exams:
  • Get a (free) account at www.gradescope.com
