2018 12 Abbeel - AI PDF

Deep Learning and Robotics
Pieter Abbeel
UC Berkeley AI research and teaching (professor)
covariant.ai AI for robotic automation (founder)
gradescope.com AI for grading homework and exams (founder)
OpenAI AI research (advisor)
Preferred Networks AI commercialization across many industries (advisor)
Also on advisory board of many other AI and robotics companies
A bit about myself
- 2008: Stanford PhD (advisor: Andrew Ng)
- 2008 – now: professor at Berkeley
  - Research: Director Robot Learning Lab
    - 26 PhD students, 2 post-docs, 35 undergraduate researchers
  - Teaching
    - CS188 Intro to Artificial Intelligence
    - CS287 Advanced Robotics
  - Various
    - Robot lab tours for kids (-> STEM)
    - Deep Learning tutorials / bootcamp(s)
    - Exec / C-level lectures on recent trends and advances in AI (2x / month)
    - Verizon TV commercial
- 2014 – now: founder Gradescope
  - AI for grading homework, projects, exams
  - Used at over 500 schools
- 2016 – 2017: research scientist at OpenAI
- 2017 – now: founder covariant.ai
  - AI for robotic automation of manufacturing / warehousing / e-commerce / logistics
- 2017 – now: advisor Preferred Networks, OpenAI, OffWorld, Dishcraft, TensorFlight, Traptic, onai, inzone.ai, livongo, …
Pieter Abbeel -- UC Berkeley | Gradescope | Covariant.AI

Recent Headlines

[Macron (France), March 2018]

Some Very High Valuations

High Valuations in China

High Valuations in Japan

Life of Professor in AI
Before 2015 | After 2015
MARS (Bezos)
NeurIPS 2017: 8000 attendees

NIPS Conference September 4, 2018
NIPS sold out faster than most concerts

10/25/2018 Sold at Christie's for $432,500
Hype?
- Yes
- But also fundamental advances being made

Number of arxiv papers submitted in AI categories
[source: https://medium.com/@karpathy/a-peek-at-trends-in-machine-learning-ab8a1085a106]

Outline
- Deep learning successes
  - Supervised learning
  - Reinforcement learning
  - Unsupervised learning
- Example applications
- Parting thoughts

Why is AI hard to build? E.g. vision?
“Coffee Mug”
Pixel Intensity
Pixel intensity is a very poor representation.
[slide from Adam Coates]
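A small illustration of why raw pixel intensity is a poor representation. This is only a sketch: the 1x8 "images" and their names are made up for the example and do not come from the slide. It compares images by Euclidean distance in pixel space.

```python
import math

def pixel_distance(a, b):
    """Euclidean distance between two images given as flat lists of pixel intensities."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

# Hypothetical 1x8 "images": a single bright pixel on a dark background.
mug = [0, 0, 1, 0, 0, 0, 0, 0]
mug_moved = [0, 0, 0, 1, 0, 0, 0, 0]  # same object, shifted by one pixel
mug_far = [0, 0, 0, 0, 0, 0, 0, 1]    # same object, shifted across the image
blank = [0, 0, 0, 0, 0, 0, 0, 0]      # no object at all

d_moved = pixel_distance(mug, mug_moved)
d_far = pixel_distance(mug, mug_far)
d_blank = pixel_distance(mug, blank)

print(d_moved, d_far, d_blank)
```

In pixel space a one-pixel shift looks exactly as different as a shift across the whole image, and the empty image is closer to the original than either shifted copy. A useful representation should instead place the shifted copies close together.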

Object Detection in Computer Vision
- State-of-the-art object detection until 2012:
  Input Image -> Hand-engineered features (SIFT, HOG, DAISY, …) -> Support Vector Machine (SVM) -> “cat” / “dog” / “car” / …
- Deep Supervised Learning (Krizhevsky, Sutskever, Hinton 2012; also LeCun, Bengio, Ng, Darrell, …):
  Input Image -> 8-layer neural network with 60 million parameters to learn -> “cat” / “dog” / “car” / …
- ~1.2 million training images from ImageNet [Deng, Dong, Socher, Li, Li, Fei-Fei, 2009]

Many Layer Neural Network
cat
car
dog
nothing
different weights -> different computation

Neural Net Training: Find the weights that minimize the difference between labels and activation.
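A minimal sketch of what "find the weights that minimize the difference between labels and activation" can look like in code. It assumes a single linear neuron, a mean squared error loss, and plain gradient descent on made-up toy data; none of these specifics are from the slide, and real networks have many layers and millions of weights.

```python
# Toy labeled data (hypothetical): labels follow y = 2x + 1.
inputs = [0.0, 1.0, 2.0, 3.0]
labels = [1.0, 3.0, 5.0, 7.0]

def activation(w, b, x):
    """Output of a single linear neuron."""
    return w * x + b

def loss(w, b):
    """Mean squared difference between labels and activations."""
    n = len(inputs)
    return sum((activation(w, b, x) - y) ** 2 for x, y in zip(inputs, labels)) / n

def train(steps=2000, learning_rate=0.05):
    """Gradient descent: repeatedly nudge the weights to reduce the loss."""
    w, b = 0.0, 0.0
    n = len(inputs)
    for _ in range(steps):
        errors = [activation(w, b, x) - y for x, y in zip(inputs, labels)]
        grad_w = 2.0 / n * sum(e * x for e, x in zip(errors, inputs))
        grad_b = 2.0 / n * sum(errors)
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b

w, b = train()
print(w, b, loss(w, b))
```

Starting from zero weights, the loop recovers weights close to the ones that generated the labels. The same loop, scaled up and with gradients computed by backpropagation, is the core of deep supervised learning.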
Neural Net Learning Image Recognition
- Training / Learning: labeled data (Cars, Cats, Dogs) -> Machine Learning
- Test time: image -> trained network -> Label = Cat? Dog? Car?

Performance
AlexNet
graph credit Matt Zeiler, Clarifai

MS COCO Image Captioning Challenge
Karpathy & Fei-Fei, 2015; Donahue et al., 2015; Xu et al, 2015; many more
Visual QA Challenge
Stanislaw Antol, Aishwarya Agrawal, Jiasen Lu, Margaret Mitchell, Dhruv Batra, C. Lawrence Zitnick, Devi Parikh

Change in Programming Paradigm!
Traditional Programming: program by writing lines of code -> Poor performance
Deep Learning (“Software 2.0”): program by providing data -> Success!

Speech Recognition
graph credit Matt Zeiler, Clarifai

Machine Translation
Google Neural Machine Translation (in production)

History
Rosenblatt’s Perceptron
(Olshausen, 1996)
2000s: Sparse, Probabilistic, and Energy models (Hinton, Bengio, LeCun, Ng)
Is deep learning 3, 30, or 60 years old?
based on history by K. Cho

Why Now?
- Data
- Compute
- AI innovation

Outline
- Supervised learning = pattern recognition: if there is enough data (input -> output pairs), then a neural net can learn the pattern
- Parting thoughts

Deep Supervised Learning – How to get started?
- fast.ai
- deeplearning.ai (Andrew Ng)
- cs231n.stanford.edu (Andrej Karpathy et al)
- fullstackdeeplearning.com (Pieter Abbeel [Berkeley, Covariant], Sergey Karayev [Gradescope], Josh Tobin [OpenAI], Andrej Karpathy [Tesla], Yangqing Jia [Facebook], Jairam Ranganathan [Uber], Lukas Biewald [W&B])

Outline
- Supervised learning = pattern recognition: if there is enough data (input -> output pairs), then a neural net can learn the pattern
- Reinforcement learning / AI with goals
- Parting thoughts

PR-1
[Wyrobek, Berger, van der Loos, Salisbury, ICRA 2008]

AI with Objectives/Goals
Loop: perceive -> act
- Robotics
- Marketing / Advertising
- Dialogue
- Optimizing operations / logistics
- Queue management
- …

From Pixels to Actions?
Pong Enduro Beamrider Q*bert

Deep Q-Network (DQN): From Pixels to Joystick Commands
- 32 8x8 filters with stride 4 + ReLU
- fully connected 512 units + ReLU
- fully connected output units, one per action
[Source: Mnih et al., Nature 2015 (DeepMind)]
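To make the layer sizes concrete, here is a rough sketch that traces the feature-map shapes through the convolutional stack. Only the first convolution (32 8x8 filters, stride 4) and the 512-unit layer appear in the slide text. The 84x84x4 input and the two further convolutions (64 4x4 filters with stride 2, 64 3x3 filters with stride 1) are taken from the Mnih et al. Nature 2015 paper and are labeled as such in the code.

```python
def conv_output_size(input_size, kernel, stride):
    """Spatial size after a convolution with no padding."""
    return (input_size - kernel) // stride + 1

# (filters, kernel, stride). Only the first row appears on the slide; the other
# two rows and the 84x84x4 input are taken from Mnih et al., Nature 2015.
conv_layers = [(32, 8, 4), (64, 4, 2), (64, 3, 1)]

size = 84      # input height and width in pixels
channels = 4   # four stacked grayscale frames
shapes = []
for filters, kernel, stride in conv_layers:
    size = conv_output_size(size, kernel, stride)
    channels = filters
    shapes.append((size, size, channels))

flattened = size * size * channels  # number of inputs to the 512-unit layer

print(shapes, flattened)
```

The final fully connected layer then maps the 512 hidden units to one output per joystick action, each output estimating the value of taking that action.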

Deep RL Success Stories
- DQN: Mnih et al, NIPS 2013 / Nature 2015
- MCTS: Guo et al, NIPS 2014
- TRPO: Schulman, Levine, Moritz, Jordan, Abbeel, ICML 2015
- A3C: Mnih et al, ICML 2016
- Dueling DQN: Wang et al, ICML 2016
- Double DQN: van Hasselt et al, AAAI 2016
- Prioritized Experience Replay: Schaul et al, ICLR 2016
- Bootstrapped DQN: Osband et al, 2016
- Q-Ensembles: Chen et al, 2017
- Rainbow: Hessel et al, 2017
- Accelerated: Stooke and Abbeel, 2018
- …

Deep RL Success Stories
- AlphaGo: Silver et al, Nature 2016
- AlphaGoZero: Silver et al, Nature 2017
- AlphaZero: Silver et al, 2017
- Tian et al, 2016; Maddison et al, 2014; Clark et al, 2015

OpenAI’s 1v1 Dota [2017] and 5v5 [2018]
- Super-human agent on a competitive game, enabled by
  - Self-play
  - Enough computation
- Cooperation emerges

Learning Locomotion
[Schulman, Moritz, Levine, Jordan, Abbeel, ICLR 2016]

Deep RL: Learn to Pass/Protect
[Bansal et al, 2017]

Deep RL: Learn Soccer
[Bansal et al, 2017]

Deep RL: Virtual Stuntman
[Peng, Abbeel, Levine, van de Panne, 2018]

Deep RL: Dynamic Animation for Motion Picture
[Peng, Abbeel, Levine, van de Panne, 2018]

BRETT: Berkeley Robot for the Elimination of Tedious Tasks
[Levine*, Finn*, Darrell, Abbeel, JMLR 2016]

Unsupervised Learning for Interaction?
[Levine et al, 2016]

Application: Google Datacenter Cooling
40% reduction in cooling cost
https://deepmind.com/blog/deepmind-ai-reduces-google-data-centre-cooling-bill-40/

Deep Reinforcement Learning -- NASA SUPERball
[Geng*, Zhang*, Bruce*, Caluwaerts, Vespignani, Sunspiral, Abbeel, Levine, ICRA 2017]

How About a Hand?
[OpenAI Robotics Team]

Speed Up Deep RL through Imitation

Human Demonstrations
[Zhang, McCarthy, Jow, Lee, Chen, Goldberg, Abbeel, ICRA 2018]

Outline
- Reinforcement learning = learning goal-oriented behaviors from trial and error
- Parting thoughts
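The summary above describes reinforcement learning as learning goal-oriented behavior from trial and error. A minimal sketch of that idea is tabular Q-learning on a tiny chain world. The environment, the constants, and the purely random exploration are all assumptions chosen for illustration; this is not the deep RL used in the results shown earlier, which replaces the table with a neural network.

```python
import random

N_STATES = 5       # states 0..4; reaching state 4 ends the episode
ACTIONS = (0, 1)   # 0 = step left, 1 = step right
GAMMA, ALPHA = 0.9, 0.5

def step(state, action):
    """Hypothetical chain environment: reward 1 only on reaching the last state."""
    next_state = max(0, state - 1) if action == 0 else state + 1
    reward = 1.0 if next_state == N_STATES - 1 else 0.0
    return next_state, reward

def q_learning(episodes=500, seed=0):
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        state = 0
        while state != N_STATES - 1:
            action = rng.choice(ACTIONS)  # trial: act at random
            next_state, reward = step(state, action)
            # error: compare the current estimate with the observed outcome
            target = reward + GAMMA * max(q[next_state])
            q[state][action] += ALPHA * (target - q[state][action])
            state = next_state
    return q

q = q_learning()
# Greedy policy: in each non-terminal state, pick the action with the highest value.
policy = [max(ACTIONS, key=lambda a: q[s][a]) for s in range(N_STATES - 1)]
print(policy)
```

The agent is never told that moving right is the goal. It only sees rewards, and the learned greedy policy ends up moving right in every state.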

Want to use Deep RL yourself?
- OpenAI Gym: https://gym.openai.com
- OpenAI baselines: https://github.com/openai/baselines
- rllab: https://github.com/rll/rllab

Want to build Deep RL yourself?
- Deep RL Bootcamp [intense long weekend]
  - 12 lectures by: Pieter Abbeel, Rocky Duan, Peter Chen, John Schulman, Vlad Mnih, Chelsea Finn, Sergey Levine
  - 4 hands-on labs
  - https://sites.google.com/view/deep-rl-bootcamp/
- Deep RL Course [full semester]
  - Originated by John Schulman, Sergey Levine, Chelsea Finn
  - Latest offering (by Sergey Levine): http://rll.berkeley.edu/deeprlcourse/

Outline

n Parting thoughts

Generative Models
- “What I cannot create, I do not understand.” -- Richard Feynman
- Ability to generate data that look real entails some form of understanding
[Radford, Metz & Chintala, ICLR 2016]

Initial “Images”
[Salimans, Goodfellow, Zaremba, Cheung, Radford & Chen, NIPS 2016]

Learning to Generate Images
[Salimans, Goodfellow, Zaremba, Cheung, Radford & Chen, NIPS 2016]

Progressive GAN: HighRes Images
[Karras, Aila, Laine & Lehtinen, 2017]

Unsupervised Image to Image
[CycleGAN: Zhu, Park, Isola & Efros, 2017]

Everybody Dance Now
[Everybody Dance Now: Caroline Chan, Shiry Ginosar, Tinghui Zhou, Alyosha Efros (UC Berkeley) 2018]

Neural “Photoshop”
[Brock, Lim, Ritchie & Weston, 2016]

Neural Compression 2
JPEG | JPEG2000 | WaveOne
[Rippel & Bourdev, 2017]

Beyond Images: Amazon Review
Sample 1: This product does what it is supposed to. I always keep three of these in my kitchen just in case ever I need a replacement cord.
Sample 2: Great little item. Hard to put on the crib without some kind of embellishment. My guess is just like the screw kind of attachment I had.
[Radford et al, 2017]

Outline

- Unsupervised learning = learning structure of the world w/o explicit supervision
- Parting thoughts

Deep Supervised Learning Really Works!
- What does this give us?
  - Automated prediction

Importance of Automated Prediction?

Yelp – Select Best Photos for Each Venue
BEFORE:
AFTER:

Yahoo: emoji prediction
- Emoji use is highly dynamic / context specific
- Ideally the desired one is within the top-5 predictions
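The "within top-5" goal can be checked with a simple hit-at-k test. The sketch below is illustrative only: the emoji labels and scores are invented, and nothing here reflects how Yahoo's system is actually built.

```python
def top_k(scores, k=5):
    """Return the k labels with the highest predicted score, best first."""
    return sorted(scores, key=scores.get, reverse=True)[:k]

def hit_at_k(scores, desired, k=5):
    """True if the desired label is among the k highest-scoring predictions."""
    return desired in top_k(scores, k)

# Hypothetical model scores for one message (labels and values are made up).
scores = {
    "smile": 0.30, "heart": 0.22, "laugh": 0.15, "thumbs_up": 0.12,
    "party": 0.08, "cry": 0.07, "fire": 0.06,
}

suggestions = top_k(scores)
print(suggestions, hit_at_k(scores, "party"), hit_at_k(scores, "cry"))
```

Averaging `hit_at_k` over many messages gives a top-5 accuracy, which matches the product goal better than requiring the single best guess to be right.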

Marketing
- Eric Stahl, SVP at Salesforce Marketing Cloud
- With Salesforce Einstein, marketers can gauge how likely it is a customer will engage with an email, unsubscribe from an email list, or make a web purchase, and determine what is driving true engagement to better anticipate the needs of every customer.

Customer Support
- With fast-growing sales, it is often difficult to keep up with customer support hiring, and especially training
- AI can help managers detect which support conversations are going the wrong way (e.g. negative sentiment), so they can intervene and train on the job

https://www.fda.gov/NewsEvents/Newsroom/PressAnnouncements/ucm604357.htm

Self-Driving Cars
Self-Driving Cars -- Stats

Energy-Inference-Accuracy Landscape on the Squeezelator
ImageNet energy-accuracy for different NNs
SqueezeNext vs SqueezeNet/AlexNet
• 8% more accurate
• 2.25x better than SqueezeNet
• 7.5x better than AlexNet
Plot label: MobileNet v1
[slide credit: Kurt Keutzer]

Existing Robotic Automation
- Robots perform programmed simple motions

Two New Waves of Automation
- Wave 1: Robots with “eyes”
  - Starting to happen now
- Wave 2: Teachable Robots (“get help anywhere, anytime”)
  - Anticipated 5 years from now
  - (not prime-time ready, but already starting to happen in research labs…)

covariant.ai

Outline

n Parting thoughts

Why Now? And is an AI winter coming?
- Data
- Compute
- AI expertise

A (Short) History of AI
- 1940-1950: Early days
  - 1943: McCulloch & Pitts: Boolean circuit model of brain
  - 1950: Turing's “Computing Machinery and Intelligence”
- 1950-70: Excitement: Look, Ma, no hands!
  - 1950s: Early AI programs, including Samuel's checkers program, Newell & Simon's Logic Theorist, Gelernter's Geometry Engine
  - 1956: Dartmouth meeting: “Artificial Intelligence” adopted
  - 1965: Robinson's complete algorithm for logical reasoning
- 1970-90: Knowledge-based approaches
  - 1969-79: Early development of knowledge-based systems
  - 1980-88: Expert systems industry booms
  - 1988-93: Expert systems industry busts: “AI Winter”
- 1990-2012: Statistical approaches + subfield expertise
  - Resurgence of probability, focus on uncertainty
  - General increase in technical depth
  - Agents and learning systems… “AI Spring”?
- 2012-___: Excitement: Look, Ma, no hands again?
  - Big data, big compute, neural networks
  - Some re-unification of sub-fields
  - AI used in many industries

Data
[Source: domo.com]

Compute: Moore’s Law
~2x every 3 years
[20-April-2018]

Compute: Neural Net Chip Development
100-1000x?
[20-April-2018]
Sidenote: training vs. inference

Compute: Ability to Compute over Many Machines
Source: OpenAI

Also, from an industry perspective
- Companies making (lots of) money from AI this time around…

Now let’s take a look at scale
Architecture   Num neurons     Num synapses
Fly            100K = 10^5     10M = 10^7
AlexNet        650K ~ 10^6     60M ~ 10^8
Mouse          100M = 10^8     100B = 10^11
Human          100B = 10^11    10^14 - 10^15

If each synapse is 1 FLOP (i.e., can fire / not fire once per second),
then the human brain requires 10^15 flops = 1 petaflop.
- 10,000 current CPUs, costs $1000 / hr on Amazon’s EC2
- Or 10 latest GPUs (Nvidia V100), costs $30 / hr on EC2 (2017) -- $25 / hr on GCP (2018)

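The back-of-the-envelope estimate above can be worked through in a few lines. The totals (10^15 synapses, 10,000 CPUs at $1000 / hr, 10 GPUs at $30 / hr) are from the slide. The per-device throughput and price are not stated there; the sketch simply derives what the slide's totals imply, under its one-operation-per-synapse-per-second assumption.

```python
SYNAPSES = 10**15               # upper end of the human range in the table
OPS_PER_SYNAPSE_PER_SECOND = 1  # the slide's "1 FLOP" per synapse

brain_flops = SYNAPSES * OPS_PER_SYNAPSE_PER_SECOND  # 10**15 = 1 petaflop

def implied_per_unit(units, dollars_per_hour):
    """Per-device throughput and hourly price implied by the slide's totals."""
    return brain_flops // units, dollars_per_hour / units

cpu_flops, cpu_price = implied_per_unit(10_000, 1000)  # CPUs on EC2
gpu_flops, gpu_price = implied_per_unit(10, 30)        # V100 GPUs on EC2 (2017)

print(brain_flops, cpu_flops, cpu_price, gpu_flops, gpu_price)
```

Read this way, the slide assumes roughly 100 gigaflops per CPU at about $0.10 / hr and roughly 100 teraflops per GPU at about $3 / hr, so the GPU route is about 33x cheaper for the same nominal throughput.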
AI and the World
Living with Superintelligence(s)
- Value Alignment
  - “We had better be quite sure that the purpose put into the machine is the purpose which we really desire” (Norbert Wiener, 1960)
  - King Midas circa 540 BCE
  - (from: humancompatible.ai at Berkeley)
- Bionic AI brain add-ons
- High bandwidth brain-machine interfaces
  - Glenn Northcutt, Understanding Vertebrate Brain Evolution (2002)

Want to stay updated on AI? (1/2)

Want to stay updated on AI? (2/2)
- https://jack-clark.net/import-ai/

Opportunities for Further Engagement
- Berkeley Robot Learning Lab + Berkeley AI Research Lab:
  - Outreach -- rll_outreach@lists.berkeley.edu
  - Industrial affiliates program -- pabbeel@cs.berkeley.edu
- Covariant.AI:
  - AI solutions for your factory / warehouse / etc. -- pabbeel@covariant.ai
- Advising / Consulting / Training:
  - AI input for your organization -- pabbeel@cs.berkeley.edu
- AI to simplify grading homework/exams:
  - Get a (free) account at www.gradescope.com
