Deep Learning in Action

Action!

Sources: [1], [3], [4], [5], [6]

So what is a neural network?

Biological neuron and artificial neuron

Source: [10]

Prototype of a neuron: the perceptron (Rosenblatt, 1958)

Source: [10]
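As a quick illustration (my own toy sketch, not from the slides): the perceptron computes a weighted sum of its inputs, thresholds it, and nudges its weights by the signed error.

```python
import numpy as np

# Toy perceptron: y_hat = step(w . x + b), trained with the perceptron rule.
# Hypothetical toy data: the logical AND function.
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([0, 0, 0, 1], dtype=float)

w = np.zeros(2)
b = 0.0
lr = 0.1

for epoch in range(20):
    for xi, target in zip(X, y):
        pred = 1.0 if np.dot(w, xi) + b > 0 else 0.0
        # Perceptron learning rule: nudge weights by the signed error
        w += lr * (target - pred) * xi
        b += lr * (target - pred)

print(w, b)  # weights and bias that separate the AND-true input from the rest
```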

Deep neural networks: introducing hidden layers

Source: [10]

Why go deep? A bit of background

 

Easy? Difficult?

  • walk
  • talk
  • play chess
  • perform matrix computations

Easy for us - difficult for computers

 

  • controlled movement
  • speech recognition
  • speech generation
  • object recognition and object localization

Representation matters

Source: [12]

Just feed the network the right features?

 

What are the correct pixel values for a “bike” feature?

  • race bike, mountain bike, e-bike?
  • pixels in the shadow may be much darker
  • what if the bike is mostly obscured by a rider standing in front of it?

Let the network pick the features

… a layer at a time

Source: [12]

So how does a network learn?

 

Just a sec - let's meet a real neural network first!

Play around in the browser:

So how DOES a neural network learn?

 

We need:

  • a way to quantify our current (e.g., classification) error
  • a way to reduce error on subsequent iterations
  • a way to propagate our improvement logic from the output layer all the way back through the network!

Quantifying error: Loss functions

 

The loss (or cost) function quantifies the cost incurred by wrong predictions / misclassifications

Probably the best-known loss function in machine learning is mean squared error:

\( \frac{1}{n} \sum_{i=1}^{n}{(\hat{y}_i - y_i)^2} \)

Most of the time, in deep learning we use cross entropy:

\( - \sum_j{t_j \log(y_j)} \)

With a one-hot target \( t \), this is the negative log probability of the right answer.
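A minimal numerical sketch of both losses (the example values are assumptions of mine, not from the slides):

```python
import numpy as np

# Mean squared error: average squared difference between prediction and target
def mse(y_hat, y):
    return np.mean((y_hat - y) ** 2)

# Cross entropy for a single example: -sum_j t_j * log(y_j),
# where t is a one-hot target and y are predicted class probabilities
def cross_entropy(t, y):
    return -np.sum(t * np.log(y))

y_hat = np.array([2.5, 0.0, 2.0])
y     = np.array([3.0, -0.5, 2.0])
print(mse(y_hat, y))              # (0.25 + 0.25 + 0) / 3 ≈ 0.167

t     = np.array([0.0, 1.0, 0.0])  # correct class is index 1
probs = np.array([0.1, 0.7, 0.2])
print(cross_entropy(t, probs))     # -log(0.7) ≈ 0.357
```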

Learning from errors: Gradient Descent

Source: [12]
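The idea in a few lines (a toy one-parameter example of my own, not the setup in the figure): repeatedly step the parameter against the gradient of the loss.

```python
# Minimize a simple quadratic loss L(w) = (w - 3)^2 by gradient descent.
# Its gradient is dL/dw = 2 * (w - 3).
w = 0.0     # start somewhere
lr = 0.1    # learning rate (step size)

for step in range(50):
    grad = 2 * (w - 3)
    w = w - lr * grad   # step downhill along the negative gradient

print(w)    # converges towards the minimum at w = 3
```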

Propagate back errors ... well: Backpropagation!

 

  • basically, just the chain rule: \( \frac{dz}{dx} = \frac{dz}{dy} \frac{dy}{dx} \)
  • chained over several layers:
Source: [14]

Forward pass and backward pass: Intuition

  • imagine the output of \( f = (x + y) * z = -12 \) “wants” to get bigger (here \( q = x + y = 3 \) and \( z = -4 \))
  • this could happen by \( q \) getting smaller -> \( q \) receives the negative gradient \( \frac{df}{dq} = z = -4 \)
  • \( q \) just passes this gradient on to \( x \) and \( y \), since \( \frac{dq}{dx} = 1 \) and \( \frac{dq}{dy} = 1 \)
  • alternatively, it could happen by \( z \) getting bigger -> \( z \) receives the positive gradient \( \frac{df}{dz} = q = 3 \) (worked through in the code sketch below)
Source: [13]
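The same toy graph worked through in code (choosing, for example, x = -2 and y = 5 so that q = 3 and z = -4, matching the gradients above):

```python
# Forward and backward pass for f = (x + y) * z, following the intuition above.
x, y, z = -2.0, 5.0, -4.0

# forward pass
q = x + y          # q = 3
f = q * z          # f = -12

# backward pass (chain rule, starting from df/df = 1)
df_dq = z          # -4: f gets bigger if q gets smaller
df_dz = q          #  3: f gets bigger if z gets bigger
df_dx = df_dq * 1  # dq/dx = 1, so the gradient just passes through
df_dy = df_dq * 1  # dq/dy = 1

print(f, df_dx, df_dy, df_dz)  # -12.0 -4.0 -4.0 3.0
```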

Applications by example

 

  • CNNs (Convolutional Neural Networks) for computer vision
  • RNNs (Recurrent Neural Networks) for Natural Language Processing
  • Deep Reinforcement Learning for real-life learning

Easy vs. hard, revisited

 

VISION

Why computer vision is hard

Source: [15]

Tasks in computer vision

Source: [13]

How do we identify the required features? Enter:

 

Convolutional Neural Networks

A convolutional neural network

 

Source: [13]
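For orientation only, a hypothetical Keras model with the usual convolution / pooling / dense stages; it is not the network shown in the figure.

```python
# A small example CNN in Keras (illustrative only; input shape and layer sizes are assumptions).
import tensorflow as tf
from tensorflow.keras import layers

model = tf.keras.Sequential([
    tf.keras.Input(shape=(28, 28, 1)),                 # e.g. grayscale 28x28 images
    layers.Conv2D(32, (3, 3), activation="relu"),      # learn local filters
    layers.MaxPooling2D((2, 2)),                       # downsample feature maps
    layers.Conv2D(64, (3, 3), activation="relu"),
    layers.MaxPooling2D((2, 2)),
    layers.Flatten(),
    layers.Dense(64, activation="relu"),
    layers.Dense(10, activation="softmax"),            # class probabilities
])
model.compile(optimizer="adam", loss="categorical_crossentropy", metrics=["accuracy"])
model.summary()
```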

The convolution operation

 

(Strictly speaking, this is cross-correlation, not convolution, but the distinction doesn't matter here: the network learns the kernel weights either way)

Source: [13]

Gimp demo

 

Blur: \( \begin{bmatrix}1 & 1 & 1\\1 & 1 & 1\\1 & 1 & 1\end{bmatrix} \), sharpen: \( \begin{bmatrix}0 & -1 & 0\\-1 & 5 & -1\\0 & -1 & 0\end{bmatrix} \), edge detect: \( \begin{bmatrix}0 & 1 & 0\\1 & -4 & 1\\0 & 1 & 0\end{bmatrix} \)

see: https://docs.gimp.org/en/plug-in-convmatrix.html
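A minimal numpy sketch of the (cross-correlation style) convolution, applied to an assumed toy image with the sharpen and edge kernels above:

```python
import numpy as np

def conv2d(image, kernel):
    """Valid cross-correlation of a 2-D image with a small kernel (no kernel flipping)."""
    kh, kw = kernel.shape
    ih, iw = image.shape
    out = np.zeros((ih - kh + 1, iw - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            # multiply the kernel with the patch under it and sum up
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

sharpen = np.array([[ 0, -1,  0],
                    [-1,  5, -1],
                    [ 0, -1,  0]])
edge    = np.array([[ 0,  1,  0],
                    [ 1, -4,  1],
                    [ 0,  1,  0]])

image = np.random.rand(8, 8)          # toy grayscale image (assumed)
print(conv2d(image, sharpen).shape)   # (6, 6)
print(conv2d(image, edge).shape)      # (6, 6)
```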

Easy vs. hard, revisited

 

VISION

LANGUAGE

Until now, all we've seen are static snapshots

 

How do we handle sequences?

  • language: words, sentences, paragraphs…
  • all kinds of serial information: sensor data, stock prices…


 

Jane walked into the room. John walked in too. It was late in the day, and everyone was walking home after a long day at work. Jane said hi to ___

Source: [21]

How do we remember the past? Enter:

 

Recurrent neural networks

Hidden state

 

Sources: [22], [12]
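A small sketch of the hidden-state idea (toy dimensions and random weights assumed by me): each step mixes the current input with the previous hidden state.

```python
import numpy as np

# One step of a vanilla RNN: h_t = tanh(W_xh x_t + W_hh h_{t-1} + b)
rng = np.random.default_rng(0)
input_dim, hidden_dim = 4, 3                       # toy sizes (assumed)
W_xh = rng.normal(size=(hidden_dim, input_dim)) * 0.1
W_hh = rng.normal(size=(hidden_dim, hidden_dim)) * 0.1
b    = np.zeros(hidden_dim)

def rnn_step(x_t, h_prev):
    return np.tanh(W_xh @ x_t + W_hh @ h_prev + b)

h = np.zeros(hidden_dim)                           # initial hidden state
sequence = rng.normal(size=(5, input_dim))         # toy sequence of 5 inputs
for x_t in sequence:
    h = rnn_step(x_t, h)                           # the hidden state carries the past forward
print(h)
```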

Remembering is not enough

 

Sometimes we also need to forget!

LSTM cell state and the three gates

 

The LSTM cell state is protected by three gates, the forget, input, and output gates:

Source: [22]
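A hedged numpy sketch of a single LSTM step using the standard gate equations (variable names and toy sizes are my own):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_step(x_t, h_prev, c_prev, W, b):
    """One LSTM step; W maps [h_prev, x_t] to the stacked gate pre-activations."""
    z = W @ np.concatenate([h_prev, x_t]) + b
    H = h_prev.shape[0]
    f = sigmoid(z[0:H])          # forget gate: what to erase from the cell state
    i = sigmoid(z[H:2 * H])      # input gate: what new information to write
    o = sigmoid(z[2 * H:3 * H])  # output gate: what to reveal as hidden state
    g = np.tanh(z[3 * H:4 * H])  # candidate cell content
    c_t = f * c_prev + i * g     # protected cell state
    h_t = o * np.tanh(c_t)
    return h_t, c_t

H, D = 3, 4                      # toy hidden and input sizes (assumed)
rng = np.random.default_rng(1)
W = rng.normal(size=(4 * H, H + D)) * 0.1
b = np.zeros(4 * H)
h, c = np.zeros(H), np.zeros(H)
h, c = lstm_step(rng.normal(size=D), h, c, W, b)
print(h, c)
```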

RNN Example: Machine Translation

 

In translation, we have two sets of sequential data, one on the source and one on the target side!

Enter: sequence-to-sequence models

Source: [23]
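A conceptual sketch (an untrained toy model; all sizes, weights, and token ids are assumptions of mine): the encoder compresses the source sequence into a hidden state, which the decoder then unrolls into target tokens.

```python
import numpy as np

rng = np.random.default_rng(2)
V_src, V_tgt, H = 10, 12, 8                 # toy vocabulary and hidden sizes (assumed)
E_src = rng.normal(size=(V_src, H)) * 0.1   # source token embeddings
E_tgt = rng.normal(size=(V_tgt, H)) * 0.1   # target token embeddings
W_enc = rng.normal(size=(H, 2 * H)) * 0.1
W_dec = rng.normal(size=(H, 2 * H)) * 0.1
W_out = rng.normal(size=(V_tgt, H)) * 0.1

def step(W, x, h):
    # one recurrent step: mix the current embedding with the previous hidden state
    return np.tanh(W @ np.concatenate([x, h]))

# Encoder: read the source sentence token by token
src = [3, 1, 7, 2]                          # toy source token ids
h = np.zeros(H)
for tok in src:
    h = step(W_enc, E_src[tok], h)

# Decoder: greedily emit target tokens, starting from the encoder's final state
tok, out = 0, []                            # 0 = assumed start-of-sequence id
for _ in range(6):
    h = step(W_dec, E_tgt[tok], h)
    tok = int(np.argmax(W_out @ h))         # pick the most probable next token
    out.append(tok)
print(out)                                  # untrained, so the output is arbitrary
```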

Easy vs. hard, revisited

 

VISION

LANGUAGE

LIFE (SORT OF)

Life

 

Reinforcement learning

Source: [1]

Reinforcement learning: the task

 

Source: [1]

Reinforcement learning: The problem

 

If I only receive a reward many, many actions later…

… how do I find out which concrete action the reward is for? (This is the credit assignment problem.)
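One standard way to spread delayed reward back over earlier actions is the discounted return; a minimal sketch (toy episode and discount factor of my choosing, not from the slides):

```python
# Spread a delayed reward back over earlier steps via discounted returns.
rewards = [0, 0, 0, 0, 1]      # reward arrives only at the very last step
gamma = 0.9                    # discount factor

returns = []
G = 0.0
for r in reversed(rewards):
    G = r + gamma * G          # each earlier step gets a discounted share of the credit
    returns.append(G)
returns.reverse()
print(returns)                 # [0.6561, 0.729, 0.81, 0.9, 1.0]
```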

The quest for real intelligence

 

“Reinforcement learning + deep learning = AI” (David Silver, Google DeepMind)

Deep Q-Learning @AlphaGo

 

Source: [29]

Deep Learning, where to go next?

 

For structured reading:

Just wanna have some cool fun?

Thanks for your attention!

Sources (1)

Sources (2)

Sources (3)

Sources (4)