1 Learning about neural network modeling using Google’s TensorFlow Playground

TensorFlow is a programming library designed to help scientists and engineers create deep learning applications. TensorFlow Playground is a web app that you can use to get a feel for the ways in which neural networks solve problems, and the ways in which they might fail. In this tutorial you’ll learn a little about how combinations of neurons, organized into layers, can process information and solve some (relatively) complex problems.

Before you begin, click the “Hide Toolbars” button on the bottom right.

1.1 The TensorFlow Playground

First, let’s go to the TensorFlow Playground. We need to do a little setup before we begin. Under the column labeled “Features”, click on X2 so that it disappears. Press the “-” button next to “2 Hidden Layers” twice. Finally, click the “Discretize output” button underneath the Output graphic. The result should look like this:

Starting state for your TensorFlow Playground

Now let’s explain what’s going on.

1.1.1 What is the network trying to achieve?

Neural networks learn mappings between an input and an output. For example, they could learn the mapping between the shape of each letter in the alphabet and its pronunciation. What makes neural networks special is that they can learn extremely complex mappings, containing both regularities and exceptions. For instance, they can learn the correct sounds for English words (think about how “run” and “tonne” have similar sounds despite their different spellings, while “wind” can have two different pronunciations depending on context).

We can think of this mapping as a mathematical function, similar to a function like y = 3 + 2x but much, much more complex, such as a function that maps multiple variables on the right-hand side of the equation to multiple variables on the left-hand side. Neural networks don’t necessarily learn a precise mathematical function like this; rather, they learn an approximation to that function, which is perhaps what people do in real life.
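To make this concrete, here is a minimal sketch in Python (using numpy; the data, starting values, and learning rate are all made up for illustration) of a single unit learning an approximation to y = 3 + 2x from example pairs:

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.uniform(-1, 1, size=100)   # some example inputs
y = 3 + 2 * x                      # the "true" function we want to approximate

w, b = 0.0, 0.0                    # connection weight and bias, initially untrained
lr = 0.1                           # learning rate: how big each adjustment is

for _ in range(500):
    y_hat = w * x + b              # the unit's current guess
    error = y_hat - y              # how wrong each guess is
    w -= lr * np.mean(error * x)   # nudge the weight to reduce the error
    b -= lr * np.mean(error)       # nudge the bias to reduce the error

print(round(w, 2), round(b, 2))    # close to 2 and 3
```

After training, the weight and bias settle near 2 and 3: the unit has approximated the function without ever being given the equation itself.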

1.1.2 How does the network do this?

1.1.2.1 Network structure

A neural network has an input layer of neurons, an output layer of neurons, and some connections between those neurons. The researcher first decides what the input layer should look like. For example, you might design an input layer where each neuron is activated when it is shown a particular letter. Or, you might design an input layer where each neuron is activated when it sees a particular sub-part of a letter, such as a downstroke or a horizontal bar. Does this remind you of anything in your psychology classes?

The researcher then has to decide what the output layer should look like. For example, each neuron in the output layer could represent a sound (phone or phoneme), or a subpart of that sound.

Finally, the researcher has to decide how to connect these neurons. Typically, we do this by adding extra layers of neurons in between the input layer and the output layer. These extra layers are able to learn the mapping between input and output.
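As a rough sketch of this three-part structure in code (using tf.keras; the layer sizes here are illustrative assumptions only: 26 letter-detector inputs, 30 hidden units, and 44 phoneme outputs), a researcher’s design decisions might look like this:

```python
# A structural sketch only; the layer sizes are assumptions, not a
# claim about how reading networks are actually built.
import tensorflow as tf

model = tf.keras.Sequential([
    tf.keras.Input(shape=(26,)),                      # input layer: one unit per letter
    tf.keras.layers.Dense(30, activation="relu"),     # hidden layer: learns the mapping
    tf.keras.layers.Dense(44, activation="softmax"),  # output layer: one unit per phoneme
])
model.summary()   # prints the layers and the number of connections between them
```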

1.1.2.2 Learning mechanisms

As in a human brain, the units in the network are connected to one another, so that when one unit is active it affects the activation of other units. These connections can be positive, facilitating the firing of other neurons, or negative, inhibiting the firing of other neurons.

By varying the weights of these connections (the degree to which they are positive or negative), we can train the network to learn particular mappings. For example, if there is a strong positive connection between the shape of the letter “k” and the “k” sound, and inhibitory connections from the letter “k” to all other sounds, then the network will know that seeing a “k” means it should output the “k” sound and no other sounds.
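Here is a toy sketch of that idea (numpy; the three letters, three sounds, and hand-picked weights are all assumptions for illustration):

```python
import numpy as np

# One row per input letter detector ("k", "s", "t"),
# one column per output sound (/k/, /s/, /t/).
weights = np.array([
    [ 1.0, -1.0, -1.0],   # "k" excites /k/ and inhibits the other sounds
    [-1.0,  1.0, -1.0],   # "s" excites /s/
    [-1.0, -1.0,  1.0],   # "t" excites /t/
])

letter_k = np.array([1.0, 0.0, 0.0])   # the "k" detector is active
activation = letter_k @ weights        # activation flowing through the connections
print(activation)                      # [ 1. -1. -1.]: only /k/ is facilitated
```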

Importantly, we can use error-driven learning to train the network to make the right connections. If the network starts with a random set of connections, then we can “reward” it when it makes a correct guess, and tell it to change its connections when it makes an incorrect guess. The most widely used way to do this is a learning mechanism called “backpropagation”.
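The details of backpropagation are beyond this tutorial, but the following compact sketch (pure numpy, sigmoid units, learning the XOR mapping; the network size, learning rate, and number of iterations are arbitrary choices) shows the error-driven idea in action: the output error is computed, passed backwards through the connections, and each weight is nudged to reduce it:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(1)
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([[0], [1], [1], [0]], dtype=float)    # XOR: the "correct guesses"

W1 = rng.normal(size=(2, 4)); b1 = np.zeros(4)     # input -> hidden connections
W2 = rng.normal(size=(4, 1)); b2 = np.zeros(1)     # hidden -> output connections
lr = 0.5

for _ in range(10000):
    h = sigmoid(X @ W1 + b1)                       # forward pass: make a guess
    out = sigmoid(h @ W2 + b2)
    d_out = (out - y) * out * (1 - out)            # error signal at the output
    d_h = (d_out @ W2.T) * h * (1 - h)             # error propagated back to hidden units
    W2 -= lr * h.T @ d_out;  b2 -= lr * d_out.sum(axis=0)
    W1 -= lr * X.T @ d_h;    b1 -= lr * d_h.sum(axis=0)

print(out.round(2).ravel())                        # should approach [0, 1, 1, 0]
```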

1.2 Networks in the Playground

In the TensorFlow Playground, the network is trying to learn, from some data, the function that generated those data. This is similar to what we do as scientists: we collect some data and try to determine the mathematical function (i.e., the theory) that produced them. In this case, the data are dots of different colours, which you can see on the far left. These dots are arranged on a two-dimensional graph (i.e., with an x axis and a y axis), and the network has to learn the function that produced dots with these particular colours: for example, blue dots in the centre of the graph and orange dots outside it. That is our output.
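For example, the “circle” dataset could have been generated by a hidden rule like the one in this sketch (numpy; the exact ranges and radius are guesses, not the Playground’s actual code):

```python
import numpy as np

rng = np.random.default_rng(2)
points = rng.uniform(-5, 5, size=(200, 2))           # (x1, x2) positions of the dots
distance = np.hypot(points[:, 0], points[:, 1])      # distance of each dot from the centre
label = np.where(distance < 2.5, "blue", "orange")   # the hidden rule the network must recover
```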

1.2.1 Learning with a single unit

Press the play button on the top left, and you’ll see that nothing much happens. That’s because we are trying to learn a mapping using a single feature, with no other neurons involved, and that’s never going to work unless the function precisely corresponds to the coded feature (more on that later). But we can still explore what that single neuron is doing.

If you look in the “Features” column, you’ll see that our neuron has positive activation (in blue) when it encounters data on the right-hand side of the graph (positive values of X1), and negative activation (in orange) when it encounters data on the left-hand side.
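In other words, the unit computes something like the function below (a sketch with made-up weight and bias values). Whatever values the weight and bias take, the boundary between blue and orange is always a vertical line, which is why this unit alone can never capture a circle:

```python
import numpy as np

def single_unit(x1, w=1.0, b=0.0):
    """One unit reading only X1: a weighted input plus a bias, squashed to (-1, 1)."""
    return np.tanh(w * x1 + b)   # blue where positive, orange where negative
```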

We can add additional features simply by clicking on each of the additional units. If you add all of them and then press play, you can see that the network learns the mapping quickly and easily. Try to find the smallest set of features that still allows the network to learn the mapping.

1.2.2 Adding hidden layers

Hidden layers let us learn more complex mappings from a small set of simple inputs. Remove all of the features except X1, and add a single hidden layer, so that your network looks like this:

A simple 3 layer network

Press play, and you’ll see that the network now does a much better job of learning. It isn’t perfect, but it is making a much more sophisticated guess about the function. The hidden layer allows the simple input to support a more complex mapping.
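One way to see why: each hidden unit can place one “bend” in the decision at some value of X1, and combining the units can carve out a vertical band, something a single unit could never do. A sketch with hand-picked weights:

```python
import numpy as np

def band_network(x1):
    h1 = np.tanh(3 * (x1 + 2))   # hidden unit 1: switches on near x1 = -2
    h2 = np.tanh(3 * (x1 - 2))   # hidden unit 2: switches on near x1 = +2
    return h1 - h2 - 1.0         # positive (blue) only for roughly -2 < x1 < 2
```

The band is still the wrong shape for the circle data, because nothing in the network knows about X2 yet.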

Next, add the X2 unit to the features, press play, and see what happens. The network can learn even more complex functions now, but it still isn’t good enough to learn our true function.

A simple 3 layer network with 2 inputs

Finally, we are going to add some extra units to the hidden layer. Add two extra neurons, and press play. Voilà! It should learn! Why do you think this is?
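One intuition: each hidden unit contributes one straight “fold” to the decision boundary, and you need enough folds to close a region around the blue dots. You can test this outside the Playground too; the sketch below (tf.keras, with a toy version of the circle data; exact accuracies will vary from run to run) trains the same network with two versus four hidden units:

```python
import numpy as np
import tensorflow as tf

rng = np.random.default_rng(2)
points = rng.uniform(-5, 5, size=(500, 2))
label = (np.hypot(points[:, 0], points[:, 1]) < 2.5).astype(int)   # 1 = blue (inside)

for n_hidden in (2, 4):
    model = tf.keras.Sequential([
        tf.keras.Input(shape=(2,)),
        tf.keras.layers.Dense(n_hidden, activation="tanh"),
        tf.keras.layers.Dense(1, activation="sigmoid"),
    ])
    model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
    model.fit(points, label, epochs=200, verbose=0)
    accuracy = model.evaluate(points, label, verbose=0)[1]
    print(n_hidden, "hidden units:", round(accuracy, 2))   # 4 units should do noticeably better
```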

1.2.2.1 Multiple hidden layers

Create a network that looks like this:

A simple 4 layer network with 2 inputs

We now have four neurons across two hidden layers (before we had four units in one hidden layer). Press play. Does it learn? If not, add some more units to the second hidden layer (closest to the output). Does it learn now? If not, why not?

Play around with different numbers of units in each of the hidden layers, and see how it affects learning.

1.2.3 Domain specificity

Create a network that looks like this, and then train the network on the “square” dataset. It should solve it quickly.

A simple 3 layer network with 1 crossed input

In this case, we have been able to learn a complex mapping even though our network is very simple, because we have made some very specific assumptions about the type of data that we might be learning about. Think about how this might relate to some ideas you’ve learned about in your developmental psychology classes.
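Assuming the “square” pattern is the quadrant-based one (blue in two opposite quadrants, orange in the others), the sketch below shows why the crossed X1X2 feature makes it so easy: the product of the two coordinates is positive in two quadrants and negative in the other two, so a single unit can separate the classes without any hidden layer:

```python
import numpy as np

rng = np.random.default_rng(3)
points = rng.uniform(-5, 5, size=(200, 2))
label = (points[:, 0] * points[:, 1] > 0).astype(int)   # the quadrant-based pattern

crossed = points[:, 0] * points[:, 1]    # the built-in "X1X2" feature
prediction = (crossed > 0).astype(int)   # one threshold, no hidden layer needed
print((prediction == label).mean())      # 1.0: perfectly separable
```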

Now, let’s see whether this same network is able to learn the other patterns. Click on the other datasets and see if it can learn them. Without modifying the features used, can you create a network that can learn?

Now go back to our original network structure with X1 and X2 as the (very simple) features. What sorts of patterns can these features be used to model? From this, what can we say about the types of feature analyzers that human learners might want to have?

1.3 Learning a complex pattern

Try to create the simplest possible network that can solve the “Spiral” dataset. What sort of features might be useful here, and what sort of hidden layers might be useful?
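There is no single right answer, but one possible direction is sketched below (tf.keras, with a toy two-arm spiral that only loosely resembles the Playground dataset): feed in the raw coordinates plus the sin(X1) and sin(X2) features, and use a couple of modest hidden layers. Treat it as a starting point for your own experiments, not as the solution:

```python
import numpy as np
import tensorflow as tf

def spiral(n=200, noise=0.2):
    """A toy two-arm spiral, loosely modeled on the Playground dataset."""
    rng = np.random.default_rng(4)
    t = np.sqrt(rng.uniform(0, 1, n)) * 3 * np.pi
    xs, labels = [], []
    for arm, flip in enumerate((1, -1)):
        x = np.stack([flip * t * np.cos(t), flip * t * np.sin(t)], axis=1)
        xs.append(x + rng.normal(scale=noise, size=x.shape))
        labels.append(np.full(n, arm))
    return np.concatenate(xs), np.concatenate(labels)

X, y = spiral()
features = np.column_stack([X, np.sin(X[:, 0]), np.sin(X[:, 1])])   # X1, X2, sin(X1), sin(X2)

model = tf.keras.Sequential([
    tf.keras.Input(shape=(4,)),
    tf.keras.layers.Dense(8, activation="tanh"),
    tf.keras.layers.Dense(8, activation="tanh"),
    tf.keras.layers.Dense(1, activation="sigmoid"),
])
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
model.fit(features, y, epochs=400, verbose=0)
print(round(model.evaluate(features, y, verbose=0)[1], 2))   # training accuracy
```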