class: title-slide

.row[
.col-7[
.title[
# MNIST Dataset
]
.subtitle[
## Exploring the MNIST Dataset
]
.author[
### Laxmikant Soni <br> [Web-Site](https://laxmikants.github.io) <br> [<i class="fab fa-github"></i>](https://github.com/laxmiaknts) [<i class="fab fa-twitter"></i>](https://twitter.com/laxmikantsoni09)
]
.affiliation[
]
]
.col-5[
.logo[
<!--  -->
]

Slides:<br>
[laxmikants.github.io/datasets/slides](https://laxmikants.github.io/datasets/slides/NewsGroupsAnalysis.html#1)

Materials:<br>
[github.com/laxmikants/datasets](https://github.com/laxmikants/datasets)
]
]

---
class: inverse, center, middle

# The MNIST problem: construct and train an artificial neural network on thousands of images of handwritten digits so that it can correctly identify digits it has never seen. The data is the MNIST database, which contains 60,000 training images and 10,000 test images.

---
class: inverse, center, middle

# Getting the Dataset

---
class: body

# 1. Getting the dataset

### 1.1 Fetch the dataset
<hr>

--

.pull-left[
```python
import tensorflow as tf

# Load MNIST and split it into training and test sets
mnist = tf.keras.datasets.mnist
(X_train, y_train), (X_test, y_test) = mnist.load_data()
```
]

--

.pull-right[
X_train shape (60000, 28, 28)<br>
y_train shape (60000,)<br>
X_test shape (10000, 28, 28)<br>
y_test shape (10000,)<br>
...
]

<hr>

> `MNIST Dataset`: a dataset of 70,000 small square 28×28 pixel grayscale images of handwritten single digits between 0 and 9: 60,000 for training and 10,000 for testing. To get the dataset, we access the mnist object from keras.datasets and call its load_data function. This function splits the data for us and returns one tuple with the training data and one tuple with the testing data.

---
class: inverse, center, middle

# Viewing sample images

---
class: body

# 2. Viewing sample images

### 2.1 Plotting sample images
<hr>

--

.pull-left[
```python
import random
import matplotlib.pyplot as plt

plt.rcParams['figure.figsize'] = (7, 7)

# Show nine randomly chosen training images with their labels
for i in range(9):
    plt.subplot(3, 3, i + 1)
    num = random.randint(0, len(X_train) - 1)
    plt.imshow(X_train[num], cmap='gray', interpolation='none')
    plt.title("Class {}".format(y_train[num]))

plt.tight_layout()
```
]

--

.pull-right[
<!-- grid of nine sample digit images -->
]

<hr>

> `MNIST Dataset`: the MNIST database contains 60,000 images for training and 10,000 test images.

---
class: inverse, center, middle

# Normalizing

---
class: body

# 3. Normalizing

### 3.1 Normalizing the dataset
<hr>

--

.pull-middle[
```python
X_train = tf.keras.utils.normalize(X_train, axis=1)
X_test = tf.keras.utils.normalize(X_test, axis=1)
```
]

<hr>

> `Normalizing`: to make the data easier to process, we normalize it. This means we scale all the values down so that they end up between 0 and 1.

---
class: inverse, center, middle

# Neural Network

---
class: body

# 4. Neural Network building

### 4.1 Create the sequential model
<hr>

--

.pull-middle[
```python
model = tf.keras.models.Sequential()
```
]

<hr>

> `Model`: we use the models module from Keras to create a new neural network. The Sequential constructor does this for us. The model does not contain any layers yet; those have to be added manually.

---
class: body

# 4. Neural Network building

### 4.2 Add Flatten to the model
<hr>

--

.pull-middle[
```python
model.add(tf.keras.layers.Flatten(input_shape=(28, 28)))
```
]

<hr>

> `Layer`: we start by adding a so-called Flatten layer as our first layer. To add a layer to the model we use the add function, picking the kind of layer we want from the layers module. The input shape of 28x28 is the resolution of the images. The Flatten layer flattens the input and makes it one-dimensional: instead of a 28x28 grid, we end up with 784 neurons lined up, as the sketch on the next slide illustrates.
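
---
class: body

# 4. Neural Network building

### 4.2.1 What Flatten does (sketch)
<hr>

A minimal sketch of the shape change Flatten performs; the variable names (`batch`, `flatten`, `flat`) are illustrative and not part of the original code.

.pull-middle[
```python
import numpy as np
import tensorflow as tf

# A hypothetical batch of two 28x28 grayscale "images"
batch = np.zeros((2, 28, 28), dtype="float32")

# Calling the layer directly shows how it reshapes its input
flatten = tf.keras.layers.Flatten()
flat = flatten(batch)

print(flat.shape)  # (2, 784): each 28x28 grid becomes a row of 784 values
```
]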

---
class: body

# 4. Neural Network building

### 4.3 Add Dense layers to the model
<hr>

--

.pull-middle[
```python
model.add(tf.keras.layers.Dense(units=128, activation=tf.nn.relu))
model.add(tf.keras.layers.Dense(units=128, activation=tf.nn.relu))
```
]

<hr>

> `Dense Layer`: next we add two Dense layers. These are our hidden layers and increase the complexity of the model. Both layers have 128 neurons, and the activation function is the ReLU function. A Dense layer connects every neuron in it with all the neurons of the previous and the next layer; it is the basic, default layer type.

---
class: body

# 4. Neural Network building

### 4.4 Add Output layer to the model
<hr>

--

.pull-center[
```python
model.add(tf.keras.layers.Dense(units=10, activation=tf.nn.softmax))
```
]

<hr>

> `Output Layer`: last but not least we add an output layer. This is also a Dense layer, but it has only ten neurons and a different activation function. The values of the ten neurons indicate how strongly the model believes that the respective digit is the right classification: the first neuron stands for zero, the second for one, and so on. The activation function here is the softmax function, which scales the output values so that they add up to one. It turns the absolute values into relative ones, so each neuron indicates how likely it is that its digit is the result. In other words, we are dealing with probabilities.

---
class: inverse, center, middle

# Compiling the model

---
class: body

# 5. Compiling the model

### 5.1 Compiling the model
<hr>

--

.pull-center[
```python
model.compile(optimizer='adam',
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])
```
]

<hr>

> `Compiling model`: before we can train the model with fit(), the compile() method has to specify a loss function, an optimizer and, optionally, some metrics to monitor.

---
class: inverse, center, middle

# Training and Testing

---
class: body

# 6. Training and Testing

### 6.1 Fitting the model
<hr>

--

.pull-center[
```python
model.fit(X_train, y_train, epochs=3)
```
]

<hr>

> `Fitting`: here we pass our x- and y-values as the training data and define the number of epochs. This number determines how many times the model sees the same data over and over again.

---
class: body

# 6. Training and Testing

### 6.2 Evaluating the accuracy
<hr>

--

.pull-center[
```python
loss, accuracy = model.evaluate(X_test, y_test)
print(loss)
print(accuracy)
```
]

<hr>

> `Evaluating`: after training we use the evaluate method on the testing data to determine the accuracy and the loss. Most of the time we get an accuracy of around 95% (try it yourself). That is pretty good if you take into account that mere guessing would give us only a 10% chance of being right, so the model performs quite well.

---
class: body

# 6. Training and Testing

### 6.3 Classifying our own digits
<hr>

--

.pull-center[
```python
import cv2
import numpy as np

img = cv2.imread('digit.png')[:, :, 0]
img = np.invert(np.array([img]))
```
]

<hr>

> `Classifying`: to load our own image into the script, we use the imread function of OpenCV. We specify the file name and use index slicing at the end to keep just one channel, so the array fits the expected format. We also need to invert the image and wrap it in a NumPy array; otherwise the model would see the image as white on black rather than black on white, which would confuse it. One possible preprocessing sketch is shown on the next slide.
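
---
class: body

# 6. Training and Testing

### 6.3.1 Preprocessing sketch for a custom image
<hr>

A minimal sketch of one way to bring a custom image into the same shape and scale as the training data. The file name `digit.png` comes from the previous slide; the grayscale read, the resize, and the normalize call are assumptions added here, not steps from the original code.

.pull-center[
```python
import cv2
import numpy as np
import tensorflow as tf

# Read the image as a single grayscale channel and resize it to 28x28
img = cv2.imread('digit.png', cv2.IMREAD_GRAYSCALE)
img = cv2.resize(img, (28, 28))

# Invert (black digit on white background -> white on black, like MNIST)
# and add a batch dimension so the shape is (1, 28, 28)
img = np.invert(np.array([img]))

# Scale the pixels the same way the training data was scaled
img = tf.keras.utils.normalize(img, axis=1)
```
]

This is only one possible path; if the image is already a 28x28 black-on-white digit, the two lines on the previous slide are enough for the prediction step that follows.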

---
class: body

# 6. Training and Testing

### 6.4 Prediction
<hr>

--

.pull-center[
```python
prediction = model.predict(img)
print('Prediction: {}'.format(np.argmax(prediction)))
plt.imshow(img[0])
plt.show()
```
]

<hr>

> `Classifying`: now we use the predict method to make a prediction for our image. The prediction consists of the ten activations of the output neurons. To turn that into a single result, we use the argmax function, which returns the index of the highest value. Here that index is exactly the digit with the highest probability, or activation. We can then visualize the image with the imshow method of Matplotlib and print the prediction.

---
class: inverse, center, middle

# Thanks