Traffic signs are something we all recognize. In Belgium, traffic signs appear in both Dutch and French, but that is not a concern for this project: we will classify 4,575 images of 62 different types of traffic signs found on Belgian roads.
Belgian traffic signs fall into 6 major categories, and as of 31st January 2017 about 30,000 traffic signs had been removed from Belgian roads. The sheer number of different signs on Belgian roads has been overwhelming and has led to discussions in Belgium as well as in the European Union.
To address this, the goal of this project is to develop a deep neural network that classifies each image into its respective category. Such an image classification system could then be used in vehicles equipped with a front-facing camera to help alert the driver to the signs encountered on the road.
First, we use TensorFlow's built-in functions to develop a simple neural network, and we test this model on new images.
Second, we hand-code a deep neural network with six layers and roughly 2,400 nodes, aiming for a model with very high training accuracy that also performs well on the test data.
The data for this project was downloaded from http://btsd.ethz.ch/shareddata
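For reference, the sketch below shows one way the archives could be fetched and unpacked programmatically; the archive names and URL layout are assumptions based on the download page and may need to be adjusted.
import os
import urllib.request
import zipfile

# Assumed archive locations on the BelgiumTS download page (verify before use)
BASE_URL = "http://btsd.ethz.ch/shareddata/BelgiumTSC/"
ARCHIVES = ["BelgiumTSC_Training.zip", "BelgiumTSC_Testing.zip"]
DEST = "F:/Belgian traffic - Deep learning/TrafficSigns"

def download_and_extract(base_url=BASE_URL, archives=ARCHIVES, dest=DEST):
    os.makedirs(dest, exist_ok=True)
    for name in archives:
        archive_path = os.path.join(dest, name)
        if not os.path.exists(archive_path):
            urllib.request.urlretrieve(base_url + name, archive_path)
        with zipfile.ZipFile(archive_path) as zf:
            zf.extractall(dest)   # expected to unpack the Training/ and Testing/ folders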
import os
import random
import time
import numpy as np
import matplotlib.pyplot as plt
from skimage import data, io, filters, transform
from skimage.color import rgb2gray
import tensorflow as tf
%matplotlib inline
def load_data(data_directory):
    """Walk the class sub-directories and return lists of images and integer labels."""
    directories = [d for d in os.listdir(data_directory)
                   if os.path.isdir(os.path.join(data_directory, d))]
    labels = []
    images = []
    for d in directories:
        label_directory = os.path.join(data_directory, d)
        file_names = [os.path.join(label_directory, f)
                      for f in os.listdir(label_directory)
                      if f.endswith(".ppm")]
        for f in file_names:
            images.append(io.imread(f))   # data.imread is deprecated; io.imread is equivalent
            labels.append(int(d))
    return images, labels
ROOT_PATH = "F:/Belgian traffic - Deep learning/"
train_data_directory = os.path.join(ROOT_PATH, "TrafficSigns/Training")
test_data_directory = os.path.join(ROOT_PATH, "TrafficSigns/Testing")
images, labels = load_data(train_data_directory)
test_images, test_labels = load_data(test_data_directory)
plt.figure(figsize = (10, 6.5))
plt.hist(labels, 62, histtype = 'step');
The plot above shows a histogram of the labels in our training data. Labels 22 and 32 clearly have the largest number of images.
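As a quick cross-check (a small sketch, not part of the original analysis), the most frequent classes can also be listed directly:
from collections import Counter

# Count images per label and show the five most frequent classes
label_counts = Counter(labels)
print(label_counts.most_common(5))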
The figure below shows one example image from each of the 62 classes, ranging from 0 to 61, with the number of images per class in parentheses. Classes 22 and 32 have the largest counts, which matches the spikes at 22 and 32 in the histogram above.
import pandas as pd
uni_labels = set(labels)
i = 1
plt.figure(figsize = (25,25))
for label in uni_labels:
    # Show the first image of each class, with its label and image count in the title
    image = images[labels.index(label)]
    plt.subplot(8, 8, i)
    plt.axis('off')
    i += 1
    plt.title('label {0} ({1})'.format(label, labels.count(label)))
    plt.imshow(image)
plt.show()
plt.figure(figsize = (8,6))
ts = [321, 666, 808, 3567]
for i in range(len(ts)):
    plt.subplot(4,1,i+1)
    plt.axis('off')
    plt.imshow(images[ts[i]])
    plt.show()
    print("shape: {0}, min: {1}, max: {2}".format(images[ts[i]].shape,
                                                  images[ts[i]].min(),
                                                  images[ts[i]].max()))
From the four plots above, we find that the images differ in size and that their pixel values span different ranges.
These observations lead us to conclude that all the images should be rescaled to a common size before we feed them into our neural network, as this makes training easier and more accurate.
Also, the pixel values need to be brought into the same range, i.e. normalized.
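If a separate normalization step were needed, a minimal min-max scaling sketch would look like the code below. In practice, the transform.resize call used in the next step already converts the images to floats in the range [0, 1], so we rely on that here.
# Minimal min-max normalization sketch: scale the pixel values of one image to [0, 1].
def min_max_normalize(image):
    image = image.astype(np.float32)
    return (image - image.min()) / (image.max() - image.min() + 1e-8)   # epsilon guards against division by zero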
We will rescale our images to 50x50 pixels using scikit-image's transform.resize function so that all images have a uniform dimension.
# Resize images
images50 = [transform.resize(image, (50, 50)) for image in images]
test_images50 = [transform.resize(image, (50, 50)) for image in test_images]
plt.subplot(1,2,1)
plt.imshow(images[256])
plt.title('Original Image')
plt.axis('off')
plt.subplot(1,2,2)
plt.imshow(images50[256])
plt.axis('off')
plt.title('Rescaled Image')
The plots below confirm that all the images have been rescaled to 50x50 pixels with 3 colour channels (RGB), and that the pixel values now lie between 0 and 1. This uniform representation makes the features easier for the neural network to learn from.
ts = [321, 666, 808, 3567]
for i in range(len(ts)):
    plt.subplot(4,1,i+1)
    plt.axis('off')
    plt.imshow(images50[ts[i]])
    plt.show()
    print("shape: {0}, min: {1}, max: {2}".format(images50[ts[i]].shape,
                                                  images50[ts[i]].min(),
                                                  images50[ts[i]].max()))
After rescaling the images to one uniform dimension, we are ready to perform deep learning on our data set.
We will use TensorFlow's built-in functions to create a simple neural network.
The network's parameters and hyperparameters are defined in the cell below.
x = tf.placeholder(dtype = tf.float32, shape = [None, 50, 50, 3])
y = tf.placeholder(dtype = tf.int32, shape = [None])
images_flat = tf.contrib.layers.flatten(x)  ## flattens each image, similar to np.array().reshape(-1)
logits = tf.contrib.layers.fully_connected(images_flat, 62, tf.nn.relu)  ## creates the weight and bias matrices (hidden from the user) and outputs one score per class
loss = tf.reduce_mean(tf.nn.sparse_softmax_cross_entropy_with_logits(labels = y, logits = logits))  ## softmax cross-entropy loss: each image belongs to exactly one class, so this measures the probability error of the discrete classification
train_op = tf.train.AdamOptimizer(learning_rate=0.001).minimize(loss)
correct_pred = tf.argmax(logits, 1)
accuracy = tf.reduce_mean(tf.cast(tf.equal(tf.cast(correct_pred, tf.int32), y), tf.float32))  ## fraction of predictions that match the true labels
print("images_flat: ", images_flat)
print("logits: ", logits)
print("loss: ", loss)
print("predicted_labels: ", correct_pred)
sess = tf.Session()
sess.run(tf.global_variables_initializer())
start = time.time()
print('Model Initialized')
print('-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-')
for i in range(601):
    _, accuracy_val = sess.run([train_op, accuracy], feed_dict={x: images50, y: labels})
    lossi = sess.run(loss, feed_dict={x: images50, y: labels})
    if i % 100 == 0:
        print("Cost after Epoch {:d} : {:g}".format(i, lossi))
predicted = np.array(sess.run(correct_pred, feed_dict={x: images50}))
train_accuracy = np.equal(predicted, labels).mean()
print('Accuracy on TRAINING data:', round(train_accuracy * 100, 3), "%")
predicted_test = sess.run([correct_pred], feed_dict={x: test_images50})
print('Accuracy on TEST data:', round(np.equal(predicted_test[0], test_labels).mean() * 100, 3), "%")
print('-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-')
end = time.time()
print('\nTotal time Taken: {:.3f} mins'.format(round((end - start)/60, 2)))
print('-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-')
We can see that the neural network is not performing well: it yields only 59.76% accuracy on our test data. This model cannot be taken forward, as the accuracy is too low to draw any useful conclusions.
plt.figure(figsize = (15,15))
sample_test = random.sample(range(len(test_images50)), 10)
sample_image = [test_images50[i] for i in sample_test]
sample_label = [test_labels[i] for i in sample_test]
sample_predicted = [predicted_test[0][i] for i in sample_test]
for i in range(len(sample_image)):
    act = sample_label[i]
    pred = sample_predicted[i]
    plt.subplot(5,2,i+1)
    plt.axis('off')
    plt.imshow(sample_image[i])
    color = 'green' if act == pred else 'red'
    plt.text(60,10,"Actual : {0} \nPredicted : {1}".format(act, pred), color = color)
The figure above shows that, out of 10 randomly sampled test images, we were able to classify correctly no more than 4.
This motivates us to develop a deep neural network with several hidden layers.
Let's hand-code such a model with TensorFlow.
For faster training, we first convert the RGB images to grayscale before feeding them into the network.
images_grey = rgb2gray(np.array(images50))
test_images_grey = rgb2gray(np.array(test_images50))
##Data Augmentation part
##sess = tf.Session()
##sess.run(tf.global_variables_initializer())
##extra_train_images = sess.run(tf.image.flip_left_right(images_grey))
##new_train_images = np.vstack((images_grey, extra_train_images))
##new_train_labels = list(np.hstack((labels, labels)))
We will use 1,280 neurons in the first hidden layer and halve the number of neurons in every successive layer until we reach 80 neurons in the fifth hidden layer.
n_neurons_in_h1 = 1280
n_neurons_in_h2 = 640
n_neurons_in_h3 = 320
n_neurons_in_h4 = 160
n_neurons_in_h5 = 80   ## halved again from 160, as described above
learning_rate = 0.001
n_features = 2500
n_classes = 62
X = tf.placeholder(dtype = tf.float32, shape = [None, 50, 50])
Y = tf.placeholder(dtype = tf.int32, shape = [None])
images_flat_a = tf.contrib.layers.flatten(X)
test_eval = tf.constant(test_images_grey, dtype='float32')
images_flat_test = tf.contrib.layers.flatten(test_eval)
W1 = tf.Variable(tf.truncated_normal([n_features, n_neurons_in_h1], stddev=0.1), name='weights1')
b1 = tf.Variable(tf.random_normal([n_neurons_in_h1]), name='biases1')
y1 = tf.nn.relu(tf.add(tf.matmul(images_flat_a, W1), b1), name='activationLayer1')
W2 = tf.Variable(tf.truncated_normal([n_neurons_in_h1, n_neurons_in_h2], stddev=0.1),name='weights2')
b2 = tf.Variable(tf.random_normal([n_neurons_in_h2]),name='biases2')
y2 = tf.nn.relu(tf.add(tf.matmul(y1, W2), b2),name='activationLayer2')
W3 = tf.Variable(tf.truncated_normal([n_neurons_in_h2, n_neurons_in_h3], stddev=0.1),name='weights3')
b3 = tf.Variable(tf.random_normal([n_neurons_in_h3]),name='biases3')
y3 = tf.nn.relu(tf.add(tf.matmul(y2, W3), b3),name='activationLayer3')
W4 = tf.Variable(tf.truncated_normal([n_neurons_in_h3, n_neurons_in_h4], stddev=0.1),name='weights4')
b4 = tf.Variable(tf.random_normal([n_neurons_in_h4]),name='biases4')
y4 = tf.nn.relu(tf.add(tf.matmul(y3, W4), b4),name='activationLayer4')
W5 = tf.Variable(tf.truncated_normal([n_neurons_in_h4, n_neurons_in_h5], stddev=0.1),name='weights5')
b5 = tf.Variable(tf.random_normal([n_neurons_in_h5]),name='biases5')
y5 = tf.nn.relu(tf.add(tf.matmul(y4, W5), b5),name='activationLayer5')
Wo = tf.Variable(tf.random_normal([n_neurons_in_h5, n_classes], stddev=0.1), name='weightsOut')
bo = tf.Variable(tf.random_normal([n_classes]), name='biasesOut')
y_ = tf.nn.log_softmax(tf.add(tf.matmul(y5, Wo), bo))
reg = tf.nn.l2_loss(W1) +tf.nn.l2_loss(W2) + tf.nn.l2_loss(W3) + tf.nn.l2_loss(W4) + tf.nn.l2_loss(W5) + tf.nn.l2_loss(Wo)
cross_entropy = tf.reduce_mean(tf.nn.sparse_softmax_cross_entropy_with_logits(labels = Y, logits = y_))
cross_entropy = tf.add(cross_entropy, 0.01 * reg) ##beta = 0.01
train_step = tf.train.AdamOptimizer(learning_rate=learning_rate).minimize(cross_entropy)
correct_prediction = tf.argmax(y_, 1)
accuracy_a = tf.reduce_mean(tf.cast(tf.equal(tf.cast(correct_prediction, tf.int32), Y), tf.float32))   ## fraction of training predictions that match the labels
y1_test = tf.nn.relu(tf.add(tf.matmul(images_flat_test, W1), b1), name='activationLayer1')
y2_test = tf.nn.relu(tf.add(tf.matmul(y1_test, W2), b2),name='activationLayer2')
y3_test = tf.nn.relu(tf.add(tf.matmul(y2_test, W3), b3),name='activationLayer3')
y4_test = tf.nn.relu(tf.add(tf.matmul(y3_test, W4), b4),name='activationLayer4')
y5_test = tf.nn.relu(tf.add(tf.matmul(y4_test, W5), b5),name='activationLayer4')
y_test = tf.nn.log_softmax(tf.add(tf.matmul(y5_test, Wo), bo))
sess = tf.Session()
sess.run(tf.global_variables_initializer())
start = time.time()
print('Model Initialized')
print('-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-')
for i in range(601):
    _, accuracy_val_a = sess.run([train_step, accuracy_a], feed_dict={X: images_grey, Y: labels})
    lossi_a = sess.run(cross_entropy, feed_dict={X: images_grey, Y: labels})
    if i % 100 == 0:
        print("Cost after Epoch {:d} : {:g}".format(i, lossi_a))
train_accuracy_a = round(sess.run(tf.equal(correct_prediction, labels), feed_dict={X: images_grey}).mean() * 100, 3)
print('Accuracy on TRAINING data:', train_accuracy_a, "%")
test_acc = sess.run([tf.equal(tf.argmax(y_test.eval(session=sess), 1), test_labels)], feed_dict={X: test_images_grey})[0].mean()
print('Accuracy on TEST data:', round(test_acc * 100, 3), "%")
print('-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-')
end = time.time()
print('\nTotal time Taken: {:.3f} mins'.format(round((end - start)/60, 2)), '\n')
print('-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-')
plt.figure(figsize = (15,15))
sample_test = random.sample(range(len(test_images50)), 10)
sample_image = [test_images50[i] for i in sample_test]
sample_label = [test_labels[i] for i in sample_test]
predicted_all = sess.run(tf.argmax(y_test.eval(session=sess), 1))   # predict once for the whole test set
sample_predicted = [predicted_all[i] for i in sample_test]
for i in range(len(sample_image)):
    act = sample_label[i]
    pred = sample_predicted[i]
    plt.subplot(5,2,i+1)
    plt.axis('off')
    plt.imshow(sample_image[i])
    color = 'green' if act == pred else 'red'
    plt.text(60,10,"Actual : {0} \nPredicted : {1}".format(act, pred), color = color)
From the figure above, we can see that the new 6-layer deep neural network does a far better job, predicting with roughly 85% accuracy.
We obtain a model accuracy of 97.96% on the training set and 85.20% on the test set.
Data augmentation was also tried on this data set, but it made training considerably slower without improving accuracy by much, so it was not used in the final model.
Converting the images to grayscale helped the model train faster compared to RGB images.
Building a deep neural network helped obtain noticeably higher accuracy.