Deep Learning

I am writing this blog post to give an overview of deep learning (DL) and to explain how its algorithms are used in practice, without diving into the mathematics behind the models. In short, DL is a branch of machine learning (ML) based on computational models built from multiple processing layers, which learn representations of data at increasing levels of abstraction. The data can be text, voice, images, video, etc. As for tools to perform deep learning, I list only the most popular ones for the top programming languages of 2017.

On Twitter, the hashtag #DeepLearning makes it easy to find posts related to deep learning.

DL is becoming increasingly indispensable because of its accuracy in solving problems in fields such as automatic speech recognition, image recognition, natural language processing (NLP), drug discovery and toxicology, customer relationship management, recommendation systems, and bioinformatics.

For people who want to make DL their cup of tea, there are some prerequisites to brush up on in order to understand the skeleton of DL and how to handle function parameters. The starting point is some basics in linear algebra, probability and information theory, numerical computation, and a few ML techniques. Since the algorithms build on these disciplines, beginners are advised to study them in order to master the theoretical side of DL and to interpret the results.

Deep Learning Techniques

A learning algorithm can be supervised or unsupervised. The goal of a supervised learning algorithm is to build an artificial system that learns the mapping between the input and the output and can predict the output for new inputs. The goal of unsupervised learning is to build representations of the input that can be used for decision making, predicting future inputs, efficiently communicating the inputs to another machine, etc.
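To make the supervised case concrete, here is a minimal sketch in plain Python. It is not a deep model: a single linear unit is fitted by gradient descent so that it learns the input-to-output mapping from labeled examples and then predicts the output for an unseen input. The data set, the learning rate, the epoch count, and the function names `train_linear` and `predict` are all assumptions made for illustration.

```python
# Toy supervised learning: learn the mapping x -> y from labeled examples.
# The data, learning rate, and epoch count are illustrative assumptions.

def train_linear(xs, ys, lr=0.05, epochs=2000):
    """Fit y ~ w * x + b by gradient descent on the mean squared error."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        # Gradients of the mean squared error with respect to w and b.
        grad_w = sum(2 * (w * x + b - y) * x for x, y in zip(xs, ys)) / n
        grad_b = sum(2 * (w * x + b - y) for x, y in zip(xs, ys)) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b


def predict(w, b, x):
    """Apply the learned mapping to a new input."""
    return w * x + b


# Labeled training pairs generated from y = 2x + 1 (made-up data).
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]

w, b = train_linear(xs, ys)
print(predict(w, b, 5.0))  # prediction for an input not seen in training
```

A deep network replaces the single unit with many stacked layers, but the training loop has the same shape: compute a prediction, measure the error against the known output, and adjust the parameters to reduce it.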

Let me explain the ConvNet (convolutional neural network) in simple terms for beginners (myself included), with reference to Fig. 1, a deep convolutional neural network for image classification.

  1. Feed-forward feature extraction
     1. Convolve the input with learned filters.
     2. Non-linearity: the ReLU (Rectified Linear Unit) is preferred over tanh (hyperbolic tangent) and sigmoid because it avoids saturation for positive inputs, simplifies backpropagation, and makes learning several times faster.
     3. Spatial pooling: takes the sum or the max over local regions. Max pooling, for example, partitions the input image into a set of non-overlapping rectangles and, for each such sub-region, outputs the maximum.
     4. Normalization: normalizes responses across feature maps or across pixel positions, before or after spatial pooling.
  2. Supervised training of the convolutional filters by back-propagating the classification error

Recurrent neural networks come in several variants:

  1. Elman and Jordan networks: synchronous, fixed recurrent weights, trained with backpropagation
  2. Hopfield networks: fully connected graph, asynchronous, fixed points of a dynamical system
  3. Liquid state machines and echo state networks
  4. Fully recurrent networks and recursive neural networks
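The feed-forward feature extraction steps of the ConvNet outline above (convolution, ReLU, max pooling) can be sketched in plain Python. This is only an illustration under stated assumptions: the 5x5 image and the 2x2 filter are made-up values (a real filter would be learned by backpropagation), the function names are hypothetical, and the normalization step is left out for brevity.

```python
# Minimal sketch of one ConvNet feature-extraction stage in pure Python.
# Image and filter values are made up; a real filter would be learned.

def convolve2d(image, kernel):
    """'Valid' 2-D cross-correlation, which DL libraries call convolution."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[i + di][j + dj] * kernel[di][dj]
                for di in range(kh) for dj in range(kw))
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]


def relu(fmap):
    """Non-linearity: negative responses are clipped to zero."""
    return [[max(0.0, v) for v in row] for row in fmap]


def max_pool(fmap, size=2):
    """Maximum over non-overlapping size x size blocks (ragged edges dropped)."""
    rows = len(fmap) // size
    cols = len(fmap[0]) // size
    return [
        [
            max(fmap[i * size + di][j * size + dj]
                for di in range(size) for dj in range(size))
            for j in range(cols)
        ]
        for i in range(rows)
    ]


# Hypothetical 5x5 grayscale image and a 2x2 diagonal-difference filter.
image = [
    [1, 2, 0, 1, 3],
    [0, 1, 2, 3, 1],
    [1, 0, 1, 2, 2],
    [2, 1, 0, 1, 0],
    [0, 3, 1, 0, 1],
]
kernel = [[1, 0], [0, -1]]

# 5x5 image -> 4x4 feature map -> ReLU -> 2x2 pooled feature map.
features = max_pool(relu(convolve2d(image, kernel)))
print(features)
```

In a full ConvNet this stage is repeated several times, each with many filters, and the final feature maps feed a classifier whose error is back-propagated to update the filters.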

DL tools by programming languages

  1. Python: Caffe, Theano, TensorFlow, Keras, Lasagne, etc.
  2. C
  3. Java: Encog, Deeplearning4j, etc.
  4. C++: eblearn, Intel® Deep Learning Framework, etc.
  5. C#: CNTKSharp
  6. R: h2o, darch, deepnet, TensorFlow, Keras, etc.
  7. JavaScript: ConvNetJS
  8. PHP
  9. Go: TensorFlow
  10. Swift: Swift-AI