
Goal

We will work through an example of classifying surface defects in steel from images, which in this case were obtained from the Northeastern University (NEU) site:

http://faculty.neu.edu.cn/yunhyan/NEU_surface_defect_database.html

The idea is to build a machine learning algorithm such that, given captured images of the steel, it determines what type of defect each one shows.

About the images

“In the Northeastern University (NEU) surface defect database, six kinds of typical surface defects of the hot-rolled steel strip are collected, i.e., rolled-in scale (RS), patches (Pa), crazing (Cr), pitted surface (PS), inclusion (In) and scratches (Sc). The database includes 1,800 grayscale images: 300 samples each of six different kinds of typical surface defects.”

For the defect detection task, XML annotation files are provided indicating the class and location of the defect in each image.

An example of each defect is shown below, with rectangles added according to the respective annotations:

crazing_265.jpg (Song and Yan, 2013)

inclusion_280.jpg (Song and Yan, 2013)

patches_292.jpg (Song and Yan, 2013)

pitted_surface_233.jpg (Song and Yan, 2013)

rolled-in_scale_292.jpg (Song and Yan, 2013)

scratches_292_marked.jpg (Song and Yan, 2013)

First, a task called "feature engineering" is carried out to extract features from the raw data, in this case pixel values, via data mining techniques. These features are then used as inputs to classification machine learning algorithms.

Then an algorithm is implemented whose input is the result of the feature engineering stage.

Feature engineering:

The feature engineering stage plays a very important role in the accuracy of our algorithm. It consists of extracting and organizing the information contained in the images. Images are made up of pixels, which form a bitmap. A color image is composed of three channels (red, green and blue), and each triplet of red, green and blue values is a single pixel. Each channel is a matrix of values covering the whole image, with each value in the range 0 to 255, where 0 means the point has no brightness and 255 means maximum brightness at that point.

In short, it is about converting the information in the image into a numerical array.
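As a minimal sketch in Python (the file name is illustrative; the NEU images are grayscale, so they are converted to RGB here only to show the three channels):

```python
import numpy as np
from PIL import Image

# Load an image as a numpy array. A color image has shape
# (height, width, 3); each pixel is an (R, G, B) triplet in 0..255.
img = np.array(Image.open("patches_292.jpg").convert("RGB"))
print(img.shape)   # e.g. (200, 200, 3)
print(img[0, 0])   # the RGB triplet of the top-left pixel
```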

There are numerous feature extraction techniques: applying filters with different kernels, such as the Gabor filter; using histograms; or segmenting the image with machine learning algorithms such as k-means, Watershed, Random Forest or neural networks.

By way of illustration, we show one of the images after a simple naive thresholding: if a pixel value exceeds a threshold, it is replaced by 0, otherwise by 1.

As can be seen with the naked eye, the idea is to help the classifier algorithm improve its precision, in this case by enhancing the contrast and simplifying the distribution of pixel values.
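A minimal sketch of that thresholding with NumPy (the threshold value and file name are assumptions):

```python
import numpy as np
from PIL import Image

# Open the image in grayscale mode ("L") and binarize it: pixels above
# the threshold become 0, the rest 1, as described above.
img = np.array(Image.open("crazing_265.jpg").convert("L"))
threshold = 128  # assumed value; it would be tuned per image
binary = np.where(img > threshold, 0, 1).astype(np.uint8)
```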

In this case we will use the pretrained VGG16 network from the Keras package in Python for feature extraction [https://keras.io/api/applications/vgg/], without including the three fully connected layers at the top of the network. That way the network does not classify, and its output can instead serve as features for the classification stage.

In other words, we take a neural network that is fully capable of classifying an image and use it only to obtain the input values for our next step, the classifier.

The network's pre-training will be reused as-is, so it will not be trained further. This concept, known as transfer learning, consists of using models trained on one problem as the starting point for a related problem.
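A sketch of this setup with Keras; it assumes the NEU images are resized to 224x224 and replicated to three channels, since VGG16 expects RGB input:

```python
import numpy as np
from tensorflow.keras.applications.vgg16 import VGG16, preprocess_input

# Load VGG16 pre-trained on ImageNet without the top fully connected
# layers (include_top=False), so it outputs feature maps, not classes.
base_model = VGG16(weights="imagenet", include_top=False,
                   input_shape=(224, 224, 3))

# Freeze every layer: the pre-trained weights are used as-is
# (transfer learning) and nothing is trained further.
for layer in base_model.layers:
    layer.trainable = False
base_model.summary()  # the architecture summary reproduced below

# Stand-in for the real dataset: the NEU images would be resized to
# 224x224, replicated to three channels and stacked into (n, 224, 224, 3).
images = np.random.rand(4, 224, 224, 3) * 255

features = base_model.predict(preprocess_input(images))  # (n, 7, 7, 512)
features = features.reshape(features.shape[0], -1)       # flatten to (n, 25088)
```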

Below is a summary of its architecture.

[VGG16 architecture summary]

Classifier algorithm:

There are numerous algorithms that can be used: the k-nearest neighbors classifier (KNN), the support vector machine (SVM), etc., as well as a wide variety of neural networks (https://keras.io/api/applications/). In this case we will use the XGBoost classifier (https://xgboost.readthedocs.io/en/latest/python/python_api.html).
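A minimal sketch of the training call; `X_train` and `y_train` are assumed to come from the train/test split described further below, with the six defect classes encoded as integers 0 to 5:

```python
from xgboost import XGBClassifier

# Fit the boosted-tree classifier on the flattened VGG16 features.
# Hyperparameters are left at their defaults here.
clf = XGBClassifier()
clf.fit(X_train, y_train)
```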

It is a gradient-boosted decision tree algorithm. Below, one of the trees obtained from the fitted model is shown, just to see how the branches develop.
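Such a plot can be produced with xgboost's plotting helper (it requires the graphviz package; `clf` is the fitted classifier from the sketch above):

```python
import matplotlib.pyplot as plt
from xgboost import plot_tree

# Draw the first tree of the boosted ensemble.
plot_tree(clf, num_trees=0)
plt.gcf().set_size_inches(20, 10)
plt.show()
```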

XGBoost Classifier tree

 

The idea of combining VGG16 as a feature extractor and XGBoost as a classifier was obtained from Mr. Sreenivas Bhattiprolu's DigitalSreeni YouTube channel.

From the set of images, 180 (30 for each category) were set aside to test the accuracy of our algorithm; the rest were used for training.
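A sketch of such a split with scikit-learn, assuming `features` and `labels` come from the extraction step above (test_size=0.1 gives 180 of the 1,800 images):

```python
from sklearn.model_selection import train_test_split

# Hold out 10% of the data (180 of the 1,800 images); stratifying on the
# labels keeps 30 test images for each of the six defect classes.
X_train, X_test, y_train, y_test = train_test_split(
    features, labels, test_size=0.1, stratify=labels, random_state=42)
```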

In the confusion matrix, the real classes appear on the y axis and the predicted classes on the x axis. The light cells contain the counts of defects that were predicted correctly, and the dark cells the observations that were not.
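A sketch of how the matrix can be computed and drawn (a seaborn heatmap is one plotting option among many):

```python
from sklearn.metrics import accuracy_score, confusion_matrix
import matplotlib.pyplot as plt
import seaborn as sns

# Evaluate on the held-out test images.
y_pred = clf.predict(X_test)
print("Accuracy:", accuracy_score(y_test, y_pred))

# Rows hold the real classes (y axis), columns the predicted (x axis).
cm = confusion_matrix(y_test, y_pred)
sns.heatmap(cm, annot=True, fmt="d")
plt.ylabel("Real class")
plt.xlabel("Predicted class")
plt.show()
```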

Confusion matrix

 

The accuracy obtained over the test images was 95.55%; the corresponding confusion matrix is shown above.

Acknowledgements

To Mr. Sreenivas Bhattiprolu and his DigitalSreeni YouTube channel, for his contribution to the development of data science.

We thank Kechen Song, Yunhui Yan and Northeastern University (NEU) for the work carried out in their research and for making their data available.

Bibliography:

K. Song and Y. Yan (2013), "A noise robust method based on completed local binary patterns for hot-rolled steel strip surface defects," Applied Surface Science, vol. 285, pp. 858-864, Nov. 2013. http://www.sciencedirect.com/science/article/pii/S0169433213016437

Y. He, K. Song, Q. Meng and Y. Yan (2020), "An End-to-end Steel Surface Defect Detection Approach via Fusing Multiple Hierarchical Features," IEEE Transactions on Instrumentation and Measurement, vol. 69, no. 4, pp. 1493-1504. https://ieeexplore.ieee.org/document/8709818

H. Dong, K. Song, Y. He, J. Xu, Y. Yan and Q. Meng (2020), "PGA-Net: Pyramid Feature Fusion and Global Context Attention Network for Automated Surface Defect Detection," IEEE Transactions on Industrial Informatics, 2020. https://ieeexplore.ieee.org/document/8930292