Glass type recognition

Bartek Bielski
2020-04-26

The study of glass type classification was motivated by criminological investigation. Glass left at a crime scene can be used as evidence, provided it is correctly identified.

The dataset was created by B. German at the Central Research Establishment, Home Office Forensic Science Service, in the UK. The application is meant to recognize the type of glass on the basis of user input (the chemical test results of a sample). A typical vector of test results for a glass sample is: (1.51, 13.00, 3.5, 2, 72, 0.6, 8, 0, 0.1)
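As a minimal sketch of how such a vector could be handed to a model in R, the values can be wrapped in a one-row data frame. The column order (RI, Na, Mg, Al, Si, K, Ca, Ba, Fe) is an assumption based on the Glass dataset from mlbench, and the name `sample_input` is hypothetical.

```r
# Column names follow the Glass dataset in mlbench (assumed order of the vector above)
glass_vars <- c("RI", "Na", "Mg", "Al", "Si", "K", "Ca", "Ba", "Fe")

# The typical test result vector quoted in the text
typical <- c(1.51, 13.00, 3.5, 2, 72, 0.6, 8, 0, 0.1)

# One-row data frame with named numeric columns (hypothetical name)
sample_input <- as.data.frame(as.list(setNames(typical, glass_vars)))
print(sample_input)
```

A data frame shaped like this is what `predict()` expects as `newdata` for a model trained on the Glass predictors.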

Dataset description (1)

The dataset consists of 214 observations with 9 predictor variables and a glass type code. The predictors are the refractive index (RI) and the content of eight elements (Na, Mg, Al, Si, K, Ca, Ba, Fe), each measured as weight percent in the corresponding oxide.

library(mlbench)
data(Glass)
summary(Glass)

       RI              Na              Mg              Al       
 Min.   :1.511   Min.   :10.73   Min.   :0.000   Min.   :0.290  
 1st Qu.:1.517   1st Qu.:12.91   1st Qu.:2.115   1st Qu.:1.190  
 Median :1.518   Median :13.30   Median :3.480   Median :1.360  
 Mean   :1.518   Mean   :13.41   Mean   :2.685   Mean   :1.445  
 3rd Qu.:1.519   3rd Qu.:13.82   3rd Qu.:3.600   3rd Qu.:1.630  
 Max.   :1.534   Max.   :17.38   Max.   :4.490   Max.   :3.500  
       Si              K                Ca               Ba       
 Min.   :69.81   Min.   :0.0000   Min.   : 5.430   Min.   :0.000  
 1st Qu.:72.28   1st Qu.:0.1225   1st Qu.: 8.240   1st Qu.:0.000  
 Median :72.79   Median :0.5550   Median : 8.600   Median :0.000  
 Mean   :72.65   Mean   :0.4971   Mean   : 8.957   Mean   :0.175  
 3rd Qu.:73.09   3rd Qu.:0.6100   3rd Qu.: 9.172   3rd Qu.:0.000  
 Max.   :75.41   Max.   :6.2100   Max.   :16.190   Max.   :3.150  
       Fe          Type  
 Min.   :0.00000   1:70  
 1st Qu.:0.00000   2:76  
 Median :0.00000   3:17  
 Mean   :0.05701   5:13  
 3rd Qu.:0.10000   6: 9  
 Max.   :0.51000   7:29

Dataset description (2)

Types of glass (class attribute):

building_windows_float_processed (1)
building_windows_non_float_processed (2)
vehicle_windows_float_processed (3)
vehicle_windows_non_float_processed (4) (none in this database)
containers (5)
tableware (6)
headlamps (7)
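The class codes above can be kept in a small lookup table. The sketch below is an illustration only: the names `glass_types`, `counts` and `shares` are hypothetical, and the counts are copied from the `summary(Glass)` output shown earlier.

```r
# Lookup table from class code to glass type (code 4 has no observations)
glass_types <- c(
  "1" = "building_windows_float_processed",
  "2" = "building_windows_non_float_processed",
  "3" = "vehicle_windows_float_processed",
  "5" = "containers",
  "6" = "tableware",
  "7" = "headlamps"
)

# Class counts as reported by summary(Glass)
counts <- c("1" = 70, "2" = 76, "3" = 17, "5" = 13, "6" = 9, "7" = 29)

# Share of each class in percent, labelled with the readable type name
shares <- round(100 * counts / sum(counts), 1)
names(shares) <- glass_types[names(counts)]
print(shares)
```

The classes are imbalanced: the two building-window classes account for 146 of the 214 observations, which is worth keeping in mind when reading per-class results.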

Machine learning algorithm used in the recognition model

Glass recognition is a typical classification problem, so the random forest algorithm seems appropriate. For presentation purposes, the model-building code below uses “light” settings; in the Shiny app, the model was trained with more advanced options.

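library(mlbench)  # provides the Glass dataset
library(caret)    # createDataPartition(), train(), confusionMatrix()

set.seed(56789)   # make the split and the model reproducible
data(Glass)

# Stratified 70/30 split into training and test sets
trainIndex <- createDataPartition(Glass$Type, p = 0.7, list = FALSE)
trainData <- Glass[trainIndex, ]
testData <- Glass[-trainIndex, ]

# Random forest with caret's default resampling and tuning ("light" settings)
rf <- train(Type ~ ., data = trainData, method = "rf", metric = "Accuracy")

# Evaluate on the held-out test set
tres <- predict(rf, newdata = testData)
model.summary <- confusionMatrix(tres, testData$Type)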

Model summary

The overall accuracy is about 70%. Sensitivity varies between classes, from 50% up to 85%, and specificity ranges from 72% to 100%. With such properties, the model's output can be treated as a clue, but not as evidence in a court case.
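For readers unfamiliar with per-class figures, the sketch below shows how sensitivity and specificity can be derived from a multiclass confusion table in a one-vs-rest fashion. The labels are made-up toy values, not results from the glass model; `caret::confusionMatrix` reports the same quantities in its `byClass` element.

```r
# Toy labels for illustration only (not output of the glass model)
lv <- c("1", "2", "7")
actual    <- factor(c("1", "1", "1", "2", "2", "2", "7", "7"), levels = lv)
predicted <- factor(c("1", "1", "2", "2", "2", "1", "7", "7"), levels = lv)

# Rows: predicted class, columns: actual class
cm <- table(Predicted = predicted, Actual = actual)

tp <- diag(cm)               # correctly predicted, per class
fn <- colSums(cm) - tp       # actual members of the class that were missed
fp <- rowSums(cm) - tp       # other classes wrongly assigned to the class
tn <- sum(cm) - tp - fn - fp # everything else

sensitivity <- tp / (tp + fn)  # share of each class that was recognized
specificity <- tn / (tn + fp)  # share of the other classes correctly rejected
accuracy    <- sum(tp) / sum(cm)

print(round(cbind(sensitivity, specificity), 2))
print(accuracy)
```

Overall accuracy alone can hide weak classes, which is why the per-class values are reported alongside it.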

All files (ui.R, server.R and this presentation) are available in the following repository: https://github.com/bartriman/DataProducts_week4