Image Classification on Brain Tumor MRI Data

Margareth Devina

2021-03-13

Introduction

This RMarkdown was created to build our knowledge and experience by developing a deep learning model that classifies, or labels, the images in our dataset.

We will use the Brain Tumor Classification (MRI) data from Kaggle (https://www.kaggle.com/sartajbhuvaji/brain-tumor-classification-mri). The data already comes split into train and test sets, and the images are classified into 4 categories: glioma tumor, meningioma tumor, no tumor, pituitary tumor.

We use the Brain Tumor data because:

  • Brain tumors vary widely and are difficult to identify.
  • Tumor size and location vary greatly from case to case.
  • In developing countries especially, a shortage of skilled doctors and limited knowledge about tumors make it challenging and time-consuming to generate reports from MRI scans.

Thus, a system that performs detection and classification using a deep learning algorithm such as a Convolutional Neural Network (CNN) would be helpful to doctors all around the world.

Library Used

library(tidyverse)
library(imager)
library(keras)
library(caret)

Import Data and Exploratory Data Analysis

Generally, in image classification, the images that make up the training data are already segregated and saved by label. For example, the images in the “Brain Tumor MRI” data are separated into 4 folders: glioma tumor, meningioma tumor, no tumor, pituitary tumor.

If we open the glioma_tumor folder, we won’t find structured or tabular data, only images of glioma tumor MRI results. Thus, we need to extract those images ourselves.

First, we will extract the folder names inside the training folder.

folder_list <- list.files("data/training/")

folder_list
## [1] "glioma_tumor"     "meningioma_tumor" "no_tumor"         "pituitary_tumor"

Then we will combine each folder name, as shown above, with the training folder path (“data/training/”).

folder_path <- paste0("data/training/", folder_list, "/")

folder_path
## [1] "data/training/glioma_tumor/"     "data/training/meningioma_tumor/"
## [3] "data/training/no_tumor/"         "data/training/pituitary_tumor/"

We use the map() function to collect the file names from each folder (glioma tumor, meningioma tumor, no tumor, and pituitary tumor). Because map() returns a list, we apply unlist() so that all file names are collected into a single vector, which we save in the file_name object.

# Get file name
file_name <- map(folder_path, 
                 function(x) paste0(x, list.files(x))
                 ) %>% 
  unlist()

Let’s see the first 6 entries of the file_name object.

# first 6 file name
head(file_name)
## [1] "data/training/glioma_tumor/gg (1).jpg"  
## [2] "data/training/glioma_tumor/gg (10).jpg" 
## [3] "data/training/glioma_tumor/gg (100).jpg"
## [4] "data/training/glioma_tumor/gg (101).jpg"
## [5] "data/training/glioma_tumor/gg (102).jpg"
## [6] "data/training/glioma_tumor/gg (103).jpg"

We will also check the last 6 entries of the file_name object.

# last 6 file name
tail(file_name)
## [1] "data/training/pituitary_tumor/p (94).jpg"
## [2] "data/training/pituitary_tumor/p (95).jpg"
## [3] "data/training/pituitary_tumor/p (96).jpg"
## [4] "data/training/pituitary_tumor/p (97).jpg"
## [5] "data/training/pituitary_tumor/p (98).jpg"
## [6] "data/training/pituitary_tumor/p (99).jpg"

Then, we can check the total number of images that we will use.

length(file_name)
## [1] 3303

To view the images listed in the file_name object, we can use the load.image() function from the imager package. Now, let’s look at 6 random samples from the data saved in the file_name object.

# Randomly select image
set.seed(99)
sample_image <- sample(file_name, 6)

# Load image into R
img <- map(sample_image, load.image)

# Plot image
par(mfrow = c(2, 3)) # Create 2 x 3 image grid
map(img, plot)

## [[1]]
## Image. Width: 512 pix Height: 512 pix Depth: 1 Colour channels: 3 
## 
## [[2]]
## Image. Width: 224 pix Height: 262 pix Depth: 1 Colour channels: 3 
## 
## [[3]]
## Image. Width: 800 pix Height: 693 pix Depth: 1 Colour channels: 3 
## 
## [[4]]
## Image. Width: 512 pix Height: 512 pix Depth: 1 Colour channels: 3 
## 
## [[5]]
## Image. Width: 512 pix Height: 512 pix Depth: 1 Colour channels: 3 
## 
## [[6]]
## Image. Width: 512 pix Height: 512 pix Depth: 1 Colour channels: 3

Check Image Dimension

To create a good deep learning model, we need to know the distribution of image dimensions across the images that we will use as the basis for the model. Let’s take a look at the first image’s description.

# Full Image Description
img <- load.image(file_name[1])
img
## Image. Width: 512 pix Height: 512 pix Depth: 1 Colour channels: 3

From the chunk above, we can get the image’s width and height in pixels. The colour channels identify the image’s colour format: a grayscale image has 1 colour channel, while an RGB (Red Green Blue) image has 3.
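As a quick aside, a minimal sketch with imager confirms this: spectrum() reports the number of colour channels, and grayscale() collapses an RGB image to a single channel.

# Sketch: inspect and convert colour channels with imager
img_rgb <- load.image(file_name[1])
spectrum(img_rgb)            # 3 colour channels (RGB)
spectrum(grayscale(img_rgb)) # 1 colour channel after conversion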

To collect only the dimensions, we can use the dim() function, which returns the width, height, depth, and channel values.

# Image Dimension
dim(img)
## [1] 512 512   1   3

In the chunk below, we will create a function to collect each image’s width and height and return them as a data frame.

get_dim <- function(x){
  img <- load.image(x) 
  
  df_img <- data.frame(height = height(img),
                       width = width(img),
                       filename = x
                       )
  
  return(df_img)
}

get_dim(file_name[1])
##   height width                              filename
## 1    512   512 data/training/glioma_tumor/gg (1).jpg

Using the above function, we can analyze the dimensions of 1000 samples from the file_name object.

# Randomly get 1000 sample images
set.seed(123)
sample_file <- sample(file_name, 1000)

# Run the get_dim() function for each image
file_dim <- map_df(sample_file, get_dim)

head(file_dim, 10)
##    height width                                     filename
## 1     236   236         data/training/no_tumor/image(87).jpg
## 2     512   512     data/training/pituitary_tumor/p (13).jpg
## 3     275   220 data/training/no_tumor/image(238) - Copy.jpg
## 4     512   512      data/training/glioma_tumor/gg (572).jpg
## 5     512   512      data/training/glioma_tumor/gg (274).jpg
## 6     512   512    data/training/pituitary_tumor/p (558).jpg
## 7     236   236 data/training/no_tumor/image (52) - Copy.jpg
## 8     512   512    data/training/meningioma_tumor/m1(23).jpg
## 9     512   512  data/training/meningioma_tumor/m2 (123).jpg
## 10    512   512  data/training/meningioma_tumor/m2 (137).jpg
summary(file_dim)
##      height           width          filename        
##  Min.   : 198.0   Min.   : 200.0   Length:1000       
##  1st Qu.: 512.0   1st Qu.: 512.0   Class :character  
##  Median : 512.0   Median : 512.0   Mode  :character  
##  Mean   : 471.7   Mean   : 470.8                     
##  3rd Qu.: 512.0   3rd Qu.: 512.0                     
##  Max.   :1446.0   Max.   :1375.0

From the 1000 samples of the file_name object, we can see that the image dimensions vary: heights range from 198 to 1446 pixels and widths from 200 to 1375 pixels. Understanding this variance is helpful for the pre-processing stage, because we need to set fixed height and width dimensions; every image must be resized to the same dimensions before we can train the model properly.
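To make this variance easier to see, an optional sketch with ggplot2 (loaded via tidyverse) plots the sampled widths against the heights:

# Sketch: visualize the dimension spread of the 1000 sampled images
file_dim %>% 
  ggplot(aes(x = width, y = height)) +
  geom_point(alpha = 0.3) +
  labs(x = "Width (px)", y = "Height (px)",
       title = "Dimensions of 1000 sampled images")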

Data Pre-processing

We will perform data augmentation as part of pre-processing.

Data Augmentation

To give every image the same height and width, we can resize all images to a fixed size, for example 64 x 64 pixels (height = 64 pixels and width = 64 pixels). Keep in mind the trade-off: the larger the height and width we choose, the more information we retain but the longer the model takes to train; the smaller the dimensions, the more information we lose but the faster the training.
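As an illustration only (the resizing in our pipeline is handled later by the keras generator), imager’s resize() can shrink a single image to such a size:

# Sketch: resize one image to 64 x 64 with imager
img_small <- resize(load.image(file_name[1]), size_x = 64, size_y = 64)
dim(img_small) # width and height are now 64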

The height and width we will actually use are 200 x 200 pixels. Do not forget to set the batch size as well, so that the model’s weights are updated after each batch during training. Keep in mind that the smaller the batch size, the longer the training process (because there are more weight-update steps per epoch), but it also keeps the computation per step manageable. For this project, we set the batch size to 100.

# Desired height and width of images
target_size <- c(200, 200)

# Batch size for training the model
batch_size <- 100

Since the training dataset does not contain many images, we will create artificial data using image augmentation. Image augmentation effectively increases the size of the training dataset without requiring additional images: the model learns not only from the original images but also from modified versions of them. These modifications can include flipped, rotated, zoomed (in or out), and cropped images, among others. By studying both the original and modified data, our model should generalize better.
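For intuition, a small sketch with imager previews what two such modifications look like; the augmentation used in training is performed by the keras generator below, not by this code.

# Sketch: preview two augmentation-style modifications with imager
img <- load.image(file_name[1])
par(mfrow = c(1, 2))    # side-by-side grid
plot(mirror(img, "x"))  # flipped horizontally
plot(imrotate(img, 30)) # rotated 30 degrees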

We can perform image augmentation using the image data generator from the keras library, applying the following settings:

  • Scale the pixel values by dividing them by 255
  • Flip the image horizontally
  • Flip the image vertically
  • Rotate the image by 0 to 45 degrees
  • Zoom in or out by up to 25% (i.e. zoom factor between 75% and 125%)
  • Use 20% of the data as the validation dataset (to test the model trained on the remaining 80%)
# Image Generator
train_data_gen <- image_data_generator(rescale = 1/255, # Scaling pixel value
                                       horizontal_flip = T, # Flip image horizontally
                                       vertical_flip = T, # Flip image vertically 
                                       rotation_range = 45, # Rotate image from 0 to 45 degrees
                                       zoom_range = 0.25, # Zoom in or zoom out range
                                       validation_split = 0.2 # 20% data as validation data
                                       )

After setting up the image data generator in the previous chunk, we can feed our image data into it using the flow_images_from_directory() function. Because the data is saved in the training folder inside the data folder, the directory must be set to data/training/. We will use RGB color mode, the target size and batch size defined in the previous chunk, and a random seed of 123. From this process we obtain augmented images for both the training data and the validation data.

# Training Dataset
train_image_array_gen <- flow_images_from_directory(directory = "data/training/", # Folder of the data
                                                    target_size = target_size, # target image dimensions (200 x 200)
                                                    color_mode = "rgb", # use RGB color
                                                    batch_size = batch_size , 
                                                    seed = 123,  # set random seed
                                                    subset = "training", # declare that this is for training data
                                                    generator = train_data_gen
                                                    )

# Validation Dataset
val_image_array_gen <- flow_images_from_directory(directory = "data/training/",
                                                  target_size = target_size, 
                                                  color_mode = "rgb", 
                                                  batch_size = batch_size ,
                                                  seed = 123,
                                                  subset = "validation", # declare that this is the validation data
                                                  generator = train_data_gen
                                                  )

Then, using the chunk below, we can get the following information:

  • Number of training samples, saved into train_samples object.
  • Number of validation samples, saved into valid_samples object.
  • Number of target classes/categories, saved into output_n object.
  • Target variable class proportion.
# Number of training samples
train_samples <- train_image_array_gen$n

# Number of validation samples
valid_samples <- val_image_array_gen$n

# Number of target classes/categories
output_n <- n_distinct(train_image_array_gen$classes)

# Get the class proportion
table("\nFrequency" = factor(train_image_array_gen$classes)
      ) %>% 
  prop.table()
## 
## Frequency
##         0         1         2         3 
## 0.2500000 0.2488654 0.2507564 0.2503782
# glioma_tumor(0)
# meningioma_tumor(1)
# no_tumor(2)
# pituitary_tumor(3)

The indices above represent the target variable’s labels or classes, ordered alphabetically (0 = glioma_tumor, 1 = meningioma_tumor, 2 = no_tumor, 3 = pituitary_tumor).
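We can also cross-check this mapping from the generator itself, which exposes the label-to-index assignment it uses:

# Cross-check the label-to-index mapping used by the generator
train_image_array_gen$class_indices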

Based on the table above, the class proportions are quite balanced, so the accuracy metric in confusionMatrix() can be used to evaluate our model. Moreover, since we also want the model to predict precisely, we will use the positive predictive value (Pos Pred Value, i.e. precision) as a second metric from confusionMatrix().

Model Architecture

To create the model, we will use a Convolutional Neural Network (CNN), built around convolutional layers. The benefit of treating an image as a 2D array is that we can extract spatial features from it, such as the shape of a nose, the shape of eyes, a hand, or the tumor’s location and size.

We will build a simple model first with the following layer:

  • 1 convolutional layer to extract features from the 2D image, with relu activation function.
  • 1 max pooling layer to downsample the image features.
  • 1 flattening layer to flatten the data from a 2D array to a 1D array so it can be processed further by the dense layers and then by the output layer.
  • 1 dense layer to capture more information from the flattening layer.
  • 1 dense layer for the output, with softmax activation function.

Don’t forget to set the input shape in the first layer. If the input image is RGB, set the colour channels to 3; if it is grayscale, set them to 1.

# input shape of the image
c(target_size, 3) 
## [1] 200 200   3
# Set Initial Random Weight
initializer <- initializer_random_normal(seed = 100)

model <- keras_model_sequential(name = "simple_model") %>% 
  
  # Convolution Layer
  layer_conv_2d(filters = 16,
                kernel_size = c(3,3),
                padding = "same",
                activation = "relu",
                kernel_initializer = initializer,
                bias_initializer = initializer,
                input_shape = c(target_size, 3) 
                ) %>% 

  # Max Pooling Layer
  layer_max_pooling_2d(pool_size = c(2,2)) %>% 
  
  # Flattening Layer
  layer_flatten() %>% 
  
  # Dense Layer
  layer_dense(units = 16,
              activation = "relu",
              kernel_initializer = initializer,
              bias_initializer = initializer) %>% 
  
  # Output Layer
  layer_dense(units = output_n,
              activation = "softmax",
              name = "Output",
              kernel_initializer = initializer,
              bias_initializer = initializer)
  
model
## Model
## Model: "simple_model"
## ________________________________________________________________________________
## Layer (type)                        Output Shape                    Param #     
## ================================================================================
## conv2d (Conv2D)                     (None, 200, 200, 16)            448         
## ________________________________________________________________________________
## max_pooling2d (MaxPooling2D)        (None, 100, 100, 16)            0           
## ________________________________________________________________________________
## flatten (Flatten)                   (None, 160000)                  0           
## ________________________________________________________________________________
## dense (Dense)                       (None, 16)                      2560016     
## ________________________________________________________________________________
## Output (Dense)                      (None, 4)                       68          
## ================================================================================
## Total params: 2,560,532
## Trainable params: 2,560,532
## Non-trainable params: 0
## ________________________________________________________________________________

As you can see, we start by feeding the 200 x 200 pixel image data into the convolutional layer, which has 16 filters and a 3 x 3 kernel, to extract features from the input image. Then we downsample by taking only the maximum value in each 2 x 2 pooling area, so the data shrinks to 100 x 100 pixels with 16 filters. After that, we flatten the 2D array into a 1D array with 160,000 nodes (100 x 100 x 16 = 160,000). Then we extract more information using a dense layer and finish by passing the information to the output layer, where the softmax activation function produces the probability of each class or category as the output.
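As a sanity check, the parameter counts in the summary above can be reproduced by hand:

# Sanity check: reproduce the parameter counts from the summary
(3 * 3 * 3 + 1) * 16 # conv2d: 3 x 3 kernel over 3 channels, plus bias, x 16 filters = 448
100 * 100 * 16       # flatten: 160,000 nodes after pooling
160000 * 16 + 16     # dense: 160,000 weights per node x 16 nodes, plus biases = 2,560,016
16 * 4 + 4           # Output: 16 weights per class x 4 classes, plus biases = 68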

Model Fitting

After setting up the model architecture, we compile the model by specifying the loss function and the optimizer before fitting the data. For multiclass classification, we will use categorical cross-entropy as the loss function, together with the adam optimizer at a learning rate of 0.005. We also set epochs = 20. The model will also be evaluated on the validation data from the generator.

model %>% 
  compile(
    loss = "categorical_crossentropy",
    optimizer = optimizer_adam(lr = 0.005),
    metrics = "accuracy"
  )

# Fit data into model

history <- model %>% 
  fit_generator(
  # training data
  train_image_array_gen,

  # training epochs
  steps_per_epoch = as.integer(train_samples / batch_size), 
  epochs = 20, 
  
  # validation data
  validation_data = val_image_array_gen,
  validation_steps = as.integer(valid_samples / batch_size)
)

plot(history)

Model Evaluation

Now we will evaluate the model further and build the confusion matrix using the validation data from the generator. First, we need to acquire the file names of the images used as validation data. From each file name, we extract the categorical label, which serves as the actual class of the target variable.

val_data <- data.frame(file_name = paste0("data/training/", val_image_array_gen$filenames)) %>% 
  mutate(class = str_extract(file_name, "glioma_tumor|meningioma_tumor|no_tumor|pituitary_tumor"))

head(val_data, 10)
##                                   file_name        class
## 1    data/training/glioma_tumor\\gg (1).jpg glioma_tumor
## 2   data/training/glioma_tumor\\gg (10).jpg glioma_tumor
## 3  data/training/glioma_tumor\\gg (100).jpg glioma_tumor
## 4  data/training/glioma_tumor\\gg (101).jpg glioma_tumor
## 5  data/training/glioma_tumor\\gg (102).jpg glioma_tumor
## 6  data/training/glioma_tumor\\gg (103).jpg glioma_tumor
## 7  data/training/glioma_tumor\\gg (104).jpg glioma_tumor
## 8  data/training/glioma_tumor\\gg (105).jpg glioma_tumor
## 9  data/training/glioma_tumor\\gg (106).jpg glioma_tumor
## 10 data/training/glioma_tumor\\gg (107).jpg glioma_tumor

Now we will load the images into R by converting each one into an array. Since our CNN model expects inputs of 200 x 200 pixels with 3 color channels (RGB), we convert the validation images to the same dimensions and channels. We convert the images manually into arrays rather than using the image generator because we want to predict on the original validation data from the training folder; if we used the image generator, the validation images would be transformed and would no longer reflect the original source images.

# Function to convert image to array
image_prep <- function(x) {
  arrays <- lapply(x, function(path) {
    img <- image_load(path, target_size = target_size, 
                      grayscale = F # Set FALSE if image is RGB
                      )
    
    x <- image_to_array(img)
    x <- array_reshape(x, c(1, dim(x)))
    x <- x/255 # rescale image pixel
  })
  do.call(abind::abind, c(arrays, list(along = 1)))
}
test_x <- image_prep(val_data$file_name)

# Check dimension of testing data set
dim(test_x)
## [1] 659 200 200   3

From the above chunk, we can see that the validation data consists of 659 images with dimensions of 200 x 200 pixels and 3 color channels (RGB). Now that the validation data is prepared, we can proceed to predict the label of each image using the CNN model we built before.

pred_test <- predict_classes(model, test_x) 

head(pred_test, 10)
##  [1] 0 0 0 0 0 0 0 0 0 0

For easier interpretation, we can convert the encoded predictions back into the class labels glioma_tumor, meningioma_tumor, no_tumor, and pituitary_tumor.

# Convert encoding to label
decode <- function(x){
  case_when(x == 0 ~ "glioma_tumor",
            x == 1 ~ "meningioma_tumor",
            x == 2 ~ "no_tumor",
            x == 3 ~ "pituitary_tumor"
            )
}

pred_test <- sapply(pred_test, decode) 

head(pred_test, 10)
##  [1] "glioma_tumor" "glioma_tumor" "glioma_tumor" "glioma_tumor" "glioma_tumor"
##  [6] "glioma_tumor" "glioma_tumor" "glioma_tumor" "glioma_tumor" "glioma_tumor"

Then, we can evaluate our model using the confusionMatrix() function.

confusionMatrix(as.factor(pred_test), 
                as.factor(val_data$class)
                )

history$metrics$accuracy[20]

Based on the confusion matrix, the validation accuracy is about 54% while the training accuracy is about 67%, which indicates the model is a bit overfit. Since we also want the model to predict precisely, we evaluate it with the Pos Pred Value (precision) metric. On this metric, the precision for the no_tumor class, around 82%, is much higher than for the other classes, which sit around 35% - 62%. Considering both the accuracy and the precision values, we should tune our model to improve them.
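The per-class precision quoted above can be read directly from the caret confusion matrix object; a minimal sketch:

# Sketch: per-class precision (Pos Pred Value) from caret
cm <- confusionMatrix(as.factor(pred_test), as.factor(val_data$class))
cm$byClass[, "Pos Pred Value"]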

Tuning the Model

Model Architecture

If we look back at our first model architecture, we can actually extract more information while the data is still a 2D image array. Only one CNN layer was tasked with extracting the general features of the image before the features were downsampled by the max pooling layer. Even after pooling, we still have a 100 x 100 array with plenty of information left to extract before flattening the data. Therefore, we can stack more CNN layers in the model so the tuned model captures more information.

model
## Model
## Model: "simple_model"
## ________________________________________________________________________________
## Layer (type)                        Output Shape                    Param #     
## ================================================================================
## conv2d (Conv2D)                     (None, 200, 200, 16)            448         
## ________________________________________________________________________________
## max_pooling2d (MaxPooling2D)        (None, 100, 100, 16)            0           
## ________________________________________________________________________________
## flatten (Flatten)                   (None, 160000)                  0           
## ________________________________________________________________________________
## dense (Dense)                       (None, 16)                      2560016     
## ________________________________________________________________________________
## Output (Dense)                      (None, 4)                       68          
## ================================================================================
## Total params: 2,560,532
## Trainable params: 2,560,532
## Non-trainable params: 0
## ________________________________________________________________________________

The following is our improved model architecture (model_big):

  • 1st convolutional layer to extract features from the 2D image with relu activation function, filters = 16, kernel size = 5 x 5.
  • Max pooling layer with pool size 2 x 2.
  • 2nd convolutional layer to extract features from the 2D image with relu activation function, filters = 32, kernel size = 5 x 5.
  • Max pooling layer with pool size 2 x 2.
  • 3rd convolutional layer to extract features from the 2D image with relu activation function, filters = 32, kernel size = 5 x 5.
  • Max pooling layer with pool size 2 x 2.
  • 4th convolutional layer to extract features from the 2D image with relu activation function, filters = 64, kernel size = 5 x 5.
  • Max pooling layer with pool size 2 x 2.
  • 5th convolutional layer to extract features from the 2D image with relu activation function, filters = 128, kernel size = 3 x 3.
  • Max pooling layer with pool size 2 x 2.
  • 6th convolutional layer to extract features from the 2D image with relu activation function, filters = 256, kernel size = 3 x 3.
  • Max pooling layer with pool size 2 x 2.
  • 1 flattening layer to flatten the data from a 2D array to a 1D array so it can be processed further by the dense layers and then by the output layer.
  • 1 dense layer to capture more information from the flattening layer, with relu activation function, nodes = 64.
  • 1 dense layer to capture more information from the previous dense layer, with relu activation function, nodes = 128.
  • 1 dense layer to capture more information from the previous dense layer, with relu activation function, nodes = 256.
  • 1 dense layer for the output, with softmax activation function.
initializer <- initializer_random_normal(seed = 123)

model_big <- keras_model_sequential() %>%

  # First Convolution Layer
  layer_conv_2d(filters = 16,
                kernel_size = c(5,5),
                padding = "same",
                activation = "relu",
                kernel_initializer = initializer,
                bias_initializer = initializer,
                input_shape = c(target_size, 3)
                ) %>%

  # Max Pooling Layer
  layer_max_pooling_2d(pool_size = c(2,2)) %>%

  # Second convolutional layer
  layer_conv_2d(filters = 32,
                kernel_size = c(5,5),
                padding = "same",
                activation = "relu",
                kernel_initializer = initializer,
                bias_initializer = initializer
                ) %>%

  # Max pooling layer
  layer_max_pooling_2d(pool_size = c(2,2)) %>%

  # Third convolutional layer
  layer_conv_2d(filters = 32,
                kernel_size = c(5,5),
                padding = "same",
                activation = "relu",
                kernel_initializer = initializer,
                bias_initializer = initializer
                ) %>%

  # Max pooling layer
  layer_max_pooling_2d(pool_size = c(2,2)) %>%

  # Fourth convolutional layer
  layer_conv_2d(filters = 64,
                kernel_size = c(5,5),
                padding = "same",
                activation = "relu",
                kernel_initializer = initializer,
                bias_initializer = initializer
                ) %>%

  # Max pooling layer
  layer_max_pooling_2d(pool_size = c(2,2)) %>%

  # Fifth convolutional layer
  layer_conv_2d(filters = 128,
                kernel_size = c(3,3),
                padding = "same",
                activation = "relu",
                kernel_initializer = initializer,
                bias_initializer = initializer
                ) %>%

  # Max pooling layer
  layer_max_pooling_2d(pool_size = c(2,2)) %>%

  # Sixth convolutional layer
  layer_conv_2d(filters = 256,
                kernel_size = c(3,3),
                padding = "same",
                activation = "relu",
                kernel_initializer = initializer,
                bias_initializer = initializer
                ) %>%

  # Max pooling layer
  layer_max_pooling_2d(pool_size = c(2,2)) %>%

  # Flattening Layer
  layer_flatten() %>%

  # Dense Layer
  layer_dense(units = 64,
              activation = "relu",
              kernel_initializer = initializer,
              bias_initializer = initializer) %>%
  
  # Dense Layer
  layer_dense(units = 128,
              activation = "relu",
              kernel_initializer = initializer,
              bias_initializer = initializer) %>%
  
  # Dense Layer
  layer_dense(units = 256,
              activation = "relu",
              kernel_initializer = initializer,
              bias_initializer = initializer) %>%

  # Output Layer
  layer_dense(units = output_n,
              activation = "softmax",
              name = "Output",
              kernel_initializer = initializer,
              bias_initializer = initializer)

model_big 
## Model
## Model: "sequential"
## ________________________________________________________________________________
## Layer (type)                        Output Shape                    Param #     
## ================================================================================
## conv2d_6 (Conv2D)                   (None, 200, 200, 16)            1216        
## ________________________________________________________________________________
## max_pooling2d_6 (MaxPooling2D)      (None, 100, 100, 16)            0           
## ________________________________________________________________________________
## conv2d_5 (Conv2D)                   (None, 100, 100, 32)            12832       
## ________________________________________________________________________________
## max_pooling2d_5 (MaxPooling2D)      (None, 50, 50, 32)              0           
## ________________________________________________________________________________
## conv2d_4 (Conv2D)                   (None, 50, 50, 32)              25632       
## ________________________________________________________________________________
## max_pooling2d_4 (MaxPooling2D)      (None, 25, 25, 32)              0           
## ________________________________________________________________________________
## conv2d_3 (Conv2D)                   (None, 25, 25, 64)              51264       
## ________________________________________________________________________________
## max_pooling2d_3 (MaxPooling2D)      (None, 12, 12, 64)              0           
## ________________________________________________________________________________
## conv2d_2 (Conv2D)                   (None, 12, 12, 128)             73856       
## ________________________________________________________________________________
## max_pooling2d_2 (MaxPooling2D)      (None, 6, 6, 128)               0           
## ________________________________________________________________________________
## conv2d_1 (Conv2D)                   (None, 6, 6, 256)               295168      
## ________________________________________________________________________________
## max_pooling2d_1 (MaxPooling2D)      (None, 3, 3, 256)               0           
## ________________________________________________________________________________
## flatten_1 (Flatten)                 (None, 2304)                    0           
## ________________________________________________________________________________
## dense_3 (Dense)                     (None, 64)                      147520      
## ________________________________________________________________________________
## dense_2 (Dense)                     (None, 128)                     8320        
## ________________________________________________________________________________
## dense_1 (Dense)                     (None, 256)                     33024       
## ________________________________________________________________________________
## Output (Dense)                      (None, 4)                       1028        
## ================================================================================
## Total params: 649,860
## Trainable params: 649,860
## Non-trainable params: 0
## ________________________________________________________________________________
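As a quick sanity check on the output shapes above: each 2 x 2 max pooling layer halves the feature map (rounding down), so six pooling layers shrink 200 x 200 down to 3 x 3 before flattening.

# Sanity check: feature map size after each successive 2 x 2 pooling
floor(200 / 2^(0:6))
# 200 100 50 25 12 6 3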

Model Fitting

After setting up the model architecture, we compile the model by specifying the loss function and the optimizer before fitting the data. For multiclass classification, we again use categorical cross-entropy as the loss function, but now with the adam optimizer at a learning rate of 0.001 (previously 0.005). We also set epochs = 60 (previously 20). The model will also be evaluated on the validation data from the generator.

model_big %>%
  compile(
    loss = "categorical_crossentropy",
    optimizer = optimizer_adam(lr = 0.001),
    metrics = "accuracy"
  )

history_big <- model_big %>%
  fit_generator(
  # training data
  train_image_array_gen,

  # epochs
  steps_per_epoch = as.integer(train_samples / batch_size),
  epochs = 60,

  # validation data
  validation_data = val_image_array_gen,
  validation_steps = as.integer(valid_samples / batch_size),

  # print progress but don't create graphic
  verbose = 1,
  view_metrics = 0
)

plot(history_big)

Model Evaluation

Now we will evaluate the tuned model and build the confusion matrix for the validation data.

pred_test_big <- predict_classes(model_big, test_x)
pred_test_big <- sapply(pred_test_big, decode)
head(pred_test_big)
## [1] "no_tumor" "no_tumor" "no_tumor" "no_tumor" "no_tumor" "no_tumor"
confusionMatrix(as.factor(pred_test_big), 
                as.factor(val_data$class)
                )

history_big$metrics$accuracy[60]

Based on the confusion matrix, the validation accuracy is about 90% while the training accuracy is about 97%, which indicates the model fits reasonably well since the accuracy gap is not too large. We also evaluate the model with the Pos Pred Value (precision) metric: the precision for every class is now good enough, all exceeding 70%. Considering both the accuracy and the precision values, the tuned model (model_big) has improved substantially compared to the previous one (model).
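For a side-by-side view, a small sketch pulls the final-epoch metrics from both history objects (assuming both models were trained in this session; metric names follow the history$metrics convention used above):

# Sketch: compare final-epoch accuracy of the two models
data.frame(model = c("model", "model_big"),
           train_acc = c(tail(history$metrics$accuracy, 1),
                         tail(history_big$metrics$accuracy, 1)),
           val_acc = c(tail(history$metrics$val_accuracy, 1),
                       tail(history_big$metrics$val_accuracy, 1)))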

Predict Data in Testing Dataset

After training the model and being satisfied with its performance on the validation dataset, we can evaluate it using the testing dataset provided on the same website where we obtained the training dataset.

The testing data is separated into one folder per class inside the testing2 folder.

Evaluating the model with meningioma_tumor testing dataset

To extract the images from the testing folder, we take the same approach as when extracting the training dataset.

meningioma_tumor_folder_path <- paste0("data/testing2/meningioma_tumor/")

meningioma_tumor_folder_path
## [1] "data/testing2/meningioma_tumor/"
# Get file name
meningioma_tumor_file_name <- map(meningioma_tumor_folder_path, 
                 function(x) paste0(x, list.files(x))
                 ) %>% 
  unlist()

# first 6 file name
head(meningioma_tumor_file_name)
## [1] "data/testing2/meningioma_tumor/image(1).jpg"  
## [2] "data/testing2/meningioma_tumor/image(10).jpg" 
## [3] "data/testing2/meningioma_tumor/image(100).jpg"
## [4] "data/testing2/meningioma_tumor/image(102).jpg"
## [5] "data/testing2/meningioma_tumor/image(106).jpg"
## [6] "data/testing2/meningioma_tumor/image(107).jpg"
# last 6 file name
tail(meningioma_tumor_file_name)
## [1] "data/testing2/meningioma_tumor/image(95).jpg"
## [2] "data/testing2/meningioma_tumor/image(96).jpg"
## [3] "data/testing2/meningioma_tumor/image(97).jpg"
## [4] "data/testing2/meningioma_tumor/image(98).jpg"
## [5] "data/testing2/meningioma_tumor/image(99).jpg"
## [6] "data/testing2/meningioma_tumor/image.jpg"
length(meningioma_tumor_file_name)
## [1] 115

Then we convert the images in the testing dataset into an array.

meningioma_tumor_y <- image_prep(meningioma_tumor_file_name)

dim(meningioma_tumor_y)
## [1] 115 200 200   3

The meningioma_tumor testing data consists of 115 images, each 200 x 200 pixels with 3 color channels (RGB).

Since the testing dataset is prepared, we can predict the label of each image using model_big, as this model has higher accuracy and precision. Then we summarize the frequency of each predicted label to check the accuracy of the result.

# Predict label on array
pred_test_meningioma_tumor <- predict_classes(model_big, meningioma_tumor_y)

# Convert encoding to label
decode <- function(x){
  case_when(x == 0 ~ "glioma_tumor",
            x == 1 ~ "meningioma_tumor",
            x == 2 ~ "no_tumor",
            x == 3 ~ "pituitary_tumor"
            )
}

# Create data submission
meningioma_tumor_result <- data.frame(id = meningioma_tumor_file_name,
                         label = sapply(pred_test_meningioma_tumor, decode)
                         ) %>%
  mutate(id = str_remove(id, "data/testing2/meningioma_tumor/")) # remove file path and only keep the file name

# check first 10 rows
head(meningioma_tumor_result, 10)

meningioma_tumor_result %>% 
  group_by(label) %>% 
  summarise(freq = n())

From the summarized prediction result, model_big predicts this class quite well: 109 of the 115 images (around 95% of the test images) are correctly predicted as meningioma_tumor.
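The same figure can be computed directly as the share of correct predictions (a one-line sketch; the same check applies to the other class folders below):

# Sketch: share of correct predictions for this folder
mean(meningioma_tumor_result$label == "meningioma_tumor")
# ~0.95 (109 / 115)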

Evaluating the model with glioma_tumor testing dataset

glioma_tumor_folder_path <- paste0("data/testing2/glioma_tumor/")

glioma_tumor_folder_path
## [1] "data/testing2/glioma_tumor/"
# Get file name
glioma_tumor_file_name <- map(glioma_tumor_folder_path, 
                 function(x) paste0(x, list.files(x))
                 ) %>% 
  unlist()

# first 6 file name
head(glioma_tumor_file_name)
## [1] "data/testing2/glioma_tumor/image (1).jpg"  
## [2] "data/testing2/glioma_tumor/image (10).jpg" 
## [3] "data/testing2/glioma_tumor/image (100).jpg"
## [4] "data/testing2/glioma_tumor/image (11).jpg" 
## [5] "data/testing2/glioma_tumor/image (12).jpg" 
## [6] "data/testing2/glioma_tumor/image (13).jpg"
# last 6 file name
tail(glioma_tumor_file_name)
## [1] "data/testing2/glioma_tumor/image (94).jpg"
## [2] "data/testing2/glioma_tumor/image (95).jpg"
## [3] "data/testing2/glioma_tumor/image (96).jpg"
## [4] "data/testing2/glioma_tumor/image (97).jpg"
## [5] "data/testing2/glioma_tumor/image (98).jpg"
## [6] "data/testing2/glioma_tumor/image (99).jpg"
length(glioma_tumor_file_name)
## [1] 100

Then we convert the images in the testing dataset into an array.

glioma_tumor_y <- image_prep(glioma_tumor_file_name)

dim(glioma_tumor_y)
## [1] 100 200 200   3

The glioma_tumor testing data consists of 100 images, each 200 x 200 pixels with 3 color channels (RGB).

Since the testing dataset is prepared, we can predict the label of each image using model_big, as this model has higher accuracy and precision. Then we summarize the frequency of each predicted label to check the accuracy of the result.

# Predict label on array
pred_test_glioma_tumor <- predict_classes(model_big, glioma_tumor_y)

# Convert encoding to label
decode <- function(x){
  case_when(x == 0 ~ "glioma_tumor",
            x == 1 ~ "meningioma_tumor",
            x == 2 ~ "no_tumor",
            x == 3 ~ "pituitary_tumor"
            )
}

# Create data submission
glioma_tumor_result <- data.frame(id = glioma_tumor_file_name,
                         label = sapply(pred_test_glioma_tumor, decode)
                         ) %>%
  mutate(id = str_remove(id, "data/testing2/glioma_tumor/")) # remove file path and only keep the file name

# check first 10 rows
head(glioma_tumor_result, 10)

glioma_tumor_result %>% 
  group_by(label) %>% 
  summarise(freq = n())

From the summarized prediction result, model_big predicts this class quite well: 96 of the 100 images (around 96% of the test images) are correctly predicted as glioma_tumor.

Evaluating the model with no_tumor testing dataset

no_tumor_folder_path <- paste0("data/testing2/no_tumor/")

no_tumor_folder_path
## [1] "data/testing2/no_tumor/"
# Get file name
no_tumor_file_name <- map(no_tumor_folder_path, 
                 function(x) paste0(x, list.files(x))
                 ) %>% 
  unlist()

# first 6 file name
head(no_tumor_file_name)
## [1] "data/testing2/no_tumor/image(1).jpg"  
## [2] "data/testing2/no_tumor/image(10).jpg" 
## [3] "data/testing2/no_tumor/image(100).jpg"
## [4] "data/testing2/no_tumor/image(101).jpg"
## [5] "data/testing2/no_tumor/image(102).jpg"
## [6] "data/testing2/no_tumor/image(103).jpg"
# last 6 file name
tail(no_tumor_file_name)
## [1] "data/testing2/no_tumor/image(95).jpg"
## [2] "data/testing2/no_tumor/image(96).jpg"
## [3] "data/testing2/no_tumor/image(97).jpg"
## [4] "data/testing2/no_tumor/image(98).jpg"
## [5] "data/testing2/no_tumor/image(99).jpg"
## [6] "data/testing2/no_tumor/image.jpg"
length(no_tumor_file_name)
## [1] 105

Then we convert the images in the testing dataset into an array.

no_tumor_y <- image_prep(no_tumor_file_name)

dim(no_tumor_y)
## [1] 105 200 200   3

The no_tumor testing data consists of 105 images, each 200 x 200 pixels with 3 color channels (RGB).

Since the testing dataset is prepared, we can predict the label of each image using model_big, as this model has higher accuracy and precision. Then we summarize the frequency of each predicted label to check the accuracy of the result.

# Predict label on array
pred_test_no_tumor <- predict_classes(model_big, no_tumor_y)

# Convert encoding to label
decode <- function(x){
  case_when(x == 0 ~ "glioma_tumor",
            x == 1 ~ "meningioma_tumor",
            x == 2 ~ "no_tumor",
            x == 3 ~ "pituitary_tumor"
            )
}

# Create data submission
no_tumor_result <- data.frame(id = no_tumor_file_name,
                         label = sapply(pred_test_no_tumor, decode)
                         ) %>%
  mutate(id = str_remove(id, "data/testing2/no_tumor/")) # remove file path and only keep the file name

# check first 10 rows
head(no_tumor_result, 10)

no_tumor_result %>% 
  group_by(label) %>% 
  summarise(freq = n())

From the summarized prediction result, model_big predicts this class very well: 104 of the 105 images (around 99% of the test images) are correctly predicted as no_tumor.

Evaluating the model with pituitary_tumor testing dataset

pituitary_tumor_folder_path <- paste0("data/testing2/pituitary_tumor/")

pituitary_tumor_folder_path
## [1] "data/testing2/pituitary_tumor/"
# Get file name
pituitary_tumor_file_name <- map(pituitary_tumor_folder_path, 
                 function(x) paste0(x, list.files(x))
                 ) %>% 
  unlist()

# first 6 file name
head(pituitary_tumor_file_name)
## [1] "data/testing2/pituitary_tumor/image (1).jpg" 
## [2] "data/testing2/pituitary_tumor/image (10).jpg"
## [3] "data/testing2/pituitary_tumor/image (11).jpg"
## [4] "data/testing2/pituitary_tumor/image (12).jpg"
## [5] "data/testing2/pituitary_tumor/image (13).jpg"
## [6] "data/testing2/pituitary_tumor/image (14).jpg"
# last 6 file name
tail(pituitary_tumor_file_name)
## [1] "data/testing2/pituitary_tumor/image(94).jpg"
## [2] "data/testing2/pituitary_tumor/image(95).jpg"
## [3] "data/testing2/pituitary_tumor/image(96).jpg"
## [4] "data/testing2/pituitary_tumor/image(97).jpg"
## [5] "data/testing2/pituitary_tumor/image(98).jpg"
## [6] "data/testing2/pituitary_tumor/image.jpg"
length(pituitary_tumor_file_name)
## [1] 74

Then we convert the images in the testing dataset into an array.

pituitary_tumor_y <- image_prep(pituitary_tumor_file_name)

dim(pituitary_tumor_y)
## [1]  74 200 200   3

The pituitary_tumor testing data consists of 74 images, each 200 x 200 pixels with 3 color channels (RGB).

Since the testing dataset is prepared, we can predict the label of each image using model_big, as this model has higher accuracy and precision. Then we summarize the frequency of each predicted label to check the accuracy of the result.

# Predict label on array
pred_test_pituitary_tumor <- predict_classes(model_big, pituitary_tumor_y)

# Convert encoding to label
decode <- function(x){
  case_when(x == 0 ~ "glioma_tumor",
            x == 1 ~ "meningioma_tumor",
            x == 2 ~ "no_tumor",
            x == 3 ~ "pituitary_tumor"
            )
}

# Create data submission
pituitary_tumor_result <- data.frame(id = pituitary_tumor_file_name,
                         label = sapply(pred_test_pituitary_tumor, decode)
                         ) %>%
  mutate(id = str_remove(id, "data/testing2/pituitary_tumor/")) # remove file path and only keep the file name

# check first 10 rows
head(pituitary_tumor_result, 10)

pituitary_tumor_result %>% 
  group_by(label) %>% 
  summarise(freq = n())

From the summarized prediction result, model_big performs noticeably worse on this class: only 48 of the 74 images (around 65% of the test images) are correctly predicted as pituitary_tumor.

Conclusion

Based on model_big’s performance and evaluation results, we can conclude that the objective of creating a good CNN model has been achieved. Nevertheless, the model can still be improved further, and it can serve as a reference for others working to help doctors identify tumor types more accurately and help more people get treated correctly.