1 Introduction

1.1 Introduction

This report walks through the process of building a machine learning model for image classification that classifies landscape environments such as Coast, Desert, Forest, Glacier, and Mountain. The classes for this task are:

  • Coast This class contains images of coastal areas, or simply beaches.
  • Desert This class contains images of desert areas such as the Sahara, the Thar, etc.
  • Forest This class contains images of forest areas such as the Amazon.
  • Glacier This class consists of striking white images of glaciers, for example in the Antarctic.
  • Mountain This class shows the world from the top, i.e. mountain areas such as the Himalayas.

The dataset was obtained from: https://www.kaggle.com/datasets/utkarshsaxenadn/landscape-recognition-image-dataset-12k-images

1.2 Packages

To build the model we will use the following libraries in R:

# Data wrangling
library(tidyverse)
## ── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
## ✔ dplyr     1.1.2     ✔ readr     2.1.4
## ✔ forcats   1.0.0     ✔ stringr   1.5.0
## ✔ ggplot2   3.4.2     ✔ tibble    3.2.1
## ✔ lubridate 1.9.2     ✔ tidyr     1.3.0
## ✔ purrr     1.0.1     
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## ✖ dplyr::filter() masks stats::filter()
## ✖ dplyr::lag()    masks stats::lag()
## ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
# Image manipulation
library(imager)
## Loading required package: magrittr
## 
## Attaching package: 'magrittr'
## 
## The following object is masked from 'package:purrr':
## 
##     set_names
## 
## The following object is masked from 'package:tidyr':
## 
##     extract
## 
## 
## Attaching package: 'imager'
## 
## The following object is masked from 'package:magrittr':
## 
##     add
## 
## The following object is masked from 'package:stringr':
## 
##     boundary
## 
## The following object is masked from 'package:dplyr':
## 
##     where
## 
## The following object is masked from 'package:tidyr':
## 
##     fill
## 
## The following objects are masked from 'package:stats':
## 
##     convolve, spectrum
## 
## The following object is masked from 'package:graphics':
## 
##     frame
## 
## The following object is masked from 'package:base':
## 
##     save.image
# Deep learning
library(keras)

# Model Evaluation
library(caret)
## Loading required package: lattice
## 
## Attaching package: 'caret'
## 
## The following object is masked from 'package:purrr':
## 
##     lift

We also need a Python 3.10 interpreter available to RStudio so that R can call functions from Python libraries such as TensorFlow and Pillow inside a dedicated environment while building the model.
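
Setting up the Python backend can be done from R. Below is a minimal sketch, assuming the keras and reticulate R packages are installed; the interpreter path is illustrative and should be adapted to your machine:

# Minimal setup sketch for the Python backend (paths are illustrative)
library(reticulate)

# Point R at a Python 3.10 interpreter (hypothetical path)
# use_python("/usr/bin/python3.10", required = TRUE)

# Install TensorFlow into an environment for R; Pillow can be added
# via the extra_packages argument if it is not pulled in by default
# keras::install_keras(extra_packages = "pillow")

# Verify which interpreter R is actually using
py_config()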

2 Data Input and Data Preparation

The training data is organized into 5 folders, and the names of these folders are the labels for the target classes. Hence, we need to obtain the data from each labeled image file.

Firstly, we need to identify the folders for each target class:

folder_list <- list.files("Landscape Classification/Landscape Classification/Training Data//")

folder_list
## [1] "Coast"    "Desert"   "Forest"   "Glacier"  "Mountain"

Next, we will combine the folder names with the path or directory of the train folder to access the content inside each folder.

folder_path <- paste0("Landscape Classification/Landscape Classification/Training Data//", folder_list, "/")

folder_path
## [1] "Landscape Classification/Landscape Classification/Training Data//Coast/"   
## [2] "Landscape Classification/Landscape Classification/Training Data//Desert/"  
## [3] "Landscape Classification/Landscape Classification/Training Data//Forest/"  
## [4] "Landscape Classification/Landscape Classification/Training Data//Glacier/" 
## [5] "Landscape Classification/Landscape Classification/Training Data//Mountain/"

Then, we will use the map() function to iterate through each folder (Coast, Desert, Forest, Glacier, Mountain) and collect the file names.

# Get file name
file_name <- map(folder_path, 
                 function(x) paste0(x, list.files(x))
                 ) %>% 
  unlist()

# first 6 file names
head(file_name)
## [1] "Landscape Classification/Landscape Classification/Training Data//Coast/Coast-Train (1).jpeg"   
## [2] "Landscape Classification/Landscape Classification/Training Data//Coast/Coast-Train (10).jpeg"  
## [3] "Landscape Classification/Landscape Classification/Training Data//Coast/Coast-Train (100).jpeg" 
## [4] "Landscape Classification/Landscape Classification/Training Data//Coast/Coast-Train (1000).jpeg"
## [5] "Landscape Classification/Landscape Classification/Training Data//Coast/Coast-Train (1001).jpeg"
## [6] "Landscape Classification/Landscape Classification/Training Data//Coast/Coast-Train (1002).jpeg"

Checking the number of images to be used:

length(file_name)
## [1] 10000

There are 10,000 images in the training data that will be used to train the model.
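
As a quick sanity check on class balance, we can also count the images in each class folder; a minimal sketch using purrr:

# Count image files in each class folder
map_int(folder_path, ~ length(list.files(.x))) %>% 
  set_names(folder_list)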

3 Exploratory Data Analysis

3.1 Image Dimension

Dimension is one of the crucial aspects in image classification. Ensuring the correct dimension of input data is essential for the model to process information effectively.

# Full Image Description
img <- load.image(file_name[1])
img
## Image. Width: 275 pix Height: 183 pix Depth: 1 Colour channels: 3
# Image Dimension
dim(img)
## [1] 275 183   1   3

From the two pieces of information above, we can deduce details about one of the images in the training data:

The image has:

  • Width: 275 pixels

  • Height: 183 pixels

Additionally, the color channel for the image is 3, which indicates that the image is in RGB format. RGB format means that the image has colors other than just black and white, and it consists of three color channels representing the Red, Green, and Blue color components. This allows the image to display various colors.
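
To make the channel structure concrete, imager's channel helpers let us inspect each colour plane of the loaded image; a small sketch (the functions are from the imager package):

# Average intensity of each colour channel (load.image scales values to [0, 1])
mean(R(img)) # red channel
mean(G(img)) # green channel
mean(B(img)) # blue channel

# Alternatively, split the image into a list of single-channel images
channels <- imsplit(img, "c")
length(channels) # 3 for an RGB image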

# Function for acquiring width and height of an image
get_dim <- function(x){
  img <- load.image(x) 
  
  df_img <- data.frame(height = height(img),
                       width = width(img),
                       filename = x
                       )
  
  return(df_img)
}

get_dim(file_name[1])

The code above defines a function that returns the height and width of an image and collects them into a data frame. Next, we shuffle the file names and retrieve the height and width of every image.

# Shuffle the file names; sample() without a size argument returns a random permutation of all 10,000 names
sample_file <- sample(file_name)

# Run the get_dim() function for each image
file_dim <- map_df(sample_file, get_dim)

head(file_dim, 10)
summary(file_dim)
##      height          width         filename        
##  Min.   : 97.0   Min.   :133.0   Length:10000      
##  1st Qu.:174.0   1st Qu.:262.0   Class :character  
##  Median :183.0   Median :275.0   Mode  :character  
##  Mean   :190.2   Mean   :271.3                     
##  3rd Qu.:193.0   3rd Qu.:289.0                     
##  Max.   :347.0   Max.   :517.0

In machine learning models, especially deep learning / Convolutional Neural Network models, the input images need to share the same width and height so the model can process them efficiently. Consistent, standardized dimensions also matter for reproducibility and for use by other users of the workflow.

The heights of the input images range from 97 to 347 pixels, and the widths from 133 to 517 pixels. The summary above shows that the images vary in both dimensions, so resizing is necessary to give every image the same width and height.
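
For illustration, a single image can be resized with imager; this is only a sketch of the idea, as the actual resizing for the model is handled by the keras generator in the next section:

# Resize one image to 64 x 64 pixels (illustration only)
img_small <- resize(img, size_x = 64, size_y = 64)
dim(img_small) # width and height are now 64, colour channels unchanged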

4 Preprocessing Data and Augmentation

Given the varying dimensions in the images, image resizing is performed to transform them into a consistent size of 64x64 pixels, ensuring that each image has the same height and width. The choice of this size is made with the expectation of minimizing data loss while also ensuring a relatively short model creation time. By using this size, we aim to strike a balance between preserving relevant information in the images and optimizing the efficiency of the model development process.

# Desired height and width of images
target_size <- c(64, 64)

# Batch size for training the model
batch_size <- 100

In the above code, a batch size of 100 is selected; the model's weights will be updated after each batch of 100 images during training.
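
For intuition, with the 80/20 training/validation split defined below, one epoch processes roughly 10,000 × 0.8 = 8,000 training images, i.e. about 8,000 / 100 = 80 weight updates per epoch; a quick check:

# Expected number of weight updates (steps) per epoch under an 80/20 split
expected_train_samples <- 10000 * 0.8
expected_train_samples / batch_size # 80 steps per epoch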

Next, several data augmentations are applied using the image_data_generator() function:

  • The rescale parameter performs data scaling, where each pixel value in the image is divided by 255, resulting in pixel data ranging from 0 to 1.

  • The horizontal_flip parameter randomly flips the image horizontally during data augmentation.

  • The vertical_flip parameter randomly flips the image vertically during data augmentation.

  • The rotation_range parameter determines the range of degrees (in this case, 0 to 45 degrees) by which the image can be randomly rotated during data augmentation. Random rotation can help the model become more robust to various object orientations in the images.

  • Data splitting with validation_split divides the dataset into training and validation sets. In this model, 20% of the data will be used as the validation data, and the remaining 80% will be used for training. By separating the data into training and validation sets, we can evaluate how well the model generalizes to unseen data. The training data is used to train the model, and the validation data allows us to assess its performance on data it has not seen before. This helps avoid overfitting.

# Image Generator
train_data_gen <- image_data_generator(rescale = 1/255, # Scaling pixel value
                                       horizontal_flip = T, # Flip image horizontally
                                       vertical_flip = T, # Flip image vertically 
                                       rotation_range = 45, # Rotate image from 0 to 45 degrees
                                       validation_split = 0.2 # 20% data as validation data
                                       )

Next, we will input the image data into the generator using flow_images_from_directory() and apply data augmentation to both the training and validation data.

# Training Dataset
train_image_array_gen <- flow_images_from_directory(directory = "Landscape Classification/Landscape Classification/Training Data/", # Folder of the data
                                                    target_size = target_size, # target of the image dimension  
                                                    color_mode = "rgb", # use RGB color
                                                    batch_size = batch_size,
                                                    seed = 123,  # set random seed
                                                    subset = "training", # declare that this is for training data
                                                    generator = train_data_gen
                                                    )

# Validation Dataset
val_image_array_gen <- flow_images_from_directory(directory = "Landscape Classification/Landscape Classification/Training Data/",
                                                  target_size = target_size, 
                                                  color_mode = "rgb", 
                                                  batch_size = batch_size,
                                                  seed = 123,
                                                  subset = "validation", # declare that this is the validation data
                                                  generator = train_data_gen
                                                  )

In the above code, the color_mode is set to RGB, considering that the images have 3 color channels and are not grayscale images.

Next, we will examine the proportion of labels in the training data to see if there is any class imbalance. Class imbalance can lead to model performance issues, including bias towards the majority class and poor generalization, which may result in overfitting.

# Number of training samples
train_samples <- train_image_array_gen$n

# Number of validation samples
valid_samples <- val_image_array_gen$n

# Number of target classes/categories
output_n <- n_distinct(train_image_array_gen$classes)

# Get the class proportion
table("\nFrequency" = factor(train_image_array_gen$classes)
      ) %>% 
  prop.table()
## 
## Frequency
##   0   1   2   3   4 
## 0.2 0.2 0.2 0.2 0.2

We can see that the class proportions are even and the data is not imbalanced, so no class rebalancing is performed.

5 Model Building : Base Model

5.1 Model Architecture

For image classification, a Convolutional Neural Network (CNN) is used. The architecture of CNN is well-suited for image recognition and classification tasks, as it can process a large amount of data and generate highly accurate predictions. CNN can learn object features through multiple iterations, eliminating the need for manual feature engineering, such as feature extraction. It is a popular subtype of Neural Networks widely used in image and speech recognition applications.

The built-in convolutional layers in CNN can reduce the high dimensionality of images without risking the loss of important information within the images. This allows CNN to effectively capture relevant patterns and features from the images, making it a powerful tool for image-related tasks.

First, we will create a simple Convolutional Neural Network model with the following layers:

  • A convolutional layer to extract features from 2D images using the ReLU activation function.

  • A max-pooling layer to downsample the extracted features.

  • A flattening layer to convert the 2D data array into a 1D array. Flatten is used in CNN to transform the 2D feature representation (result from the convolutional layer) into a 1D vector. This is a crucial step that allows the CNN to transition from the convolutional part to the fully connected (dense) part, where we can use fully-connected dense layers leading to the output layer.

  • Dense layers to capture more information.

  • An output dense layer using the softmax activation function because there are more than two classes in the label.

These layers are designed to enable the CNN to efficiently learn and classify the features present in the input images, and the final dense layer with the softmax activation will provide the probability distribution over the multiple classes, allowing us to make accurate predictions for multi-class classification tasks.

# input shape of the image
c(target_size, 3) 
## [1] 64 64  3
# Set Initial Random Weight
tensorflow::tf$random$set_seed(123)

model_base <- keras_model_sequential(name = "model_base") %>% 
  
  # Convolution Layer
  layer_conv_2d(filters = 32,
                kernel_size = c(3,3),
                padding = "same",
                activation = "relu",
                input_shape = c(target_size, 3) 
                ) %>% 

  # Max Pooling Layer
  layer_max_pooling_2d(pool_size = c(2,2)) %>% 
  
  # Flattening Layer
  layer_flatten() %>% 
  
  # Dense Layer
  layer_dense(units = 16,
              activation = "relu") %>% 
  
  # Output Layer
  layer_dense(units = output_n,
              activation = "softmax",
              name = "Output")
  
model_base
## Model: "model_base"
## ________________________________________________________________________________
##  Layer (type)                       Output Shape                    Param #     
## ================================================================================
##  conv2d (Conv2D)                    (None, 64, 64, 32)              896         
##  max_pooling2d (MaxPooling2D)       (None, 32, 32, 32)              0           
##  flatten (Flatten)                  (None, 32768)                   0           
##  dense (Dense)                      (None, 16)                      524304      
##  Output (Dense)                     (None, 5)                       85          
## ================================================================================
## Total params: 525,285
## Trainable params: 525,285
## Non-trainable params: 0
## ________________________________________________________________________________

5.2 Model Fitting

Next, the data was fitted to the model. The target involves multi-class classification, hence ‘categorical cross-entropy’ was chosen as the appropriate loss function. Subsequently, 30 epochs were utilized as the number of iterations, and the ‘adam’ optimizer with a learning rate of 0.001 was applied.

In the creation of this model, the accuracy metric was also selected as it is the most relevant in this context. We are primarily concerned with the correct classification of images, and the accuracy metric measures the proportion of correct predictions made by the model in comparison to the total number of predictions conducted.

model_base %>% 
  compile(
    loss = "categorical_crossentropy",
    optimizer = optimizer_adam(learning_rate = 0.001),
    metrics = "accuracy"
  )

# Fit data into model
history <- model_base %>% 
  fit(
  # training data
  train_image_array_gen,

  # training epochs
  steps_per_epoch = as.integer(train_samples / batch_size), 
  epochs = 30, 
  
  # validation data
  validation_data = val_image_array_gen,
  validation_steps = as.integer(valid_samples / batch_size)
)

plot(history)

Looking at the training curves, the model’s performance on the training data and the validation data tracks closely, so there is no sign of overfitting, although the absolute accuracy leaves room for improvement.
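
We can also read the final-epoch numbers straight from the history object; a small sketch (the metric names follow keras’ history object and may differ slightly across TensorFlow versions):

# Final-epoch training vs validation accuracy
tail(history$metrics$accuracy, 1)
tail(history$metrics$val_accuracy, 1)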

5.3 Model Evaluation

Next, we will evaluate and obtain the confusion matrix using the validation data from the generator. First, we will extract the file names of the images used in the validation data. From these file names, we will perform label extraction to obtain the actual category values for the target variable.

val_data <- data.frame(file_name = paste0("Landscape Classification/Landscape Classification/Training Data/", val_image_array_gen$filenames)) %>% 
  mutate(class = str_extract(file_name, "Coast|Desert|Forest|Glacier|Mountain"))

head(val_data, 10)

Next, we will input the images into R by converting them into arrays. Since the input dimensions for the CNN model are images with a size of 64 x 64 pixels and 3 color channels (RGB), the same process will be applied to the images from the test data. This involves converting each image into a 64 x 64 x 3 array representation so that they can be fed into the CNN model for prediction.

# Function to convert image to array
image_prep <- function(x) {
  arrays <- lapply(x, function(path) {
    img <- image_load(path, target_size = target_size, 
                      grayscale = F # Set FALSE if image is RGB
                      )
    
    x <- image_to_array(img)
    x <- array_reshape(x, c(1, dim(x)))
    x <- x/255 # rescale image pixel
  })
  do.call(abind::abind, c(arrays, list(along = 1)))
}
test_x <- image_prep(val_data$file_name)

# Check dimension of testing data set
dim(test_x)
## [1] 2000   64   64    3

The validation data consists of 2000 images with dimensions of 64 x 64 pixels and 3 color channels (RGB). Once the testing data is prepared, the next step is to predict the labels for each image using the CNN model that has been created. By feeding the prepared validation data into the trained CNN model, we can obtain predictions for each image and evaluate the model’s performance on unseen data.

pred_test <- model_base %>% predict(x = test_x) %>% k_argmax() 

head(pred_test, 10)
## tf.Tensor([0 0 0 2 0 0 0 0 0 0], shape=(10), dtype=int64)
# Convert encoding to label
decode <- function(x){
  case_when(x == 0 ~ "Coast",
            x == 1 ~ "Desert",
            x == 2 ~ "Forest",
            x == 3 ~ "Glacier",
            x == 4 ~ "Mountain")
}

pred_test <- sapply(pred_test, decode) 

head(pred_test, 10)
##  [1] "Coast"  "Coast"  "Coast"  "Forest" "Coast"  "Coast"  "Coast"  "Coast" 
##  [9] "Coast"  "Coast"
confusionMatrix(as.factor(pred_test), 
                as.factor(val_data$class)
                )
## Confusion Matrix and Statistics
## 
##           Reference
## Prediction Coast Desert Forest Glacier Mountain
##   Coast      330     70     19      97      202
##   Desert      34    305     11       1       40
##   Forest      14     17    346      13       59
##   Glacier     16      1      4     288       28
##   Mountain     6      7     20       1       71
## 
## Overall Statistics
##                                           
##                Accuracy : 0.67            
##                  95% CI : (0.6489, 0.6906)
##     No Information Rate : 0.2             
##     P-Value [Acc > NIR] : < 2.2e-16       
##                                           
##                   Kappa : 0.5875          
##                                           
##  Mcnemar's Test P-Value : < 2.2e-16       
## 
## Statistics by Class:
## 
##                      Class: Coast Class: Desert Class: Forest Class: Glacier
## Sensitivity                0.8250        0.7625        0.8650         0.7200
## Specificity                0.7575        0.9463        0.9356         0.9694
## Pos Pred Value             0.4596        0.7801        0.7706         0.8546
## Neg Pred Value             0.9454        0.9410        0.9652         0.9327
## Prevalence                 0.2000        0.2000        0.2000         0.2000
## Detection Rate             0.1650        0.1525        0.1730         0.1440
## Detection Prevalence       0.3590        0.1955        0.2245         0.1685
## Balanced Accuracy          0.7913        0.8544        0.9003         0.8447
##                      Class: Mountain
## Sensitivity                   0.1775
## Specificity                   0.9788
## Pos Pred Value                0.6762
## Neg Pred Value                0.8264
## Prevalence                    0.2000
## Detection Rate                0.0355
## Detection Prevalence          0.0525
## Balanced Accuracy             0.5781

The confusion matrix displays the counts of predicted class labels versus actual class labels. In caret’s output, each row represents the predicted class, while each column represents the true (reference) class.

In this case, the model is trained to classify images into five categories: Coast, Desert, Forest, Glacier, and Mountain. The numbers in the cells of the matrix indicate how many instances were correctly classified (diagonal elements) and how many instances were misclassified (off-diagonal elements).

For example:

Coast class: 330 instances were correctly classified as Coast (true positives), while 34 true Coast images were misclassified as Desert, 14 as Forest, 16 as Glacier, and 6 as Mountain.

For this base model, the obtained metrics are not satisfactory, as the accuracy is below 75%. The Mountain class is particularly weak, with a sensitivity of only 0.1775. This indicates that the model’s performance does not meet the desired level of accuracy for the task at hand, so we need to improve the model.
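
To pinpoint where the base model struggles, the caret confusion matrix object exposes per-class metrics; a small sketch (cm_base is a name introduced here for illustration):

# Store the confusion matrix object to inspect per-class metrics
cm_base <- confusionMatrix(as.factor(pred_test), 
                           as.factor(val_data$class))

# Per-class sensitivity (recall); Mountain is clearly the weakest class
cm_base$byClass[, "Sensitivity"]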

6 Model Building : Tuned model

Next, we will perform model tuning with the hope of achieving better performance than the previous model.

6.1 Model Architecture

The new model will be built with an architecture that has more layers than the previous model, as follows:

  • Convolutional layer 1 (64 filters) to extract features from 2D images with the ReLU activation function.

  • Max pooling layer.

  • Convolutional layer 2 (128 filters) with the ReLU activation function.

  • Max pooling layer.

  • Convolutional layer 3 (256 filters) with the ReLU activation function.

  • Max pooling layer.

  • Flatten layer to convert the 2D array into a 1D array.

  • Two dense layers to capture more information.

  • Dropout layer to reduce overfitting.

  • Dense layer for the output layer.

By adding more convolutional layers, the new model aims to enhance its feature extraction capabilities and its ability to capture complex patterns in the data. The final dense layer with the softmax activation provides the probability distribution over the five classes, allowing accurate predictions for this multi-class task. The dropout layer is added to minimize overfitting in the model.

tensorflow::tf$random$set_seed(123)

model_tuned <- keras_model_sequential(name = "model_tuned") %>% 
  layer_conv_2d(filters = 64, kernel_size = c(3, 3), padding = "same", activation = "relu", input_shape = c(target_size, 3)) %>% 
  layer_max_pooling_2d(pool_size = c(2, 2)) %>% 
  layer_conv_2d(filters = 128, kernel_size = c(3, 3), padding = "same", activation = "relu") %>% 
  layer_max_pooling_2d(pool_size = c(2, 2)) %>% 
  layer_conv_2d(filters = 256, kernel_size = c(3, 3), padding = "same", activation = "relu") %>% 
  layer_max_pooling_2d(pool_size = c(2, 2)) %>% 
  layer_flatten() %>% 
  layer_dense(units = 128, activation = "relu") %>%
  layer_dense(units = 256, activation = "relu") %>% 
  
  layer_dropout(rate = 0.5) %>% 
  layer_dense(units = 5, activation = "softmax", name = "Output")

model_tuned
## Model: "model_tuned"
## ________________________________________________________________________________
##  Layer (type)                       Output Shape                    Param #     
## ================================================================================
##  conv2d_3 (Conv2D)                  (None, 64, 64, 64)              1792        
##  max_pooling2d_3 (MaxPooling2D)     (None, 32, 32, 64)              0           
##  conv2d_2 (Conv2D)                  (None, 32, 32, 128)             73856       
##  max_pooling2d_2 (MaxPooling2D)     (None, 16, 16, 128)             0           
##  conv2d_1 (Conv2D)                  (None, 16, 16, 256)             295168      
##  max_pooling2d_1 (MaxPooling2D)     (None, 8, 8, 256)               0           
##  flatten_1 (Flatten)                (None, 16384)                   0           
##  dense_2 (Dense)                    (None, 128)                     2097280     
##  dense_1 (Dense)                    (None, 256)                     33024       
##  dropout (Dropout)                  (None, 256)                     0           
##  Output (Dense)                     (None, 5)                       1285        
## ================================================================================
## Total params: 2,502,405
## Trainable params: 2,502,405
## Non-trainable params: 0
## ________________________________________________________________________________

6.2 Model Fitting

The same fitting method as for the initial model was employed, with ‘categorical cross-entropy’ as the loss function. This time, 50 epochs were executed, again using the ‘adam’ optimizer with a learning rate of 0.001.

model_tuned %>% 
  compile(
    loss = "categorical_crossentropy",
    optimizer = optimizer_adam(learning_rate = 0.001),
    metrics = "accuracy"
  )

# Fit data into model
history <- model_tuned %>% 
  fit(
  # training data
  train_image_array_gen,

  # training epochs
  steps_per_epoch = as.integer(train_samples / batch_size), 
  epochs = 50, 
  
  # validation data
  validation_data = val_image_array_gen,
  validation_steps = as.integer(valid_samples / batch_size)
)

plot(history)

6.3 Model evaluation

pred_test_tuned <- model_tuned %>% predict(x = test_x) %>% k_argmax() 

head(pred_test_tuned, 10)
## tf.Tensor([0 0 0 4 4 0 3 0 0 0], shape=(10), dtype=int64)
# Same decode() function as defined for the base model
decode <- function(x){
  case_when(x == 0 ~ "Coast",
            x == 1 ~ "Desert",
            x == 2 ~ "Forest",
            x == 3 ~ "Glacier",
            x == 4 ~ "Mountain")
}

pred_test_tuned <- sapply(pred_test_tuned, decode) 

head(pred_test_tuned, 10)
##  [1] "Coast"    "Coast"    "Coast"    "Mountain" "Mountain" "Coast"   
##  [7] "Glacier"  "Coast"    "Coast"    "Coast"
confusionMatrix(as.factor(pred_test_tuned), 
                as.factor(val_data$class)
                )
## Confusion Matrix and Statistics
## 
##           Reference
## Prediction Coast Desert Forest Glacier Mountain
##   Coast      291     29     11      18       45
##   Desert      24    341     13       1       52
##   Forest      13      7    345      12       84
##   Glacier     45      4      4     347       39
##   Mountain    27     19     27      22      180
## 
## Overall Statistics
##                                           
##                Accuracy : 0.752           
##                  95% CI : (0.7325, 0.7708)
##     No Information Rate : 0.2             
##     P-Value [Acc > NIR] : < 2.2e-16       
##                                           
##                   Kappa : 0.69            
##                                           
##  Mcnemar's Test P-Value : 8.685e-12       
## 
## Statistics by Class:
## 
##                      Class: Coast Class: Desert Class: Forest Class: Glacier
## Sensitivity                0.7275        0.8525        0.8625         0.8675
## Specificity                0.9356        0.9437        0.9275         0.9425
## Pos Pred Value             0.7386        0.7912        0.7484         0.7904
## Neg Pred Value             0.9321        0.9624        0.9643         0.9660
## Prevalence                 0.2000        0.2000        0.2000         0.2000
## Detection Rate             0.1455        0.1705        0.1725         0.1735
## Detection Prevalence       0.1970        0.2155        0.2305         0.2195
## Balanced Accuracy          0.8316        0.8981        0.8950         0.9050
##                      Class: Mountain
## Sensitivity                   0.4500
## Specificity                   0.9406
## Pos Pred Value                0.6545
## Neg Pred Value                0.8725
## Prevalence                    0.2000
## Detection Rate                0.0900
## Detection Prevalence          0.1375
## Balanced Accuracy             0.6953

For this tuned image classification model (model_tuned), the obtained metrics are satisfactory, as the accuracy is above 75%. This indicates that model_tuned’s performance meets the desired level of accuracy for the task at hand.
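
Since model_tuned meets the target, it is worth persisting it for later reuse; a minimal sketch (the file name is illustrative):

# Save the trained model to disk for later reuse
save_model_hdf5(model_tuned, "model_tuned.h5")

# Reload it later with:
# model_tuned <- load_model_hdf5("model_tuned.h5")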

7 Conclusion

Several conclusions can be drawn from this project:

  • Machine learning, especially in the form of Convolutional Neural Networks (CNNs), is an effective approach for image classification tasks. The CNN model built here achieved good performance (accuracy above 75%).

  • The use of accuracy as an evaluation metric is relevant and appropriate for this image classification task. Accuracy provides insights into how well the model can make correct predictions regarding the image labels.

  • CNN architectures are well suited to image classification tasks: their convolutional layers extract crucial features from images, enhancing the model’s ability to recognize relevant patterns and characteristics.

  • The final conclusion is that this project has successfully achieved the goal of image classification using machine learning models and can serve as a foundation for further advancements or practical implementations that provide additional benefits.