Calorie Calculator - FlickR Image Classifier

Group 6: Rohit Madhu, Gokul Nedunsezhian, Jai Krishna Mounaguru, Rohan Pathak, Balajhi Shanmugam Selvakumar

Problem Description

Summary of Peer Comments

Best Suggestion

Because an image classification model is time-consuming to train from scratch, consider using transfer learning to improve both the training efficiency and the accuracy of the food image classification model. A model pre-trained on a large dataset has already learned general visual features; retraining it on the food image dataset reuses those features, which reduces the amount of training required to achieve high accuracy. This can lead to faster development and better overall performance of the food image classification model.
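A minimal sketch of what this suggestion could look like with the R keras package is given below. The choice of MobileNetV2 as the pre-trained base, the helper name `build_transfer_model`, and the optimizer and loss settings are all assumptions made for illustration; they are not necessarily the model the group built. The block only defines the function. Calling it requires keras with a working TensorFlow backend and downloads the ImageNet weights.

```r
# Sketch only: MobileNetV2 and every setting below are illustrative assumptions.
build_transfer_model <- function(n_classes, input_shape = c(224, 224, 3)) {
  if (!requireNamespace("keras", quietly = TRUE)) {
    stop("The keras package is required to build the model.")
  }

  # Pre-trained convolutional base without its ImageNet classification head
  base <- keras::application_mobilenet_v2(
    weights = "imagenet",
    include_top = FALSE,
    input_shape = input_shape
  )

  # Freeze the base so that only the new head is trained at first
  keras::freeze_weights(base)

  # New classification head with one output per food class
  x <- keras::layer_global_average_pooling_2d(base$output)
  x <- keras::layer_dense(x, units = n_classes, activation = "softmax")
  model <- keras::keras_model(inputs = base$input, outputs = x)

  keras::compile(
    model,
    optimizer = "adam",
    loss = "categorical_crossentropy",
    metrics = "accuracy"
  )
  model
}
```

With four food classes the call would be `build_transfer_model(n_classes = 4)`. Pre-trained networks usually expect the input scaling they were trained with, so the image preprocessing may need to be adjusted to match the chosen base.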

Analytics Plan

Data Scraping

# getPhotoSearch() comes from the FlickrAPI package
library(FlickrAPI)

# Specify the food items to download
food_items <- c("burger", "banana", "apple", "pasta")

.....

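# Make the API request for this page
# (the key is read from an environment variable instead of being hard-coded;
#  the variable name FLICKR_API_KEY is an assumption)
photos <- getPhotoSearch(api_key = Sys.getenv("FLICKR_API_KEY"),
                         tags = food_item,
                         extras = "url_o",
                         img_size = "m",
                         per_page = per_page,
                         page = page,
                         sort = "interestingness-desc")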

......
    
# Loop through and download (mode = "wb" keeps binary image files intact on Windows)
download.file(url, filename, mode = "wb")
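Since some downloads failed, one way to keep the loop running and record which files arrived is sketched below. The helper name `download_images`, the file-name pattern, and the `downloader` argument (which only exists so the function can be tried without network access) are hypothetical and not part of the original script.

```r
# Sketch: download each URL, skip failures, and report which ones succeeded.
download_images <- function(urls, dest_dir, downloader = utils::download.file) {
  dir.create(dest_dir, recursive = TRUE, showWarnings = FALSE)
  ok <- logical(length(urls))
  for (i in seq_along(urls)) {
    # A missing URL (for example, no image at the requested size) counts as a failure
    if (is.na(urls[i])) next
    filename <- file.path(dest_dir, sprintf("img_%05d.jpg", i))
    ok[i] <- tryCatch({
      downloader(urls[i], filename, mode = "wb", quiet = TRUE)
      TRUE
    },
    warning = function(w) FALSE,
    error = function(e) FALSE)
  }
  ok
}
```

The returned logical vector gives the success rate directly, for example `mean(download_images(urls, "train/pasta"))`, where `urls` is a character vector of image URLs.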

Data Summary

Although the target was 22,000 images, download issues meant that only about 91% of them (20,118 images) were retrieved.

592 of these images were manually tagged for testing.

Of the remaining 19,526 images, 60% were used for training and the other 40% for validation.
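The figures in this summary can be checked with a few lines of arithmetic, using the file counts reported in the Data Exploration output. The split sizes are approximate, since the exact per-class rounding is not stated; the variable names are illustrative.

```r
target_images   <- 22000   # download target
train_val_files <- 19526   # images available for training and validation
test_files      <- 592     # manually tagged test images

total_files    <- train_val_files + test_files
pct_downloaded <- 100 * total_files / target_images

# Approximate sizes for a 60% / 40% training / validation split
n_train <- round(0.6 * train_val_files)
n_val   <- train_val_files - n_train

cat(sprintf("Downloaded: %d of %d (%.1f%%)\n",
            total_files, target_images, pct_downloaded))
cat(sprintf("Training: about %d, validation: about %d\n", n_train, n_val))
```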

Data Exploration

## Number of Training image files: 19526
## Number of Test image files: 592
## Number of Image files: 20118
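Counts like these can be produced by listing the files in each class folder. The sketch below assumes one sub-folder per class under the data directory, as the later `dir("train/")` call suggests; the helper name `count_images_per_class` and the file-extension pattern are assumptions.

```r
# Count image files in each class sub-folder of `path`.
count_images_per_class <- function(path, pattern = "\\.(jpg|jpeg|png)$") {
  class_dirs <- list.dirs(path, full.names = TRUE, recursive = FALSE)
  counts <- vapply(
    class_dirs,
    function(d) length(list.files(d, pattern = pattern, ignore.case = TRUE)),
    integer(1)
  )
  names(counts) <- basename(class_dirs)
  counts
}

# Only runs when the training folder is present
if (dir.exists("train/")) {
  counts <- count_images_per_class("train/")
  print(counts)
  cat("## Number of Training image files:", sum(counts), "\n")
}
```

A per-class count is also a quick way to spot class imbalance before training.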

Data Processing

# image_data_generator() below comes from the keras package
library(keras)

# Get the list of labels for each class
label_list <- dir("train/")
output_n <- length(label_list)
# Save the list of labels to a file
save(label_list, file="label_list.R")

# Set the dimensions for the input images
width <- 224
height <- 224
target_size <- c(width, height)
rgb <- 3  # number of color channels

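# Specify the path to the training data and create a data generator.
# validation_split is the fraction held out for validation, so 0.4 gives
# the 60% training / 40% validation split described in the Data Summary.
path_train <- "train/"
train_data_gen <- image_data_generator(rescale = 1/255,
                                       validation_split = 0.4)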

Model Building

Evaluation

Confusion Matrix

##            true_labels
## pred_labels   0   1   2   3
##      apple  108   1   0   0
##      banana  29 193   1   0
##      burger   0   0 101   0
##      pasta    0   0   1 158

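In this matrix, rows are predicted labels and columns are true labels; columns 0 to 3 appear to correspond to apple, banana, burger and pasta, in label order. Most images fall on the diagonal. The one substantial error is that 29 apple images were predicted as banana. Beyond that there are only three mistakes: one banana predicted as apple, one burger predicted as banana and one burger predicted as pasta. Every pasta image was identified correctly, and every image predicted as burger really was a burger.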

Class-level Accuracy, Precision & Recall

## [1] "78.83%" "99.48%" "98.06%" "100%"
## [1] "78.83%" "99.48%" "98.06%" "100%"
## [1] "99.08%" "86.55%" "100%"   "99.37%"

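The overall accuracy of the model is 94.6% (560 of the 592 test images lie on the diagonal of the confusion matrix).

Checked against the confusion matrix, the first two lines of output are the share of each true class that was predicted correctly, which is the recall, and the third line is the share of each predicted class that was correct, which is the precision. Values are in the order apple, banana, burger, pasta.

Precision is high for all classes, ranging from 86.55% for banana to 100% for burger.

Recall is high for banana, burger and pasta (98.06% to 100%) but only 78.83% for apple, because 29 apple images were predicted as banana.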
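The class-level figures can be recomputed directly from the confusion matrix. The sketch below re-enters the matrix by hand, with rows as predicted labels and columns as true labels, and assumes that columns 0 to 3 are apple, banana, burger and pasta. Precision divides each diagonal entry by its row total and recall divides it by its column total.

```r
classes <- c("apple", "banana", "burger", "pasta")
conf_mat <- matrix(
  c(108,   1,   0,   0,
     29, 193,   1,   0,
      0,   0, 101,   0,
      0,   0,   1, 158),
  nrow = 4, byrow = TRUE,
  dimnames = list(pred_labels = classes, true_labels = classes)
)

correct   <- diag(conf_mat)
precision <- correct / rowSums(conf_mat)   # correct / all predicted as class
recall    <- correct / colSums(conf_mat)   # correct / all truly in class
accuracy  <- sum(correct) / sum(conf_mat)

fmt <- function(x) paste0(round(100 * x, 2), "%")
print(fmt(precision))  # "99.08%" "86.55%" "100%"   "99.37%"
print(fmt(recall))     # "78.83%" "99.48%" "98.06%" "100%"
print(fmt(accuracy))   # "94.59%"
```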

Predicting a random image

        Probability
pasta       93.12 %
banana       5.53 %
burger       1.24 %
apple        0.11 %
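A table like this can be built from the class probabilities the model returns for one image. In the sketch below the probability vector is typed in from the table above rather than taken from a live model, and the variable names are illustrative.

```r
# Class probabilities for one image, in label order (values from the table above)
probs <- c(apple = 0.0011, banana = 0.0553, burger = 0.0124, pasta = 0.9312)

# Rank classes from most to least likely and pick the top one
ranked <- sort(probs, decreasing = TRUE)
predicted_class <- names(ranked)[1]

result <- data.frame(
  Probability = sprintf("%.2f %%", 100 * ranked),
  row.names = names(ranked)
)
print(result)
cat("Predicted class:", predicted_class, "\n")
```

The predicted class is then the name passed on to the nutrition lookup.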

Determining the nutritional info

## Food Name:  PASTA
## Nutrients Data:
    Nutrient Name                   Unit Name    Value  Percent Daily Value
6   Thiamin                         MG           1.000                   30
8   Niacin                          MG           5.360                   15
13  Carbohydrate, by difference     G           78.600                   15
2   Iron, Fe                        MG           3.210                   10
7   Riboflavin                      MG           0.304                   10
16  Fiber, total dietary            G            3.600                    8
12  Total lipid (fat)               G            1.790                    2
1   Calcium, Ca                     MG           0.000                    0
3   Sodium, Na                      MG           0.000                    0
4   Vitamin A, IU                   IU           0.000                    0
5   Vitamin C, total ascorbic acid  MG           0.000                    0
9   Cholesterol                     MG           0.000                    0
10  Fatty acids, total saturated    G            0.000                    0
11  Protein                         G           12.500                    0
14  Energy                          KCAL       375.000                    0
15  Sugars, total including NLEA    G            3.570                    0
17  Fatty acids, total trans        G            0.000                    0
## It contains 375 calories.
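The final step, reading the calorie value out of a nutrient table, can be sketched as follows. The data frame holds a few rows copied from the output above; the column names and the helper `get_calories` are assumptions, and how the table is fetched in the first place is not shown here.

```r
# A few rows copied from the nutrient output above (column names are assumed)
nutrients <- data.frame(
  nutrient = c("Energy", "Carbohydrate, by difference", "Protein",
               "Total lipid (fat)"),
  unit     = c("KCAL", "G", "G", "G"),
  value    = c(375, 78.6, 12.5, 1.79),
  stringsAsFactors = FALSE
)

# Return the energy value in kcal, or NA if the table has no such row
get_calories <- function(nutrient_table) {
  is_energy <- nutrient_table$nutrient == "Energy" &
    nutrient_table$unit == "KCAL"
  energy <- nutrient_table$value[is_energy]
  if (length(energy) == 0) NA_real_ else energy[1]
}

food_name <- "pasta"
calories  <- get_calories(nutrients)
cat("Food Name: ", toupper(food_name), "\n")
cat("It contains", calories, "calories.\n")
```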