Matrix Factorization With SVD


Summary

This R Markdown document analyzes ice cream ratings given by little kids in order to recommend new or untried flavors to them this summer. We explore matrix factorization techniques for recommendation systems, performing the factorization with Singular Value Decomposition (SVD).

knitr::opts_chunk$set(message = FALSE, echo = TRUE)

# To load survey data from googlesheets
suppressWarnings(suppressMessages(library(googlesheets)))
# Library for loading CSV data
library(RCurl)
# Library for data tidying
library(tidyr)
# Library for data structure operations
library(dplyr)
library(knitr)
# Library for plotting
library(ggplot2)
# Library for data display in tabular format
library(DT)
library(pander)


library(Matrix)
suppressWarnings(suppressMessages(library(recommenderlab)))

Loading The IceCream Survey Data

The YumYum IceCream Shop has created a survey for the regular kids to rate their flavors.

Here is the survey link:

https://docs.google.com/forms/d/e/1FAIpQLSdk2Xgop-XCcTXR2XEQW3pFV9l0e_VjBFMjTWvX1ttqK3fMZg/viewform

The survey responses can be found here:

https://docs.google.com/spreadsheets/d/1IKwsU5KjG6Y00Cg5F2ZDCUUYuqjPw8tG67ql3sDqIvc/edit?usp=sharing

# Loading data from googlesheets, first finding the relevant sheet , reading the
# sheet and relevant worksheet

gs_ls()
## # A tibble: 5 x 10
##                sheet_title        author  perm version             updated
##                      <chr>         <chr> <chr>   <chr>              <dttm>
## 1             YumYumSummer java.messagi…    rw     new 2017-06-18 21:43:37
## 2          YumYum IceCream java.messagi…    rw     new 2017-06-16 18:06:05
## 3 IS606 Fall 2016 Present…   jason.bryer    rw     new 2017-03-15 23:05:15
## 4     Untitled spreadsheet kumudini.bha…    rw     new 2016-12-21 20:47:52
## 5 IS606 Spring 2016 Prese…   jason.bryer    rw     new 2016-05-22 05:29:14
## # ... with 5 more variables: sheet_key <chr>, ws_feed <chr>,
## #   alternate <chr>, self <chr>, alt_key <chr>
icedata.url <- gs_title("YumYumSummer")
icedata.csv <- gs_read_csv(ss = icedata.url, ws = "Summer")

# convert to data.frame
icedata <- as.data.frame(icedata.csv)


# Verifying records and variables

nrow(icedata)
## [1] 14
ncol(icedata)
## [1] 27
# datatable(icedata)

Data Exploration

icedataorig <- icedata
icedata <- icedata %>% select(-Timestamp, -Name)


# creating test and training datasets by randomly excluding some of the rating items
# from icedata (not done in the code below; the full matrix is used)


# class(icedata)

icemat <- as(as.matrix(icedata), "realRatingMatrix")
# class(icemat)

icer <- nrow(icemat)
icec <- ncol(icemat)
rownames(icemat) <- icedataorig$Name
# colnames(icemat)

Helper Functions

Function : Print Matrix

We use this function to print the elements of a matrix. It takes as input the matrix whose elements are to be printed.

# Function to print a matrix @param1 matrixA: the matrix to print


printmat <- function(matrixA) {
    
    # dimension of matrix
    dimrow <- nrow(matrixA)
    dimcol <- ncol(matrixA)
    
    # Looping through the matrixA
    for (i in 1:dimrow) {
        for (j in 1:dimcol) {
            cat(" ", matrixA[i, j], " ")
        }  # end of inner for loop
        
        cat("\n")  # Begin on next line after every row of matrix printed 
        
    }  # end of outer for loop
    
    
    
}

Function : Dimensionality Reduction Threshold k

findk <- function(matrixA) {
    
    # This function is called with the diagonal matrix Sigma, so the
    # singular values sit on the diagonal
    sv <- as.numeric(diag(as.matrix(matrixA)))
    
    # Sum of squares of the diagonal elements of Sigma
    sumsq <- sum(sv^2)
    
    # 90% of the total sum of squares
    ninetysumsq <- 0.9 * sumsq
    
    # Running sum of squares of the diagonal elements
    newsumsq <- cumsum(sv^2)
    
    # k is the first index at which the running sum exceeds 90% of the total
    k <- which(newsumsq > ninetysumsq)
    
    if (length(k) == 0) {
        return(0)  # threshold never exceeded, e.g. for an all-zero matrix
    }
    
    return(k[1])
    
}
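The 90% rule can be checked by hand on the singular values printed later in this document. The sketch below is only an illustration: `energy_k` is a hypothetical helper name, not part of the original analysis, and the vector `d` is copied from the kable(diag(Sigma)) output.

```r
# Hypothetical helper: smallest k whose leading singular values carry
# more than `share` of the total sum of squares
energy_k <- function(d, share = 0.9) {
  energy <- cumsum(d^2) / sum(d^2)  # cumulative fraction of the sum of squares
  which(energy > share)[1]
}

# Singular values of the normalized rating matrix, copied from kable(diag(Sigma))
d <- c(7.033114, 5.522588, 5.086775, 4.179717, 3.63069, 3.100181, 2.519758,
       2.262914, 1.942386, 1.654121, 1.216317, 1.051072, 0.8222338, 0.4334364)

k_example <- energy_k(d)
retained  <- sum(d[1:k_example]^2) / sum(d^2)  # fraction kept by the first k values
```

With these values the first seven components retain roughly 91% of the total sum of squares, while six components fall short of 90%.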

Function : Calculate Frobenius Norm

calcFN <- function(matrixA, matrixAnew) {
    # dimension of matrix
    dimrow <- nrow(matrixA)
    dimcol <- ncol(matrixA)
    
    elemsqtot <- 0
    
    # Looping through the matrixA
    for (i in 1:dimrow) {
        for (j in 1:dimcol) {
            elemdiff <- matrixA[i, j] - matrixAnew[i, j]  # Difference in elements
            elemdiffsq <- elemdiff^2  # Square the difference
            elemsqtot <- elemsqtot + elemdiffsq  # Accumulate the squared difference
            
        }  # end of inner for loop
        
    }  # end of outer for loop
    
    return(sqrt(elemsqtot))
    
}
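As a cross-check, the same quantity can be obtained without explicit loops. This is a small sketch on made-up matrices, not the survey data; it relies only on base R's norm() with type = "F".

```r
# Two small example matrices (made-up values)
A <- matrix(c(1, 2, 3, 4, 5, 6), nrow = 2)
B <- matrix(c(1, 0, 3, 2, 5, 5), nrow = 2)

# Vectorized form of the double loop:
# square root of the sum of squared element-wise differences
fn_vectorized <- sqrt(sum((A - B)^2))

# Base R's norm() with type = "F" computes the Frobenius norm of its argument
fn_builtin <- norm(A - B, type = "F")
```

Here the differences are 0, 2, 0, 2, 0, 1, so both expressions evaluate to 3.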

Performing SVD

We perform SVD on the IceCream dataset by factoring the \(m \times n\) matrix \(A\) into an \(m \times k\) matrix \(U\), a \(k \times k\) diagonal matrix \(\Sigma\), and a \(k \times n\) matrix \(V^T\):

\[A = U \ \Sigma \ V^T\]
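The factorization can be tried on a toy matrix first. The sketch below uses random values rather than the survey data and only base R's svd(), which returns the singular values as a vector `d` and the matrix `v` (not its transpose).

```r
set.seed(1)
A <- matrix(rnorm(12), nrow = 3, ncol = 4)  # toy 3 x 4 matrix

s <- svd(A)  # s$u is 3 x 3, s$d has length 3, s$v is 4 x 3

# Rebuild A from its factors: U %*% Sigma %*% t(V)
A_rebuilt <- s$u %*% diag(s$d) %*% t(s$v)

# Largest element-wise difference; should be at floating point noise level
max_abs_error <- max(abs(A - A_rebuilt))
```

Because svd() returns `v`, the transpose has to be taken explicitly when rebuilding the matrix.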

We start by first normalizing the dataset. We then input this to the svd function and gather the matrices \(U\), \(\Sigma\), \(V\).
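To make the normalization step concrete, here is a base R sketch for a single row. It assumes that normalize() centers each kid's ratings on the mean of the flavors that kid actually rated and treats 0 as "not rated"; that assumption is consistent with the normalized table shown further down, but it is not taken from the package source. The vector is Molly's row from the original ratings.

```r
# Molly's raw ratings from the survey (0 = flavor not rated)
molly <- c(4, 4, 0, 4, 4, 0, 0, 0, 0, 0, 0, 1, 2, 3, 0, 0, 0, 0, 2, 0, 4, 0, 0, 0, 1)

rated <- molly != 0               # zeros are treated as missing
molly_mean <- mean(molly[rated])  # mean over the rated flavors only

# Subtract the mean from the rated flavors; unrated entries stay at 0
molly_centered <- molly
molly_centered[rated] <- molly[rated] - molly_mean
```

Molly rated ten flavors with a mean of 2.9, so a rating of 4 becomes 1.1 and a rating of 1 becomes -1.9.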

\(\Sigma\) is a diagonal matrix whose entries are the singular values. Computing the SVD is expensive, so for larger datasets one can lower the cost by reducing the dimensions. To do this, one needs to determine \(k\), the number of singular values to keep.

# We normalize the data

ice_norm <- normalize(icemat)
# ice_norm <- data.frame(scale(as.data.frame(icemat), center=T, scale=T))

system.time(ice_svd <- svd(ice_norm@data))
##    user  system elapsed 
##    0.02    0.00    0.02
summary(ice_svd)
##   Length Class  Mode   
## d  14    -none- numeric
## u 196    -none- numeric
## v 350    -none- numeric
# class(ice_norm@data)

# We perform Singular Value Decomposition on the data matrix and retrieve the
# factor matrices

# Diagonal matrix Sigma
Sigma <- ice_svd$d
Sigma.mat <- Sigma %>% diag()
dim(Sigma.mat)
## [1] 14 14
# Left singular matrix U
U <- ice_svd$u
dim(U)
## [1] 14 14
# svd() returns the right singular matrix V (25 x 14); the variable V below stores its transpose V^T (14 x 25)
V <- t(as.matrix(ice_svd$v))
dim(V)
## [1] 14 25
dim(ice_svd$v)
## [1] 25 14
# Printing matrix
kable(U)
-0.3923823 -0.2689898 0.3114195 0.0674656 0.1642165 -0.1724830 0.0641726 -0.4746405 -0.1944941 0.2473266 -0.4271682 0.2426606 0.1671333 -0.1331874
0.3438589 0.0252036 0.3549936 0.2532374 0.1090052 0.4584528 0.1351380 0.1483983 -0.0942775 0.1961397 0.0982189 0.1814286 -0.0543615 -0.5794422
0.0004417 -0.2660590 0.5655401 0.0752094 -0.1356445 0.1420170 -0.1032534 0.3749034 -0.3127142 -0.1061332 0.0119814 -0.0993470 0.2320973 0.4909189
-0.0865337 -0.0630971 0.1114194 0.1690805 0.1850025 0.0424550 -0.2120405 0.3130683 0.5297614 -0.1904095 -0.4129068 0.3749804 -0.3560202 0.1173511
0.4997070 -0.2442204 0.0260934 -0.4994113 0.5644148 -0.2902826 -0.0989984 0.0912354 -0.1111143 0.0071214 -0.0385234 0.0670163 0.0342116 -0.0078381
-0.3340064 -0.2504174 0.1485239 -0.0264554 0.0374570 -0.2226934 -0.1490759 0.1015351 0.0452996 -0.5929476 0.3049770 -0.1044787 0.0849020 -0.5046476
-0.1702662 -0.1241750 0.1465098 0.0847870 0.2947402 -0.0089155 0.0107677 -0.0693618 0.5094621 0.3876492 0.5855713 0.0373579 0.2088514 0.1931996
-0.1024088 -0.1436219 0.3114212 -0.2565122 -0.0653757 0.0088247 0.2164890 -0.1824433 -0.0489538 0.0577872 0.1816471 -0.2021202 -0.7972602 0.0799421
0.0329223 0.0968066 -0.0756537 0.0699336 0.0755026 0.0533243 0.1964330 -0.2195338 -0.2668817 -0.3736656 0.3447680 0.6973187 -0.0592815 0.2532208
-0.4028151 -0.3469868 -0.4306506 -0.1347259 0.1282051 0.1947969 0.3251603 0.4949102 -0.2174354 0.2105274 0.0036635 0.1111257 -0.0476290 -0.0495041
0.1257570 0.0675686 0.1499769 0.0813812 -0.3832666 -0.6856511 0.3269447 0.3247960 0.0562865 0.2124600 0.0474604 0.2451836 -0.0072521 -0.1018830
0.1967571 -0.5407650 -0.1332740 -0.2500517 -0.5595062 0.1747635 -0.3000743 -0.1677830 0.1672723 0.1058007 0.0600107 0.2771293 0.0384852 -0.0756966
0.1641855 -0.1812715 0.0644289 -0.1289578 -0.0439907 0.1680744 0.7073118 -0.1270306 0.3741335 -0.3236668 -0.2031538 -0.1391901 0.2531672 0.0575897
0.2733762 -0.4839612 -0.2607096 0.6820227 0.1157575 -0.1854392 0.0380380 -0.1203687 -0.0941904 -0.0504859 0.0292699 -0.2125379 -0.1716394 0.0853792
kable(V)
-0.2116509 -0.4720482 -0.1355550 -0.3838004 0.0815399 0.0992064 0.0786392 0.2415555 0.0007457 -0.1545443 0.1902624 0.4173188 0.2744145 -0.0207393 0.0569131 -0.2616642 -0.0562206 -0.0563446 0.0336170 0.0800306 -0.1375198 0.0902333 -0.0532071 0.0611162 0.2377019
0.0403952 -0.1269263 0.1087283 -0.1527594 -0.3837648 -0.1485109 -0.2298974 0.0383077 -0.0076957 -0.2151676 -0.0664195 -0.1189332 0.1241617 -0.1282917 0.0894036 0.4949442 0.1616518 0.1000614 0.0388943 -0.1915446 0.1261421 -0.0521389 -0.0275865 -0.0021639 0.5291098
0.0533646 -0.2399554 -0.0305155 -0.3177222 0.1118383 0.2035947 -0.0220309 0.3756677 0.0050333 -0.1743710 0.2266149 -0.3216086 -0.3516145 0.0942560 0.0942079 0.1456760 0.1032542 0.1960755 -0.0316456 -0.0754509 0.3126309 0.0447018 -0.0184507 -0.0116909 -0.3718596
0.2677399 0.0522844 0.2242450 -0.1546247 0.4224606 -0.0487786 -0.2861402 -0.0360086 -0.0137682 -0.0819716 -0.1037967 0.2944554 -0.2589258 -0.0211656 0.0523338 0.1012364 -0.2738669 -0.3427577 0.0797033 0.1557917 0.2860033 -0.1862749 -0.1794388 -0.0955678 0.1468322
-0.2202700 -0.2208966 -0.2981976 0.2969200 0.0877535 -0.1333567 0.0803460 0.0886409 -0.0257229 0.2155898 -0.1599810 0.0277849 -0.0851082 -0.0757983 -0.0361396 0.0093194 -0.0610769 -0.1068590 -0.4286155 -0.0618471 0.5290475 0.0784932 0.0506864 0.2779352 0.1713530
0.0822766 -0.2170682 0.1490381 -0.3502022 0.0328628 0.2777867 0.0648176 -0.0735310 -0.0105033 0.4504304 -0.4543723 0.0821999 0.0130421 -0.0180909 -0.0992804 0.1065454 0.0911167 0.1196568 -0.1661842 -0.0124809 -0.0668208 -0.2214577 0.3890228 -0.1391831 -0.0296209
-0.0808608 0.0221487 0.0091569 -0.0257836 0.1608772 0.4056833 0.4718034 -0.2547322 -0.0502665 -0.2567336 -0.1575523 -0.1422013 0.1129832 -0.3612075 -0.2820636 0.1762449 -0.0493413 0.0282164 0.1309535 0.0961282 0.1623430 0.1104156 -0.2736798 -0.0132772 0.0607454
-0.2040754 0.0058303 -0.1457322 -0.1595792 -0.2709357 -0.0389859 0.0419249 -0.2976150 0.0664516 0.2745784 0.1425532 0.4191953 -0.2669675 0.2380680 -0.2325605 0.1832640 0.0514216 0.1140388 0.3231693 -0.1449570 0.1876235 0.0106807 -0.2420929 0.0680836 -0.1233818
0.0227752 -0.1628874 -0.1538414 0.1073734 -0.0063805 0.1898852 0.1774362 -0.0017617 0.0470839 0.1788164 0.1308980 -0.3087915 -0.4907820 0.0781067 0.0433962 -0.1240585 0.1034825 -0.2231092 0.2364414 0.0951921 -0.2362044 -0.2282085 0.0037757 0.0485477 0.4728146
0.1360036 -0.2826412 0.0513308 -0.0643364 0.2872982 -0.2026910 -0.1301707 -0.1912605 0.1077000 0.4471332 0.1827777 -0.3777083 0.3628789 0.0323303 -0.0601047 -0.0764673 0.0291814 0.0417562 0.0943810 -0.1046279 0.0803268 0.1060829 -0.3474577 -0.1599010 0.0381857
-0.5204134 -0.1455408 0.3898580 0.2292069 0.0977483 -0.0525343 -0.1665170 0.1313682 -0.2544449 0.0932595 -0.1233813 -0.0241485 -0.1344614 -0.2066215 0.0735282 -0.0971371 -0.0418008 0.1317595 0.4325917 -0.0281597 0.0702580 0.1534625 0.1522181 -0.1500246 -0.0100737
0.3737053 -0.0905392 0.0253982 -0.0575745 0.0013495 -0.0923227 0.0002408 -0.1919624 -0.4276546 -0.1256781 -0.0304618 0.0205069 0.0508921 -0.0897387 -0.0638289 -0.3615364 0.3274240 0.0006506 0.2522608 -0.1665022 0.2071551 -0.0947115 0.1632944 0.4213819 -0.0517487
-0.2694045 0.0694647 -0.0531575 0.1053230 0.3080338 0.0590671 -0.1029106 -0.2804767 0.0068029 -0.1846887 0.3896221 0.1027571 0.0489781 -0.0248149 -0.0736501 -0.0012365 0.3683853 0.1228507 -0.2025811 -0.0733431 0.1088060 -0.4095119 0.2040932 -0.3013271 0.0829187
-0.1623790 0.2691994 0.0643711 -0.2682452 0.0859315 -0.1345590 0.0964198 -0.0220599 -0.4403496 0.1302563 0.1302632 -0.1388511 -0.0176427 0.1092403 0.0736555 0.0015444 -0.3169176 0.4355652 -0.1696854 0.2156551 -0.0470057 -0.2118294 -0.1214333 0.2460584 0.1927978
kable(diag(Sigma))
7.033114 0.000000 0.000000 0.000000 0.00000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.0000000 0.0000000
0.000000 5.522588 0.000000 0.000000 0.00000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.0000000 0.0000000
0.000000 0.000000 5.086775 0.000000 0.00000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.0000000 0.0000000
0.000000 0.000000 0.000000 4.179717 0.00000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.0000000 0.0000000
0.000000 0.000000 0.000000 0.000000 3.63069 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.0000000 0.0000000
0.000000 0.000000 0.000000 0.000000 0.00000 3.100181 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.0000000 0.0000000
0.000000 0.000000 0.000000 0.000000 0.00000 0.000000 2.519758 0.000000 0.000000 0.000000 0.000000 0.000000 0.0000000 0.0000000
0.000000 0.000000 0.000000 0.000000 0.00000 0.000000 0.000000 2.262914 0.000000 0.000000 0.000000 0.000000 0.0000000 0.0000000
0.000000 0.000000 0.000000 0.000000 0.00000 0.000000 0.000000 0.000000 1.942386 0.000000 0.000000 0.000000 0.0000000 0.0000000
0.000000 0.000000 0.000000 0.000000 0.00000 0.000000 0.000000 0.000000 0.000000 1.654121 0.000000 0.000000 0.0000000 0.0000000
0.000000 0.000000 0.000000 0.000000 0.00000 0.000000 0.000000 0.000000 0.000000 0.000000 1.216317 0.000000 0.0000000 0.0000000
0.000000 0.000000 0.000000 0.000000 0.00000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 1.051072 0.0000000 0.0000000
0.000000 0.000000 0.000000 0.000000 0.00000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.8222338 0.0000000
0.000000 0.000000 0.000000 0.000000 0.00000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.0000000 0.4334364

Dimensionality Reduction

Here we identify \(k\) through the function findk(). We reduce the matrices \(U\) and \(V^T\) to \(m \times k\) and \(k \times n\) respectively, and the diagonal matrix \(\Sigma\) to \(k \times k\).

We then compute the rank-\(k\) approximation \[A \approx U_k \ \Sigma_k \ V^T_k\]

We reduce the dimensions of the U, V, and Sigma matrices accordingly:

# Find the value for k , so as to know how many singular values to keep
k <- findk(Sigma.mat)
k
## [1] 7
# Modifying /Reducing the dimensions of U and V and Sigma matrices accordingly

# This matrix should be m x k.
U.dr <- U[, 1:k]
dim(U.dr)
## [1] 14  7
# This matrix should be k x n.
V.dr <- V[1:k, ]
dim(V.dr)
## [1]  7 25
# The new Singular diagonal matrix Sigma.dr

# Reducing the Sigma matrix
Sigma.dr <- Sigma.mat[1:k, 1:k]
dim(Sigma.dr)
## [1] 7 7
# Check: sum(Sigma.dr^2)/sum(Sigma.mat^2) should be at least 0.9


predicted <- U.dr %*% Sigma.dr %*% V.dr
dim(predicted)
## [1] 14 25
colnames(predicted) <- colnames(icemat)
rownames(predicted) <- rownames(icemat)
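A useful property of this truncation is that the Frobenius norm of what is lost equals the square root of the sum of the squared singular values that were dropped. The sketch below checks this on a random toy matrix, not the survey data, with an arbitrary k.

```r
set.seed(42)
A <- matrix(rnorm(30), nrow = 5, ncol = 6)  # toy matrix, not the survey data
s <- svd(A)

k <- 2  # number of components kept (arbitrary; k >= 2 so diag() builds a k x k matrix)
A_k <- s$u[, 1:k] %*% diag(s$d[1:k]) %*% t(s$v[, 1:k])

err_direct   <- sqrt(sum((A - A_k)^2))    # Frobenius norm of the residual
err_from_svd <- sqrt(sum(s$d[-(1:k)]^2))  # from the discarded singular values
```

The two error values agree up to floating point noise, which is why the share of squared singular values retained is a reasonable way to choose k.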

Matrices Comparison

Predicted

kable(predicted)
NuttyButterScotch BlackForest TuttiFruitiRainbow Casata DiveInChocolate VeryBerryStrawberry Coconut MochaShot RasberrySwirl BlueberryCheesecake MangoDelight CreamyVanilla OrangeVanilla NuttyExpresso Pistachio RoseNNuts RumNRaisin IrishCoffee GoingBananas Neapolitan ChocolateAlmond Kulfi Falooda IcyBrownie PeanutButterCream
Molly 0.4957138 1.1138247 -0.0285426 1.0993001 0.7021224 0.0931510 0.0984459 -0.0875925 -0.0043828 0.2929541 0.0254282 -1.4518013 -1.6111994 0.2972324 -0.1399465 0.2232553 -0.0917628 0.0976707 -0.3237158 -0.0265636 1.1454271 0.0297983 -0.1140685 0.0270739 -1.8618220
Adi -0.1242479 -1.9256286 -0.0335139 -2.0758502 0.9291108 1.0154430 0.1001094 1.0735986 -0.0469768 -0.1672450 -0.0125211 0.8030109 -0.2047468 -0.0989179 0.1241383 0.0214287 -0.1283580 0.0063197 -0.2473033 0.1860545 0.7219301 -0.1515724 0.1240183 -0.0671073 0.1788270
Tom 0.3434133 -0.4813762 0.0326199 -1.0330661 0.9480636 0.8713140 0.0509312 1.0041020 0.0425833 -0.0527856 0.6372032 -0.5970791 -1.2562316 0.5762799 0.2037519 -0.2806844 0.0562862 0.4070866 -0.0191787 0.1135636 0.4712987 -0.0179564 0.1484031 -0.2550084 -1.9135340
Pinky 0.2402741 0.0436920 0.0002206 0.1645835 0.4233569 -0.1974740 -0.3720358 0.2130701 0.0035472 0.3535397 -0.1206481 -0.0812614 -0.7083578 0.2354856 0.1379500 0.0670069 -0.1598057 -0.2027794 -0.3753566 -0.0091699 0.6788697 -0.1784736 0.1361990 0.0648542 -0.3572875
Dino -1.8553780 -1.8916542 -1.8426706 0.0677996 0.0472875 0.0533417 1.1694333 1.2336668 0.0116875 -0.0045023 1.1253711 0.9881765 1.0764533 0.1078915 0.0683513 -1.8998988 -0.0250872 0.0748201 -0.8663221 0.0649095 -0.1051819 1.1148109 0.0443663 1.1137631 0.1285650
Peter 0.3956531 1.2088554 -0.0266712 1.1817434 0.3057780 -0.2307576 -0.0628477 -0.1741094 0.0368514 0.3527173 0.1787148 -1.0909400 -1.1163736 0.4375555 -0.0223674 -0.1094365 -0.0358740 0.0722755 -0.1582291 -0.0332373 0.4102683 0.0365990 0.0101613 0.0000348 -1.5663642
Dolly 0.1202231 0.2622363 -0.1784831 0.5994927 0.4959547 -0.0218882 0.0426513 0.0416533 -0.0253434 0.3849095 -0.2131010 -0.5299086 -0.8559326 0.0851455 -0.0842899 0.1301800 -0.1328506 -0.0933795 -0.5127870 -0.0287322 0.9849143 -0.0118848 0.0413543 0.1866246 -0.6867588
Nina -0.0716939 0.0630656 -0.1975177 -0.0340581 0.0377139 0.6817114 0.6376625 0.2673398 0.0066998 -0.0852636 0.3255147 -1.1133279 -0.4933344 0.1091583 -0.1667794 0.0149658 0.3595448 0.6833429 -0.0220227 -0.1254953 0.1487634 0.2824336 0.0726799 -0.0353807 -1.3457224
Betty -0.0564896 -0.1550083 0.0514590 -0.0827253 0.0032745 0.0611296 0.0864046 -0.1926237 -0.0435709 -0.1011864 -0.3059562 0.1937009 0.2242959 -0.3184030 -0.1259124 0.2849551 -0.0724845 -0.1307423 -0.0160969 0.0192637 0.2132183 -0.0390744 -0.1296740 0.0361673 0.5960791
Polly 0.1354446 1.8610043 0.0749577 2.0687260 0.2140026 0.0230686 0.8902505 -1.7722688 -0.0501392 1.4403098 -1.3276740 -0.4695999 -0.0384648 -0.2321180 -0.8762820 -0.3701839 -0.2362947 -0.2229925 -0.3379040 0.2603103 -0.3591157 -0.1554902 0.3793578 -0.0551325 -0.8437723
Sunny 0.0246649 0.1568893 0.0793662 -0.3814525 0.0985059 0.1003614 0.0087279 0.3253905 0.0136539 -1.8468440 1.3397283 -0.1510122 0.1165062 -0.1551480 0.2023249 0.0045784 -0.1531276 -0.0619057 1.1047739 0.1865239 -0.1991737 0.4835712 -1.2553838 -0.0900164 0.0484970
Kitty -0.1763550 0.1482313 -0.0463909 -0.4713096 0.4594695 0.6084178 0.6245527 -0.0244756 0.1195620 0.6328198 0.6144308 0.9385718 0.6124524 0.7299442 -0.0739067 -2.1392394 -0.1335825 0.1090767 0.5501496 0.6172961 -2.3115309 0.0820314 0.5235453 -0.4311223 -1.5626386
Toto -0.4777268 -0.5632003 -0.2546425 -0.5868999 0.5771150 1.2453207 1.3297812 -0.1233572 -0.0733162 -0.2333166 -0.0755909 0.1217939 0.4386593 -0.4943015 -0.5697824 -0.4363181 -0.0759976 0.2134809 0.1618075 0.3501659 -0.1667549 0.3403041 -0.2363097 -0.0201576 -0.3907562
Riya 0.0299367 -0.0670192 -0.0816443 -0.0254262 2.2718542 0.0017561 0.0209022 -0.1836866 -0.0335118 0.0825653 0.1257948 2.3369405 -0.1084834 0.0616002 -0.0904220 -1.7709883 -1.5405601 -1.7238816 0.1578078 1.2003844 0.0754699 -0.1065592 -0.7441967 0.0619169 0.0494508
# class(predicted)
predicted <- as(predicted, "matrix")

Original normalized

kable(as.matrix(ice_norm@data))
NuttyButterScotch BlackForest TuttiFruitiRainbow Casata DiveInChocolate VeryBerryStrawberry Coconut MochaShot RasberrySwirl BlueberryCheesecake MangoDelight CreamyVanilla OrangeVanilla NuttyExpresso Pistachio RoseNNuts RumNRaisin IrishCoffee GoingBananas Neapolitan ChocolateAlmond Kulfi Falooda IcyBrownie PeanutButterCream
Molly 1.1000000 1.1000000 0.0000000 1.1000000 1.1000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 -1.9000000 -0.9000000 0.1000000 0.0000000 0.0000000 0.0000000 0.0000000 -0.9000000 0.0000000 1.1000000 0.0000000 0.0000000 0.0000000 -1.9000000
Adi -0.0909091 -2.0909091 0.0000000 -2.0909091 0.9090909 0.9090909 0.0000000 0.9090909 0.0000000 0.0000000 0.0000000 0.9090909 -0.0909091 -0.0909091 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.9090909 0.0000000 0.0000000 -0.0909091 0.0000000
Tom 0.0000000 -0.2500000 0.0000000 -1.2500000 0.7500000 0.7500000 0.0000000 0.7500000 0.0000000 0.0000000 0.7500000 0.0000000 -1.2500000 0.7500000 0.0000000 0.0000000 0.0000000 0.7500000 0.0000000 0.0000000 0.7500000 0.0000000 0.0000000 -0.2500000 -2.2500000
Pinky 0.5555556 0.0000000 -0.4444444 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.5555556 0.0000000 0.0000000 -1.4444444 0.5555556 0.0000000 0.0000000 0.0000000 -0.4444444 0.0000000 0.0000000 0.5555556 -0.4444444 0.0000000 0.5555556 0.0000000
Dino -1.8571429 -1.8571429 -1.8571429 0.0000000 0.0000000 0.0000000 1.1428571 1.1428571 0.0000000 0.0000000 1.1428571 1.1428571 1.1428571 0.1428571 0.0000000 -1.8571429 0.0000000 0.1428571 -0.8571429 0.0000000 0.0000000 1.1428571 0.0000000 1.1428571 0.0000000
Peter 0.0000000 1.3750000 0.0000000 1.3750000 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 -0.6250000 -1.6250000 0.3750000 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.3750000 0.0000000 0.3750000 0.0000000 -1.6250000
Dolly -0.1538462 -0.1538462 0.0000000 0.8461538 0.8461538 0.0000000 0.0000000 0.0000000 -0.1538462 0.8461538 0.0000000 -1.1538462 -1.1538462 0.0000000 0.0000000 -0.1538462 0.0000000 -0.1538462 0.0000000 0.0000000 0.8461538 -0.1538462 0.0000000 0.0000000 -0.1538462
Nina 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.6250000 0.6250000 0.6250000 0.0000000 0.0000000 0.0000000 -1.3750000 -0.3750000 0.0000000 0.0000000 0.0000000 0.0000000 0.6250000 0.0000000 0.0000000 0.0000000 0.6250000 0.0000000 0.0000000 -1.3750000
Betty 0.0000000 0.0000000 0.3636364 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 -0.6363636 -0.6363636 -0.6363636 0.3636364 0.3636364 -0.6363636 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.3636364 0.0000000 0.3636364 0.3636364 0.3636364
Polly 0.0000000 1.8181818 0.0000000 1.8181818 0.0000000 -0.1818182 0.8181818 -2.1818182 0.0000000 1.8181818 -1.1818182 0.0000000 0.0000000 0.0000000 -1.1818182 -0.1818182 -0.1818182 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 -1.1818182
Sunny 0.0000000 0.0000000 0.0000000 -0.5000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 -1.5000000 1.5000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 1.5000000 0.0000000 0.0000000 0.5000000 -1.5000000 0.0000000 0.0000000
Kitty 0.0000000 0.0000000 0.0000000 -0.3750000 0.6250000 0.6250000 0.6250000 0.0000000 0.0000000 0.6250000 0.6250000 0.6250000 0.6250000 0.6250000 0.0000000 -2.3750000 0.0000000 0.0000000 0.6250000 0.6250000 -2.3750000 0.0000000 0.6250000 -0.3750000 -1.3750000
Toto -0.4615385 -0.4615385 -0.4615385 -0.4615385 0.5384615 1.5384615 1.5384615 0.0000000 0.0000000 -0.4615385 0.0000000 0.0000000 0.0000000 -0.4615385 -0.4615385 -0.4615385 0.0000000 0.0000000 0.0000000 0.5384615 -0.4615385 0.0000000 0.0000000 0.0000000 0.0000000
Riya 0.0000000 0.0000000 0.0000000 0.0000000 2.2857143 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 2.2857143 0.0000000 0.0000000 0.0000000 -1.7142857 -1.7142857 -1.7142857 0.0000000 1.2857143 0.0000000 0.0000000 -0.7142857 0.0000000 0.0000000

Original Ratings

kable(as.matrix(icemat@data))
NuttyButterScotch BlackForest TuttiFruitiRainbow Casata DiveInChocolate VeryBerryStrawberry Coconut MochaShot RasberrySwirl BlueberryCheesecake MangoDelight CreamyVanilla OrangeVanilla NuttyExpresso Pistachio RoseNNuts RumNRaisin IrishCoffee GoingBananas Neapolitan ChocolateAlmond Kulfi Falooda IcyBrownie PeanutButterCream
Molly 4 4 0 4 4 0 0 0 0 0 0 1 2 3 0 0 0 0 2 0 4 0 0 0 1
Adi 3 1 0 1 4 4 0 4 0 0 0 4 3 3 0 0 0 0 0 0 4 0 0 3 0
Tom 0 3 0 2 4 4 0 4 0 0 4 0 2 4 0 0 0 4 0 0 4 0 0 3 1
Pinky 4 0 3 0 0 0 0 0 0 4 0 0 2 4 0 0 0 3 0 0 4 3 0 4 0
Dino 1 1 1 0 0 0 4 4 0 0 4 4 4 3 0 1 0 3 2 0 0 4 0 4 0
Peter 0 4 0 4 0 0 0 0 0 0 0 2 1 3 0 0 0 0 0 0 3 0 3 0 1
Dolly 4 4 0 5 5 0 0 0 4 5 0 3 3 0 0 4 0 4 0 0 5 4 0 0 4
Nina 0 0 0 0 0 4 4 4 0 0 0 2 3 0 0 0 0 4 0 0 0 4 0 0 2
Betty 0 0 4 0 0 0 0 0 3 3 3 4 4 3 0 0 0 0 0 0 4 0 4 4 4
Polly 0 5 0 5 0 3 4 1 0 5 2 0 0 0 2 3 3 0 0 0 0 0 0 0 2
Sunny 0 0 0 3 0 0 0 0 0 2 5 0 0 0 0 0 0 0 5 0 0 4 2 0 0
Kitty 0 0 0 4 5 5 5 0 0 5 5 5 5 5 0 2 0 0 5 5 2 0 5 4 3
Toto 1 1 1 1 2 3 3 0 0 1 0 0 0 1 1 1 0 0 0 2 1 0 0 0 0
Riya 0 0 0 0 5 0 0 0 0 0 0 5 0 0 0 1 1 1 0 4 0 0 2 0 0

Calculate Frobenius Norm

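# Note: icematx holds the raw ratings (0 = not rated), while predicted is on
# the normalized (row-centered) scale, so this norm compares two different scales
icematx <- as.matrix(icemat@data)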


calcFN(icematx, predicted)
## [1] 40.42907

Calculate RMSE

calcRMSE <- function(predictedM, origM) {
    return(sqrt(mean((predictedM - origM)^2, na.rm = T)))
}

calcRMSE(predicted, icematx)
## [1] 2.161025
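When every entry is included, RMSE and the Frobenius norm are related by a fixed factor: the RMSE is the Frobenius norm divided by the square root of the number of entries. The short sketch below checks this using the two values reported above and the 14 x 25 shape of the rating matrix.

```r
# Values reported above for the 14 x 25 rating matrix
frobenius <- 40.42907
m <- 14
n <- 25

# With no missing entries, RMSE = Frobenius norm / sqrt(number of entries)
rmse_from_fn <- frobenius / sqrt(m * n)
```

The result agrees with the reported RMSE to the printed precision, so the two measures carry the same information here.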

Prediction for new kid ratings

# Querying for recommendations for a new kid on the block
icecreamflav <- colnames(predicted)
noofflav <- length(icecreamflav)

querynew <- rep(0, noofflav)
querynew[which(icecreamflav == "DiveInChocolate")] <- 5
querynew[which(icecreamflav == "NuttyExpresso")] <- 4
querynew[which(icecreamflav == "MochaShot")] <- 4
querynew[which(icecreamflav == "IrishCoffee")] <- 5

# Performing qV for concept

qvconcept <- querynew %*% t(V.dr)

# To get the recommendations

recom <- colMeans(icematx) + qvconcept %*% V.dr

colnames(recom) <- colnames(predicted)

recom
##      NuttyButterScotch BlackForest TuttiFruitiRainbow    Casata
## [1,]           1.32186     0.47073          0.3047217 0.7304399
##      DiveInChocolate VeryBerryStrawberry Coconut MochaShot RasberrySwirl
## [1,]         3.05526            2.194938 1.09867  3.022055     0.6026057
##      BlueberryCheesecake MangoDelight CreamyVanilla OrangeVanilla
## [1,]            1.960963     2.776524      1.973125     0.7187129
##      NuttyExpresso Pistachio   RoseNNuts RumNRaisin IrishCoffee
## [1,]      3.141949  0.834258 0.002323369  0.3617337    1.743826
##      GoingBananas Neapolitan ChocolateAlmond   Kulfi  Falooda IcyBrownie
## [1,]    0.6243597  0.8274436         2.67238 1.40332 1.612043   1.533508
##      PeanutButterCream
## [1,]        -0.7734641
# predictedRRM <- as(predicted, 'realRatingMatrix') calcPredictionAccuracy(x =
# predictedRRM, data = ice_norm, byUser = FALSE)

Prediction for new kid ratings

A new kid who prefers DiveInChocolate, NuttyExpresso, MochaShot, and IrishCoffee, with the rating vector

0, 0, 0, 0, 5, 0, 0, 4, 0, 0, 0, 0, 0, 4, 0, 0, 0, 5, 0, 0, 0, 0, 0, 0, 0

would be predicted to like the flavors to this extent:

NuttyButterScotch BlackForest TuttiFruitiRainbow Casata DiveInChocolate VeryBerryStrawberry Coconut MochaShot RasberrySwirl BlueberryCheesecake MangoDelight CreamyVanilla OrangeVanilla NuttyExpresso Pistachio RoseNNuts RumNRaisin IrishCoffee GoingBananas Neapolitan ChocolateAlmond Kulfi Falooda IcyBrownie PeanutButterCream
1.32186 0.47073 0.3047217 0.7304399 3.05526 2.194938 1.09867 3.022054 0.6026057 1.960963 2.776524 1.973125 0.7187129 3.141949 0.834258 0.0023234 0.3617337 1.743826 0.6243597 0.8274436 2.67238 1.40332 1.612043 1.533508 -0.7734641
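The query step maps the new kid's ratings into the k-dimensional concept space and back into flavor space. The sketch below illustrates this on random toy data with a hypothetical query vector; `Vk` here plays the role of t(V.dr) in the code above, since V.dr stores the transposed matrix.

```r
set.seed(7)
ratings <- matrix(rnorm(40), nrow = 5, ncol = 8)  # toy 5 users x 8 items
k <- 3
Vk <- svd(ratings)$v[, 1:k]                       # 8 x k, orthonormal columns

q <- c(5, 0, 0, 4, 0, 0, 0, 0)                    # hypothetical new user's ratings

concept <- q %*% Vk                               # 1 x k coordinates in concept space
scores  <- concept %*% t(Vk)                      # 1 x 8 scores back in item space

# Projecting the scores again changes nothing, since Vk %*% t(Vk) is a projection
scores_again <- (scores %*% Vk) %*% t(Vk)
```

The scores are the part of the query that lies in the space spanned by the k retained concepts; adding the column means, as done above, shifts them back toward the rating scale.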

Using irlba

library(irlba)
min(ncol(ice_norm@data))
## [1] 25
system.time(ice_svdirlba <- irlba::irlba(ice_norm@data, nu = 6, nv = 6))
##    user  system elapsed 
##    0.02    0.00    0.01
summary(ice_svdirlba)
##       Length Class  Mode   
## d       6    -none- numeric
## u      84    -none- numeric
## v     150    -none- numeric
## iter    1    -none- numeric
## mprod   1    -none- numeric
Sigma1 <- ice_svdirlba$d
dim(Sigma1)
## NULL
U1 <- ice_svdirlba$u
dim(U1)
## [1] 14  6
V1 <- t(as.matrix(ice_svdirlba$v))
dim(V1)
## [1]  6 25
# Printing matrix printmat(U1) printmat(V1) print(Sigma1)

# Checking if we get the exact same decomposition through different packages,
# methods
identical(Sigma, Sigma1)
## [1] FALSE

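The singular values from irlba match the first six from svd to the printed precision; identical() returns FALSE because Sigma holds 14 values while Sigma1 holds only 6, and identical() would also require exact equality. The U and V matrices do not match element by element: as the tables below show, the first four singular vectors have opposite signs in the two decompositions, while the fifth and sixth agree. Flipping the sign of a left singular vector together with its right singular vector leaves the product \(U \Sigma V^T\) unchanged, so the two decompositions are equivalent.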
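Singular vectors are only defined up to sign, which is one reason two SVD routines can return factors that do not match element by element. The sketch below illustrates this with base R's svd() on a random toy matrix by flipping one pair of vectors by hand; it does not call irlba.

```r
set.seed(3)
A <- matrix(rnorm(20), nrow = 4, ncol = 5)  # toy matrix
s <- svd(A)

# Flip the sign of the first left and right singular vectors together
u_flipped <- s$u
v_flipped <- s$v
u_flipped[, 1] <- -u_flipped[, 1]
v_flipped[, 1] <- -v_flipped[, 1]

A_original <- s$u %*% diag(s$d) %*% t(s$v)
A_flipped  <- u_flipped %*% diag(s$d) %*% t(v_flipped)

# The reconstructed matrix is the same either way
same_product <- isTRUE(all.equal(A_original, A_flipped))

# Comparing absolute values is one simple sign-insensitive check
same_up_to_sign <- isTRUE(all.equal(abs(s$u), abs(u_flipped)))
```

A comparison such as all.equal() on absolute values, or on the reconstructed matrices, is therefore more informative than identical() when checking two decompositions against each other.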


Decomposed matrices

SVD

pander(U.dr, caption = "Decomposed matrices with SVD ")
Decomposed matrices with SVD
-0.3924 -0.269 0.3114 0.06747 0.1642 -0.1725 0.06417
0.3439 0.0252 0.355 0.2532 0.109 0.4585 0.1351
0.0004417 -0.2661 0.5655 0.07521 -0.1356 0.142 -0.1033
-0.08653 -0.0631 0.1114 0.1691 0.185 0.04246 -0.212
0.4997 -0.2442 0.02609 -0.4994 0.5644 -0.2903 -0.099
-0.334 -0.2504 0.1485 -0.02646 0.03746 -0.2227 -0.1491
-0.1703 -0.1242 0.1465 0.08479 0.2947 -0.008915 0.01077
-0.1024 -0.1436 0.3114 -0.2565 -0.06538 0.008825 0.2165
0.03292 0.09681 -0.07565 0.06993 0.0755 0.05332 0.1964
-0.4028 -0.347 -0.4307 -0.1347 0.1282 0.1948 0.3252
0.1258 0.06757 0.15 0.08138 -0.3833 -0.6857 0.3269
0.1968 -0.5408 -0.1333 -0.2501 -0.5595 0.1748 -0.3001
0.1642 -0.1813 0.06443 -0.129 -0.04399 0.1681 0.7073
0.2734 -0.484 -0.2607 0.682 0.1158 -0.1854 0.03804
pander(V.dr, caption = "Decomposed matrices with SVD ")
Decomposed matrices with SVD (continued below)
-0.2117 -0.472 -0.1356 -0.3838 0.08154 0.09921 0.07864 0.2416 0.0007457
0.0404 -0.1269 0.1087 -0.1528 -0.3838 -0.1485 -0.2299 0.03831 -0.007696
0.05336 -0.24 -0.03052 -0.3177 0.1118 0.2036 -0.02203 0.3757 0.005033
0.2677 0.05228 0.2242 -0.1546 0.4225 -0.04878 -0.2861 -0.03601 -0.01377
-0.2203 -0.2209 -0.2982 0.2969 0.08775 -0.1334 0.08035 0.08864 -0.02572
0.08228 -0.2171 0.149 -0.3502 0.03286 0.2778 0.06482 -0.07353 -0.0105
-0.08086 0.02215 0.009157 -0.02578 0.1609 0.4057 0.4718 -0.2547 -0.05027
Table continues below
-0.1545 0.1903 0.4173 0.2744 -0.02074 0.05691 -0.2617 -0.05622 -0.05634
-0.2152 -0.06642 -0.1189 0.1242 -0.1283 0.0894 0.4949 0.1617 0.1001
-0.1744 0.2266 -0.3216 -0.3516 0.09426 0.09421 0.1457 0.1033 0.1961
-0.08197 -0.1038 0.2945 -0.2589 -0.02117 0.05233 0.1012 -0.2739 -0.3428
0.2156 -0.16 0.02778 -0.08511 -0.0758 -0.03614 0.009319 -0.06108 -0.1069
0.4504 -0.4544 0.0822 0.01304 -0.01809 -0.09928 0.1065 0.09112 0.1197
-0.2567 -0.1576 -0.1422 0.113 -0.3612 -0.2821 0.1762 -0.04934 0.02822
0.03362 0.08003 -0.1375 0.09023 -0.05321 0.06112 0.2377
0.03889 -0.1915 0.1261 -0.05214 -0.02759 -0.002164 0.5291
-0.03165 -0.07545 0.3126 0.0447 -0.01845 -0.01169 -0.3719
0.0797 0.1558 0.286 -0.1863 -0.1794 -0.09557 0.1468
-0.4286 -0.06185 0.529 0.07849 0.05069 0.2779 0.1714
-0.1662 -0.01248 -0.06682 -0.2215 0.389 -0.1392 -0.02962
0.131 0.09613 0.1623 0.1104 -0.2737 -0.01328 0.06075
pander(Sigma.dr)
7.033 0 0 0 0 0 0
0 5.523 0 0 0 0 0
0 0 5.087 0 0 0 0
0 0 0 4.18 0 0 0
0 0 0 0 3.631 0 0
0 0 0 0 0 3.1 0
0 0 0 0 0 0 2.52

irlba

par(mfrow = c(1, 2))
pander(U1, caption = "Decomposed matrices with irlba ")
Decomposed matrices with irlba
0.3924 0.269 -0.3114 -0.06747 0.1642 -0.1725
-0.3439 -0.0252 -0.355 -0.2532 0.109 0.4585
-0.0004417 0.2661 -0.5655 -0.07521 -0.1356 0.142
0.08653 0.0631 -0.1114 -0.1691 0.185 0.04246
-0.4997 0.2442 -0.02609 0.4994 0.5644 -0.2903
0.334 0.2504 -0.1485 0.02646 0.03746 -0.2227
0.1703 0.1242 -0.1465 -0.08479 0.2947 -0.008916
0.1024 0.1436 -0.3114 0.2565 -0.06538 0.008825
-0.03292 -0.09681 0.07565 -0.06993 0.0755 0.05332
0.4028 0.347 0.4307 0.1347 0.1282 0.1948
-0.1258 -0.06757 -0.15 -0.08138 -0.3833 -0.6857
-0.1968 0.5408 0.1333 0.2501 -0.5595 0.1748
-0.1642 0.1813 -0.06443 0.129 -0.04399 0.1681
-0.2734 0.484 0.2607 -0.682 0.1158 -0.1854
pander(V1, caption = "Decomposed matrices with irlba ")
Decomposed matrices with irlba (continued below)
0.2117 0.472 0.1356 0.3838 -0.08154 -0.09921 -0.07864 -0.2416 -0.0007457
-0.0404 0.1269 -0.1087 0.1528 0.3838 0.1485 0.2299 -0.03831 0.007696
-0.05336 0.24 0.03052 0.3177 -0.1118 -0.2036 0.02203 -0.3757 -0.005033
-0.2677 -0.05228 -0.2242 0.1546 -0.4225 0.04878 0.2861 0.03601 0.01377
-0.2203 -0.2209 -0.2982 0.2969 0.08775 -0.1334 0.08035 0.08864 -0.02572
0.08228 -0.2171 0.149 -0.3502 0.03286 0.2778 0.06482 -0.07353 -0.0105
Table continues below
0.1545 -0.1903 -0.4173 -0.2744 0.02074 -0.05691 0.2617 0.05622 0.05634
0.2152 0.06642 0.1189 -0.1242 0.1283 -0.0894 -0.4949 -0.1617 -0.1001
0.1744 -0.2266 0.3216 0.3516 -0.09426 -0.09421 -0.1457 -0.1033 -0.1961
0.08197 0.1038 -0.2945 0.2589 0.02117 -0.05233 -0.1012 0.2739 0.3428
0.2156 -0.16 0.02778 -0.08511 -0.0758 -0.03614 0.009319 -0.06108 -0.1069
0.4504 -0.4544 0.0822 0.01304 -0.01809 -0.09928 0.1065 0.09112 0.1197
-0.03362 -0.08003 0.1375 -0.09023 0.05321 -0.06112 -0.2377
-0.03889 0.1915 -0.1261 0.05214 0.02759 0.002164 -0.5291
0.03165 0.07545 -0.3126 -0.0447 0.01845 0.01169 0.3719
-0.0797 -0.1558 -0.286 0.1863 0.1794 0.09557 -0.1468
-0.4286 -0.06185 0.529 0.07849 0.05069 0.2779 0.1714
-0.1662 -0.01248 -0.06682 -0.2215 0.389 -0.1392 -0.02962
pander(diag(Sigma1))
7.033 0 0 0 0 0
0 5.523 0 0 0 0
0 0 5.087 0 0 0
0 0 0 4.18 0 0
0 0 0 0 3.631 0
0 0 0 0 0 3.1