1 Overview

The following is in regard to Assignment 8 for IE 4331-EDA for the Spring 2022 semester at Texas Tech University. All analysis comes from the data for the Fresh Food Now case study found here:

https://raw.githubusercontent.com/Garcia-Ry/IE-4331-EDA/main/SKU%20Master.csv

To run the analysis, the corrplot and Hmisc packages must be installed. They can be installed by entering the following code:
install.packages(c("corrplot", "Hmisc"))

2 Filter Initial Data

Filter the data according to the following:

Consider only those observations for which the Cubic Feet per UOM is greater than zero and less than two, and for which the Weight per UOM is greater than zero and less than 50. Only the factor levels Case (CA), Each (EA), Pallet (PL), and Pound (LB) are admissible for the Units of Measure (UoM); all observations with other designations should be omitted. In addition, all rows with NA should be dropped.
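The filters above can be sketched on a small made-up data frame. The values are illustrative and do not come from the SKU Master file; the column names UomCube, UomWeight, and Uom follow the complete code at the end of this report. The sketch uses the element-wise operator `&`, since `&&` only compares single values.

```r
# Toy data for illustration only; not the SKU Master file.
toy <- data.frame(
  UomCube   = c(0.5, 2.5, 1.2, 0.0, 0.8, NA, 1.9),
  UomWeight = c(10, 20, 60, 5, 30, 12, 49),
  Uom       = c("CA", "EA", "PL", "LB", "BX", "CA", "LB"),
  stringsAsFactors = TRUE
)

# Element-wise conditions; which() drops rows where a condition is NA.
keep <- which(
  toy$UomCube > 0 & toy$UomCube < 2 &
  toy$UomWeight > 0 & toy$UomWeight < 50 &
  toy$Uom %in% c("CA", "EA", "PL", "LB")
)

# Remove any remaining NA rows and unused factor levels.
filtered <- droplevels(na.omit(toy[keep, ]))
print(filtered)
```

Only the rows that pass every condition remain, and `droplevels()` removes the factor levels that no longer occur.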

3 Filter the Dataframe

Filter the dataframe to keep only the variables UnitsPerCase, LeadTime, UoMCube, UoMWeight, and ShelfLifeDays.
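As a rough sketch, the columns can be selected by name rather than by position. The toy data frame below is made up for illustration; its column names mirror those printed in the correlation output later in this report (for example UomCube rather than UoMCube).

```r
# Toy data frame for illustration only; values are invented.
toy <- data.frame(
  SkuNbr = 1:3, UnitsPerCase = c(12, 24, 6), LeadTime = c(7, 14, 3),
  UomCube = c(0.4, 1.1, 0.9), UomWeight = c(10, 25, 18),
  Commodity = c("A", "B", "A"), ShelfLifeDays = c(30, 90, 14)
)

# Keep only the five numeric variables of interest, selected by name.
keep_cols <- c("UnitsPerCase", "LeadTime", "UomCube", "UomWeight", "ShelfLifeDays")
toy2 <- toy[, keep_cols]
str(toy2)
```

Selecting by name keeps working if the column order of the source file changes.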

After completing the given instruction, the table should look similar to:

4 Create a correlation plot for all the variables

Create a correlation plot for all the variables. Show the numerical correlation on this plot and keep only the upper portion.
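One possible way to draw such a plot is sketched below on simulated data (not the SKU Master file). It assumes the corrplot package, whose `type = "upper"` argument keeps the upper triangle and whose `addCoef.col` argument overlays the coefficients; the plot is only drawn if the package is installed.

```r
# Simulated data for illustration; not the SKU Master file.
set.seed(1)
x <- rnorm(100)
toy <- data.frame(a = x, b = x + rnorm(100), c = rnorm(100))

# Pearson correlation matrix of all columns.
m <- cor(toy)
print(round(m, 2))

# Draw only if corrplot is installed, so the sketch still runs without it.
if (requireNamespace("corrplot", quietly = TRUE)) {
  corrplot::corrplot(m, type = "upper", addCoef.col = "black")
}
```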

5 Significance of Correlations

Test for the significance of all correlations using the Pearson method.

##               UnitsPerCase LeadTime UomCube UomWeight ShelfLifeDays
## UnitsPerCase          1.00     0.05    0.01      0.04          0.03
## LeadTime              0.05     1.00   -0.01      0.04          0.04
## UomCube               0.01    -0.01    1.00      0.32          0.01
## UomWeight             0.04     0.04    0.32      1.00         -0.06
## ShelfLifeDays         0.03     0.04    0.01     -0.06          1.00
## 
## n= 5158 
## 
## 
## P
##               UnitsPerCase LeadTime UomCube UomWeight ShelfLifeDays
## UnitsPerCase               0.0004   0.3770  0.0020    0.0394       
## LeadTime      0.0004                0.5436  0.0038    0.0074       
## UomCube       0.3770       0.5436           0.0000    0.5140       
## UomWeight     0.0020       0.0038   0.0000            0.0000       
## ShelfLifeDays 0.0394       0.0074   0.5140  0.0000

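With n = 5158, most pairs are statistically significant at the 0.05 level (all except UomCube paired with UnitsPerCase, LeadTime, or ShelfLifeDays), but their coefficients are at most 0.06 in absolute value. Overall, the only linear correlation of practical significance is between UomCube and UomWeight (r = 0.32).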
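For readers without Hmisc, a matrix of pairwise p-values can be approximated in base R by calling `cor.test()` on each pair of columns. The helper name `pairwise_p` and the simulated data are assumptions made for this sketch; they are not part of the original analysis.

```r
# Hypothetical helper: matrix of Pearson p-values for every pair of columns.
pairwise_p <- function(df) {
  k <- ncol(df)
  p <- matrix(NA_real_, k, k, dimnames = list(names(df), names(df)))
  for (i in seq_len(k - 1)) {
    for (j in (i + 1):k) {
      # The test is symmetric, so fill both triangles with the same value.
      p[i, j] <- p[j, i] <- cor.test(df[[i]], df[[j]])$p.value
    }
  }
  p
}

# Simulated data for illustration; a and b are correlated by construction.
set.seed(42)
x <- rnorm(200)
toy <- data.frame(a = x, b = x + rnorm(200), c = rnorm(200))
p <- pairwise_p(toy)
print(round(p, 4))
```

The diagonal is left as NA because a variable is not tested against itself, which matches the blank diagonal in the P matrix above.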

6 UoMWeight and ShelfLifeDays Pearson’s Relationship

cor.test(dat2$UomWeight, dat2$ShelfLifeDays)
## 
##  Pearson's product-moment correlation
## 
## data:  dat2$UomWeight and dat2$ShelfLifeDays
## t = -4.1199, df = 5156, p-value = 3.849e-05
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
##  -0.08444154 -0.03003769
## sample estimates:
##         cor 
## -0.05728214
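As a check on the output above, the test statistic, p-value, and confidence interval can be recomputed from the reported correlation and sample size alone. This is a sketch of the standard formulas (t-test for a correlation and the Fisher z-transformation), using only the numbers printed above.

```r
r <- -0.05728214  # sample correlation reported above
n <- 5158         # df + 2, with df = 5156 from the output above

# t statistic and two-sided p-value for H0: true correlation is 0
t_stat <- r * sqrt((n - 2) / (1 - r^2))
p_val  <- 2 * pt(-abs(t_stat), df = n - 2)

# 95 percent confidence interval via the Fisher z-transformation
z  <- atanh(r)
se <- 1 / sqrt(n - 3)
ci <- tanh(z + c(-1, 1) * qnorm(0.975) * se)

print(c(t = t_stat, p = p_val))
print(ci)
```

Because the standard error shrinks with the square root of n, even a correlation this close to zero is statistically significant with more than 5000 observations.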

7 UoMWeight and ShelfLifeDays Spearman’s Relationship

cor.test(dat2$UomWeight, dat2$ShelfLifeDays, method = "spearman")
## 
##  Spearman's rank correlation rho
## 
## data:  dat2$UomWeight and dat2$ShelfLifeDays
## S = 2.3311e+10, p-value = 0.1673
## alternative hypothesis: true rho is not equal to 0
## sample estimates:
##         rho 
## -0.01922875
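To illustrate why the Pearson and Spearman results can differ, the sketch below uses a small made-up example. Spearman's rho is the Pearson correlation applied to the ranks, so it depends only on the ordering of the values, while Pearson's r also reacts to curvature and extreme values.

```r
# Toy data for illustration only: y increases with x, but not linearly.
x <- c(1, 2, 3, 4, 5, 6, 7, 8)
y <- exp(x)

pearson  <- cor(x, y)                       # sensitive to the curvature
spearman <- cor(x, y, method = "spearman")  # depends only on the ordering
by_ranks <- cor(rank(x), rank(y))           # Pearson applied to the ranks

print(c(pearson = pearson, spearman = spearman, by_ranks = by_ranks))
```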

8 Comment/compare the findings from questions 4 and 5

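Both tests give a very weak negative correlation between UoMWeight and ShelfLifeDays. The linear (Pearson) correlation is slightly stronger (r = -0.057) and statistically significant (p = 3.849e-05), whereas the monotonic (Spearman) correlation is weaker (rho = -0.019) and not significant (p = 0.1673).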

9 Examine variables UoMWeight and UoMCube for a linear relationship

cor.test(dat2$UomWeight, dat2$UomCube)
## 
##  Pearson's product-moment correlation
## 
## data:  dat2$UomWeight and dat2$UomCube
## t = 24.516, df = 5156, p-value < 2.2e-16
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
##  0.2984546 0.3473426
## sample estimates:
##       cor 
## 0.3231141

10 Examine variables UoMWeight and UoMCube for a monotonic relationship

cor.test(dat2$UomWeight, dat2$UomCube, method = "spearman")
## 
##  Spearman's rank correlation rho
## 
## data:  dat2$UomWeight and dat2$UomCube
## S = 1.5426e+10, p-value < 2.2e-16
## alternative hypothesis: true rho is not equal to 0
## sample estimates:
##       rho 
## 0.3255434
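The S statistic in the Spearman output can be related back to rho. A rough check, assuming the relation S = (n^3 - n)(1 - rho) / 6 and taking n = 5158 from the correlation output earlier in this report:

```r
n   <- 5158       # number of observations
rho <- 0.3255434  # Spearman estimate reported above

# Sum-of-squared-rank-differences statistic implied by rho and n
S <- (n^3 - n) * (1 - rho) / 6
print(signif(S, 5))
```

The result agrees with the reported S to the printed precision; the same relation applied to rho = -0.01922875 gives the S value reported for UoMWeight and ShelfLifeDays.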

11 Comment/compare the findings from questions 7 and 8

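Both tests give a weak positive correlation between UoMWeight and UoMCube of nearly the same size (Pearson r = 0.323, Spearman rho = 0.326), and both are statistically significant (p < 2.2e-16).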

12 Complete R Code

#IE EDA - Assignment 8
# Luis Araiza and Ryan Garcia

###### Creating Data frame from Assignment 3 ######
dat<-read.csv("https://raw.githubusercontent.com/Garcia-Ry/IE-4331-EDA/main/SKU%20Master.csv")
head(dat)
colnames(dat)
colnames(dat)[grepl("SkuNbr$", colnames(dat))]<-"SkuNbr" #Rename SkuNbr column (its name may carry byte-order-mark residue)
colnames(dat)

# dat$SkuNbr=as.numeric(dat$SkuNbr) #SkuNbr is character because of some with letters
dat$Flow=as.factor(dat$Flow)
dat$SkuNbr=suppressWarnings(as.numeric(dat$SkuNbr))
dat$Whs=as.factor(dat$Whs)
dat$Uom=as.factor(dat$Uom)
dat$Commodity=as.factor(dat$Commodity)
head(dat)

#Only Consider if UomCube is greater than zero and less than two
dat<-dat[dat$UomCube>0&dat$UomCube<2,] #element-wise &, not &&
head(dat)
#Only Consider if UomWeight is greater than zero and less than fifty
dat<-dat[dat$UomWeight>0&dat$UomWeight<50,]
head(dat)

#Only consider if Uom is CA, EA, PL, or LB
dat<-dat[dat$Uom=="CA"|dat$Uom=="EA"|dat$Uom=="PL"|dat$Uom=="LB",]
head(dat)
#Omit all NA rows
dat<-na.omit(dat)
head(dat)

dat<-droplevels(dat) #Drops unused factor levels from Dataframe
head(dat)

###### Begin New Material for Assignment 8 ######

## 1. Filter the dataframe to keep only the variables UnitsPerCase, LeadTime, 
##    UoMCube, UoMWeight, and ShelfLifeDays
dat2<-dat[,c("UnitsPerCase","LeadTime","UomCube","UomWeight","ShelfLifeDays")] #select by name (columns 6,7,8,9,11)
head(dat2)

## 2. Create a correlation plot for all the variables.  Show the numerical 
##    correlation on this plot and keep only the upper portion
library(corrplot)
corrplot(cor(dat2),type="upper",addCoef.col="black") #addCoef.col prints the coefficients

## 3. Test for the significance of all correlations
library(Hmisc)
rcorr(as.matrix(dat2))

## 4. Examine variables UoMWeight and ShelfLifeDays for a linear relationship 
##    (Pearson's) and test for the significance of this correlation.
cor.test(dat2$UomWeight, dat2$ShelfLifeDays)

## 5. Examine variables UoMWeight and ShelfLifeDays for a monotonic relationship 
##    (Spearman's) and test for the significance of this correlation. 
cor.test(dat2$UomWeight, dat2$ShelfLifeDays, method = "spearman")

## 6. Comment/compare the findings from questions 4 and 5
#  Both tests give a very weak negative correlation. The Pearson correlation
#  (r = -0.057) is significant; the Spearman correlation (rho = -0.019) is not.

## 7. Examine variables UoMWeight and UoMCube for a linear relationship 
##    (Pearson's) and test for the significance of this correlation.
cor.test(dat2$UomWeight, dat2$UomCube)

## 8. Examine variables UoMWeight and UoMCube for a monotonic relationship 
##    (Spearman's) and test for the significance of this correlation. 
cor.test(dat2$UomWeight, dat2$UomCube, method = "spearman")

## 9. Comment/compare the findings from questions 7 and 8
#  Both tests give a weak, significant positive correlation of similar size
#  (Pearson r = 0.323, Spearman rho = 0.326).