Descriptive Analysis

Calculating the mean and standard deviation of x and y in all four data sets.

For x, all four data sets have the same mean (9) and standard deviation (3.32).

For y, all four data sets have the same mean (7.50) and standard deviation (2.03) to two decimal places.

library(ggplot2)
## Warning: package 'ggplot2' was built under R version 3.1.3
library(plot3D)
## Warning: package 'plot3D' was built under R version 3.1.3
library(scatterplot3d)
## Warning: package 'scatterplot3d' was built under R version 3.1.3
# Load the data from the CSV files
data1 <- read.table(file = "c:/CUNY/project2-sheet1.csv", header = TRUE, sep = ",")
data2 <- read.table(file = "c:/CUNY/project2-sheet2.csv", header = TRUE, sep = ",")
data3 <- read.table(file = "c:/CUNY/project2-sheet3.csv", header = TRUE, sep = ",")
data4 <- read.table(file = "c:/CUNY/project2-sheet4.csv", header = TRUE, sep = ",")



str(data1)
## 'data.frame':    11 obs. of  2 variables:
##  $ x: int  10 8 13 9 11 14 6 4 12 7 ...
##  $ y: num  8.04 6.95 7.58 8.81 8.33 ...
str(data2)
## 'data.frame':    11 obs. of  2 variables:
##  $ x: int  10 8 13 9 11 14 6 4 12 7 ...
##  $ y: num  9.14 8.14 8.74 8.77 9.26 8.1 6.13 3.1 9.13 7.26 ...
str(data3)
## 'data.frame':    11 obs. of  2 variables:
##  $ x: int  10 8 13 9 11 14 6 4 12 7 ...
##  $ y: num  7.46 6.77 12.74 7.11 7.81 ...
str(data4)
## 'data.frame':    11 obs. of  2 variables:
##  $ x: int  8 8 8 8 8 8 8 19 8 8 ...
##  $ y: num  6.58 5.76 7.71 8.84 8.47 7.04 5.25 12.5 5.56 7.91 ...
# Calculate the min, 1st quartile, median, mean, 3rd quartile, and max of each data set
summary(data1)
##        x              y         
##  Min.   : 4.0   Min.   : 4.260  
##  1st Qu.: 6.5   1st Qu.: 6.315  
##  Median : 9.0   Median : 7.580  
##  Mean   : 9.0   Mean   : 7.501  
##  3rd Qu.:11.5   3rd Qu.: 8.570  
##  Max.   :14.0   Max.   :10.840
summary(data2)
##        x              y        
##  Min.   : 4.0   Min.   :3.100  
##  1st Qu.: 6.5   1st Qu.:6.695  
##  Median : 9.0   Median :8.140  
##  Mean   : 9.0   Mean   :7.501  
##  3rd Qu.:11.5   3rd Qu.:8.950  
##  Max.   :14.0   Max.   :9.260
summary(data3)
##        x              y        
##  Min.   : 4.0   Min.   : 5.39  
##  1st Qu.: 6.5   1st Qu.: 6.25  
##  Median : 9.0   Median : 7.11  
##  Mean   : 9.0   Mean   : 7.50  
##  3rd Qu.:11.5   3rd Qu.: 7.98  
##  Max.   :14.0   Max.   :12.74
summary(data4)
##        x            y         
##  Min.   : 8   Min.   : 5.250  
##  1st Qu.: 8   1st Qu.: 6.170  
##  Median : 8   Median : 7.040  
##  Mean   : 9   Mean   : 7.501  
##  3rd Qu.: 8   3rd Qu.: 8.190  
##  Max.   :19   Max.   :12.500
# Data set data1
mean(data1$x)
## [1] 9
sd(data1$x)
## [1] 3.316625
mean(data1$y)
## [1] 7.500909
sd(data1$y)
## [1] 2.031568
# Data set data2
mean(data2$x)
## [1] 9
sd(data2$x)
## [1] 3.316625
mean(data2$y)
## [1] 7.500909
sd(data2$y)
## [1] 2.031657
# Data set data3
mean(data3$x)
## [1] 9
sd(data3$x)
## [1] 3.316625
mean(data3$y)
## [1] 7.5
sd(data3$y)
## [1] 2.030424
# Data set data4
mean(data4$x)
## [1] 9
sd(data4$x)
## [1] 3.316625
mean(data4$y)
## [1] 7.500909
sd(data4$y)
## [1] 2.030579
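
The sixteen separate calls above can also be collected into one table. The following is a minimal sketch, not the code used in this report. It assumes the four CSV sheets hold the same values as R's built-in anscombe data frame, which the str() output above suggests but does not prove. The names sets and stats are introduced here for illustration only.

```r
# Assumption: the four CSV sheets match R's built-in `anscombe` data frame,
# so the data sets are rebuilt from it instead of being read from disk.
sets <- list(
  data1 = data.frame(x = anscombe$x1, y = anscombe$y1),
  data2 = data.frame(x = anscombe$x2, y = anscombe$y2),
  data3 = data.frame(x = anscombe$x3, y = anscombe$y3),
  data4 = data.frame(x = anscombe$x4, y = anscombe$y4)
)

# One row per data set, one column per statistic
stats <- t(sapply(sets, function(d) {
  c(mean_x = mean(d$x), sd_x = sd(d$x),
    mean_y = mean(d$y), sd_y = sd(d$y))
}))

print(round(stats, 2))
```

Putting the data frames in a list means each statistic is written once and applied to all four sets, which makes it easier to compare them side by side.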

Graphical Analysis

The four histograms show similar distributions of y given x, and the graphs look very similar. Some data are missing in x for data2 (between 5 and 6), data3 (between 10 and 12), and data4 (between 10 and 12). However, the distributions are still the same.
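
The plotting code is hidden in this report, so the following is only a sketch of how four such histograms could be drawn with base R graphics. It is not the code that produced the plots discussed above. It assumes the data match R's built-in anscombe data frame and plots y for each set. The names sets and hists are hypothetical.

```r
# Assumption: the four CSV sheets match R's built-in `anscombe` data frame.
sets <- list(
  data1 = data.frame(x = anscombe$x1, y = anscombe$y1),
  data2 = data.frame(x = anscombe$x2, y = anscombe$y2),
  data3 = data.frame(x = anscombe$x3, y = anscombe$y3),
  data4 = data.frame(x = anscombe$x4, y = anscombe$y4)
)

# Arrange the four histograms in a 2 x 2 grid and restore the layout afterwards
op <- par(mfrow = c(2, 2))
hists <- lapply(names(sets), function(nm) {
  hist(sets[[nm]]$y, main = nm, xlab = "y")
})
par(op)
```

hist() returns the bin breaks and counts invisibly, so the hists list can be inspected to check empty bins numerically rather than by eye.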

Note that the echo = FALSE parameter was added to the code chunk to prevent printing of the R code that generated the plot.