Task:
1. Load the mtcars dataset.
2. Use the exploratory functions you learned today to get an initial view.
3. Draw histograms of the variables. Are they normally distributed? Feel free to play around with the function parameters to make the plots more beautiful. Hint: check the documentation of the function to see which parameters you can control.
4. Plot several plots next to each other.
5. Run the Shapiro-Wilk test to check that what you see is right.

1. Load the mtcars dataset.

library(datasets)
data(mtcars)

2. Use the exploratory functions you learned today to get an initial view.

head(mtcars)
##                    mpg cyl disp  hp drat    wt  qsec vs am gear carb
## Mazda RX4         21.0   6  160 110 3.90 2.620 16.46  0  1    4    4
## Mazda RX4 Wag     21.0   6  160 110 3.90 2.875 17.02  0  1    4    4
## Datsun 710        22.8   4  108  93 3.85 2.320 18.61  1  1    4    1
## Hornet 4 Drive    21.4   6  258 110 3.08 3.215 19.44  1  0    3    1
## Hornet Sportabout 18.7   8  360 175 3.15 3.440 17.02  0  0    3    2
## Valiant           18.1   6  225 105 2.76 3.460 20.22  1  0    3    1
summary(mtcars)
##       mpg             cyl             disp             hp       
##  Min.   :10.40   Min.   :4.000   Min.   : 71.1   Min.   : 52.0  
##  1st Qu.:15.43   1st Qu.:4.000   1st Qu.:120.8   1st Qu.: 96.5  
##  Median :19.20   Median :6.000   Median :196.3   Median :123.0  
##  Mean   :20.09   Mean   :6.188   Mean   :230.7   Mean   :146.7  
##  3rd Qu.:22.80   3rd Qu.:8.000   3rd Qu.:326.0   3rd Qu.:180.0  
##  Max.   :33.90   Max.   :8.000   Max.   :472.0   Max.   :335.0  
##       drat             wt             qsec             vs        
##  Min.   :2.760   Min.   :1.513   Min.   :14.50   Min.   :0.0000  
##  1st Qu.:3.080   1st Qu.:2.581   1st Qu.:16.89   1st Qu.:0.0000  
##  Median :3.695   Median :3.325   Median :17.71   Median :0.0000  
##  Mean   :3.597   Mean   :3.217   Mean   :17.85   Mean   :0.4375  
##  3rd Qu.:3.920   3rd Qu.:3.610   3rd Qu.:18.90   3rd Qu.:1.0000  
##  Max.   :4.930   Max.   :5.424   Max.   :22.90   Max.   :1.0000  
##        am              gear            carb      
##  Min.   :0.0000   Min.   :3.000   Min.   :1.000  
##  1st Qu.:0.0000   1st Qu.:3.000   1st Qu.:2.000  
##  Median :0.0000   Median :4.000   Median :2.000  
##  Mean   :0.4062   Mean   :3.688   Mean   :2.812  
##  3rd Qu.:1.0000   3rd Qu.:4.000   3rd Qu.:4.000  
##  Max.   :1.0000   Max.   :5.000   Max.   :8.000
str(mtcars)
## 'data.frame':    32 obs. of  11 variables:
##  $ mpg : num  21 21 22.8 21.4 18.7 18.1 14.3 24.4 22.8 19.2 ...
##  $ cyl : num  6 6 4 6 8 6 8 4 4 6 ...
##  $ disp: num  160 160 108 258 360 ...
##  $ hp  : num  110 110 93 110 175 105 245 62 95 123 ...
##  $ drat: num  3.9 3.9 3.85 3.08 3.15 2.76 3.21 3.69 3.92 3.92 ...
##  $ wt  : num  2.62 2.88 2.32 3.21 3.44 ...
##  $ qsec: num  16.5 17 18.6 19.4 17 ...
##  $ vs  : num  0 0 1 1 0 1 0 1 1 1 ...
##  $ am  : num  1 1 1 0 0 0 0 0 0 0 ...
##  $ gear: num  4 4 4 3 3 3 3 4 4 4 ...
##  $ carb: num  4 4 1 1 2 1 4 2 2 4 ...
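
Because str() reports every column as num, it can help to count the distinct values per column to see which variables are really discrete codes. The following is a minimal sketch in base R; the names n_distinct_values and discrete_vars and the cut-off of 6 are assumptions made for illustration and are not part of the exercise.

```r
# Count the distinct values in each column of mtcars.
n_distinct_values <- sapply(mtcars, function(x) length(unique(x)))
print(n_distinct_values)

# Columns with only a handful of distinct values behave like categories,
# even though str() lists them as numeric.
# The cut-off of 6 is an arbitrary choice for this illustration.
discrete_vars <- names(n_distinct_values)[n_distinct_values <= 6]
print(discrete_vars)
```

For mtcars this flags cyl, vs, am, gear and carb, which is worth keeping in mind when reading their histograms and normality tests.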

3. Draw histograms of the variables. Are they normally distributed? Feel free to play around with the function parameters to make the plots more beautiful. Hint: check the documentation of the function to see which parameters you can control.
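
As one possible illustration of the hint, the sketch below sets a few of the cosmetic arguments that hist() accepts (see ?hist). The colours, titles and number of breaks are arbitrary choices, and the object names h_mpg, h_wt and old_par are introduced here only for the example.

```r
# Draw two histograms side by side with a few cosmetic arguments set.
old_par <- par(mfrow = c(1, 2))   # 1 row, 2 columns; keep the old settings

h_mpg <- hist(mtcars$mpg,
              breaks = 10,                 # suggested number of bins
              col    = "steelblue",        # bar fill colour
              border = "white",            # bar outline colour
              main   = "Miles per gallon", # plot title
              xlab   = "mpg")              # x-axis label

h_wt <- hist(mtcars$wt,
             breaks = 10,
             col    = "darkorange",
             border = "white",
             main   = "Weight",
             xlab   = "wt (1000 lbs)")

par(old_par)                      # restore the previous layout
```

Note that a single number passed to breaks is only a suggestion: hist() rounds the bin edges to "pretty" values, so the actual number of bars may differ.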

4. Plot several plots next to each other.

5. Run the Shapiro-Wilk test to check that what you see is right.

par(mar = c(1, 1, 1, 1))

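# Attach mtcars so its columns can be referenced by name (mpg, cyl, ...)
attach(mtcars)
# 2 x 3 grid: histograms in the top row, Q-Q plots in the bottom row
par(mfrow = c(2, 3), mai = c(1, 0.1, 0.1, 0.1))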
hist(mpg)
hist(cyl)
hist(disp)
qqnorm(mpg); qqline(mpg, col = 2)
qqnorm(cyl); qqline(cyl, col = 2)
qqnorm(disp); qqline(disp, col = 2)

shapiro.test(mpg)
## 
##  Shapiro-Wilk normality test
## 
## data:  mpg
## W = 0.9476, p-value = 0.1229
shapiro.test(cyl)
## 
##  Shapiro-Wilk normality test
## 
## data:  cyl
## W = 0.7533, p-value = 6.058e-06
shapiro.test(disp)
## 
##  Shapiro-Wilk normality test
## 
## data:  disp
## W = 0.92, p-value = 0.02081
par(mfrow=c(2,3))
hist(hp)
hist(drat)
hist(wt)
qqnorm(hp); qqline(hp, col = 2)
qqnorm(drat); qqline(drat, col = 2)
qqnorm(wt); qqline(wt, col = 2)

shapiro.test(hp)
## 
##  Shapiro-Wilk normality test
## 
## data:  hp
## W = 0.9334, p-value = 0.04881
shapiro.test(drat)
## 
##  Shapiro-Wilk normality test
## 
## data:  drat
## W = 0.9459, p-value = 0.1101
shapiro.test(wt)
## 
##  Shapiro-Wilk normality test
## 
## data:  wt
## W = 0.9433, p-value = 0.09265
par(mfrow=c(2,3), mai = c(1, 0.1, 0.1, 0.1))
hist(qsec)
hist(vs)
hist(am)
qqnorm(qsec); qqline(qsec, col = 2)
qqnorm(vs); qqline(vs, col = 2)
qqnorm(am); qqline(am, col = 2)

shapiro.test(qsec)
## 
##  Shapiro-Wilk normality test
## 
## data:  qsec
## W = 0.9733, p-value = 0.5935
shapiro.test(vs)
## 
##  Shapiro-Wilk normality test
## 
## data:  vs
## W = 0.6323, p-value = 9.737e-08
shapiro.test(am)
## 
##  Shapiro-Wilk normality test
## 
## data:  am
## W = 0.6251, p-value = 7.836e-08
par(mfrow=c(2,2))
hist(carb)
qqnorm(carb); qqline(carb, col = 2)
hist(gear)
qqnorm(gear); qqline(gear, col = 2)

shapiro.test(carb)
## 
##  Shapiro-Wilk normality test
## 
## data:  carb
## W = 0.8511, p-value = 0.0004382
shapiro.test(gear)
## 
##  Shapiro-Wilk normality test
## 
## data:  gear
## W = 0.7728, p-value = 1.307e-05
# mtcars is still attached from the first plotting step, so no second attach() is needed
par(mfrow=c(2,2), mai = c(1, 0.1, 0.1, 0.1))
plot(wt,mpg)
plot(wt,qsec)
plot(drat,wt)

Conclusions:

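The Shapiro-Wilk p-values for ‘mpg’ (0.1229), ‘drat’ (0.1101), ‘wt’ (0.09265) and ‘qsec’ (0.5935) are above 0.05, so normality is not rejected and we can treat these variables as approximately normally distributed. For ‘disp’ (0.02081) and ‘hp’ (0.04881) the p-value is below 0.05, so they are not normally distributed. The same holds, with much smaller p-values, for ‘cyl’, ‘vs’, ‘am’, ‘gear’ and ‘carb’. These variables take only a few distinct values, so a normal distribution is not a sensible model for them in the first place.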
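
The eleven separate tests above can also be collected into one table, which makes it easier to compare every p-value against 0.05 at a glance. This is a minimal sketch in base R; the names p_values and normality are introduced here for illustration, and 0.05 is the conventional threshold used in this exercise.

```r
# Run the Shapiro-Wilk test on every column and keep only the p-values.
p_values <- sapply(mtcars, function(x) shapiro.test(x)$p.value)

# Compare each p-value with the 0.05 threshold.
normality <- data.frame(
  p_value           = signif(p_values, 4),
  rejects_normality = p_values < 0.05
)
print(normality)
```

Each row of the printed table corresponds to one variable, so the p-values can be checked against the individual shapiro.test() outputs shown earlier.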