Case study analysis on ‘mtcars’ dataset

Motor Trend Car Road Tests

Description:

The data was extracted from the 1974 Motor Trend US magazine, and comprises fuel consumption and 10 aspects of automobile design and performance for 32 automobiles (1973–74 models).

A data frame with 32 observations on 11 (numeric) variables.

[, 1] mpg Miles/(US) gallon

[, 2] cyl Number of cylinders

[, 3] disp Displacement (cu.in.)

[, 4] hp Gross horsepower

[, 5] drat Rear axle ratio

[, 6] wt Weight (1000 lbs)

[, 7] qsec 1/4 mile time

[, 8] vs Engine (0 = V-shaped, 1 = straight)

[, 9] am Transmission (0 = automatic, 1 = manual)

[,10] gear Number of forward gears

[,11] carb Number of carburetors

Let us analyze the mtcars dataset:

Step 1: Load the dataset

data=mtcars

Step 2: Let us view the dataset

View(data)

#Load the dataset into the global environment first

#The ‘V’ in View() must be capital, and the dataset name must match exactly (i.e., Data is not the same as data)

#This is because R is case sensitive
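View() opens a spreadsheet-style viewer, which needs an interactive session such as RStudio. As a minimal sketch for sessions without a viewer, the same first look can be taken in the console with head() and dim(). The names first_rows and size are only illustrative.

```r
# head() prints the first rows in the console and works in any R session
data <- mtcars
first_rows <- head(data, 3)   # first three cars
print(first_rows)

# dim() returns the number of rows and the number of columns
size <- dim(data)
print(size)
```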

Step 3: Let us check the structure of the dataset

#To check the structure and the types of the variables (i.e., whether they are numeric, character, etc.)

str(data)
## 'data.frame':    32 obs. of  11 variables:
##  $ mpg : num  21 21 22.8 21.4 18.7 18.1 14.3 24.4 22.8 19.2 ...
##  $ cyl : num  6 6 4 6 8 6 8 4 4 6 ...
##  $ disp: num  160 160 108 258 360 ...
##  $ hp  : num  110 110 93 110 175 105 245 62 95 123 ...
##  $ drat: num  3.9 3.9 3.85 3.08 3.15 2.76 3.21 3.69 3.92 3.92 ...
##  $ wt  : num  2.62 2.88 2.32 3.21 3.44 ...
##  $ qsec: num  16.5 17 18.6 19.4 17 ...
##  $ vs  : num  0 0 1 1 0 1 0 1 1 1 ...
##  $ am  : num  1 1 1 0 0 0 0 0 0 0 ...
##  $ gear: num  4 4 4 3 3 3 3 4 4 4 ...
##  $ carb: num  4 4 1 1 2 1 4 2 2 4 ...

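Step 4: Let us develop a model - Simple regression (one dependent variable and one independent variable)

#Simple regression describes how the dependent variable changes with the independent variable; it shows an association, which is not necessarily cause and effect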

Dependent = Miles/(US) gallon

Independent = Number of cylinders

The regression equation is y = a + bx, explained in detail below:

y = dependent variable, x = independent variable

a = constant (y-intercept): the point where the line intersects the y-axis (i.e., the value of y when x = 0)

b = constant (slope): the rate of change of y with respect to x (i.e., how much the value of y changes for every unit increase in x)

Graphic representation:

If b is positive the line slopes upward, if b is negative it slopes downward, and if b = 0 the line is horizontal.

The equation is interpreted as follows: for any given value of x, the value of y is calculated as a + b(x).
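To make the roles of a and b concrete, here is a minimal sketch on a small made-up dataset (the x and y values are hypothetical and lie exactly on the line y = 1 + 2x). It computes the slope and intercept with the usual least-squares formulas and compares them with what lm() returns.

```r
# Toy data lying exactly on the line y = 1 + 2x (hypothetical values, for illustration only)
x <- c(1, 2, 3, 4)
y <- c(3, 5, 7, 9)

# Least-squares estimates of the slope (b) and the intercept (a)
b <- cov(x, y) / var(x)
a <- mean(y) - b * mean(x)

# lm() should recover the same two numbers
fit <- lm(y ~ x)
print(c(a = a, b = b))
print(coef(fit))

# Value of y for a given x, computed as a + b * x
x_new <- 5
y_new <- a + b * x_new
print(y_new)
```

Because b is positive here, the line slopes upward; a negative b would give a downward slope.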

Let's create a model with the variables set out above. Before that, bring the variables into the global environment.

mpg <- mtcars$mpg
cyl <- mtcars$cyl
model1 <- lm(mpg~cyl)
model1
## 
## Call:
## lm(formula = mpg ~ cyl)
## 
## Coefficients:
## (Intercept)          cyl  
##      37.885       -2.876

#The $ symbol is used to extract a variable (column) from the data frame

#lm is used to fit a linear regression model

#The dependent variable comes first in the formula, to the left of ~ (i.e., the value of y)
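The fitted line can also be drawn on top of the data. The following is a minimal sketch using base R graphics; it refits the same model with the data argument instead of global variables, and the axis labels are my own choice.

```r
# Fit the same model, passing the data frame explicitly
fit <- lm(mpg ~ cyl, data = mtcars)

# Scatter plot of the observations
plot(mtcars$cyl, mtcars$mpg,
     xlab = "Number of cylinders", ylab = "Miles/(US) gallon")

# abline() draws the line a + b * x using the fitted coefficients
abline(fit)
```

Since the slope is negative, the line runs downward from left to right.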

Interpretation:

Hypothetically, with zero cylinders the mpg would be 37.885. In simple words, if a car could hypothetically run with no cylinders, it would give 37.885 mpg. With each additional cylinder, the car's predicted mileage decreases by 2.876 mpg.
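The interpretation above can be checked by hand. This sketch pulls the two coefficients out of the model with coef() and applies a + b(x) directly; the names mpg_4 and mpg_6 are only illustrative.

```r
fit <- lm(mpg ~ cyl, data = mtcars)

a <- coef(fit)[["(Intercept)"]]   # about 37.885
b <- coef(fit)[["cyl"]]           # about -2.876

# Predicted mpg for a 4-cylinder and a 6-cylinder car, computed by hand
mpg_4 <- a + b * 4
mpg_6 <- a + b * 6

# Two extra cylinders change the prediction by 2 * b
print(c(mpg_4 = mpg_4, mpg_6 = mpg_6, difference = mpg_6 - mpg_4))
```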

Step 5: Let us predict the mpg of the cars

predict(model1, newdata=data.frame(cyl=0))
##        1 
## 37.88458
predict(model1, newdata=data.frame(cyl=1))
##        1 
## 35.00879
predict(model1, newdata=data.frame(cyl=2))
##      1 
## 32.133
predict(model1, newdata=data.frame(cyl=3))
##        1 
## 29.25721
predict(model1, newdata=data.frame(cyl=4))
##        1 
## 26.38142
predict(model1, newdata=data.frame(cyl=5))
##        1 
## 23.50563
predict(model1, newdata=data.frame(cyl=6))
##        1 
## 20.62984
predict(model1, newdata=data.frame(cyl=7))
##        1 
## 17.75405
predict(model1, newdata=data.frame(cyl=8))
##        1 
## 14.87826

#As the number of cylinders increases, the predicted mpg decreases
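The nine separate calls can be written as one, since predict() accepts a data frame with several rows. This is a sketch, with new_cars and predicted_mpg as illustrative names. It also lists the cylinder counts that actually occur in mtcars, because predictions for other values (such as 0 or 1 cylinders) are extrapolations beyond the observed data.

```r
fit <- lm(mpg ~ cyl, data = mtcars)

# One row per cylinder count from 0 to 8
new_cars <- data.frame(cyl = 0:8)
new_cars$predicted_mpg <- predict(fit, newdata = new_cars)
print(new_cars)

# Cylinder counts that actually occur in the data
observed_cyl <- sort(unique(mtcars$cyl))
print(observed_cyl)
```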

Step 6: Let us develop a model - multiple regression

Dependent: Miles/(US) gallon

Independent: ‘Number of cylinders’ and ‘Transmission (0 = automatic, 1 = manual)’

Fit the multiple regression model and print its output

am <- mtcars$am
model2 <- lm(mpg~cyl+am)
model2
## 
## Call:
## lm(formula = mpg ~ cyl + am)
## 
## Coefficients:
## (Intercept)          cyl           am  
##      34.522       -2.501        2.567

Interpretation:

In this model, we have added the transmission variable to the number of cylinders. Hypothetically, if the car is automatic and the number of cylinders is zero, the mileage would be 34.522 mpg. If everything else remains constant, each additional cylinder decreases the mileage by 2.501 mpg. If everything else remains constant, a manual car has a mileage 2.567 mpg higher than an automatic one.
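The "everything else remains constant" reading can be illustrated with predict(). This sketch compares two hypothetical cars with the same number of cylinders, one automatic and one manual; the gap between their predictions should equal the am coefficient. The names cars, pred and diff_manual are only illustrative.

```r
fit2 <- lm(mpg ~ cyl + am, data = mtcars)

# Two hypothetical 4-cylinder cars: automatic (am = 0) and manual (am = 1)
cars <- data.frame(cyl = c(4, 4), am = c(0, 1))
pred <- predict(fit2, newdata = cars)
print(pred)

# Difference between manual and automatic at the same cylinder count
diff_manual <- unname(pred[2] - pred[1])
print(diff_manual)
print(coef(fit2)[["am"]])   # the am coefficient, for comparison
```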

The End