I downloaded an automobile .csv file that contains information such as cylinders, displacement, horsepower, weight, model, and origin of the cars. I also uploaded the .csv file to my GitHub, where it can be found at https://github.com/karkigv/Cars/blob/main/Automobile_data.csv

print("Question Number: 2 (a)")

df <- read.csv("Automobile_data.csv")
#This is the mean of the length of the cars.
mean(df$length)
## [1] 174.0493
#This is the standard deviation of the length of the cars.
sd(df$length)
## [1] 12.33729
#This is the five-number summary (plus the mean) of the length of the cars.
summary(df$length)
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##   141.1   166.3   173.2   174.0   183.1   208.1
#This is the histogram showing the length of the cars.
hist(df$length,
     main="Length of the car (Positively Skewed Distribution)",
     col="pink",
     probability=TRUE)

#This is the boxplot of the length of the car.
boxplot(df$length, main="Boxplot of the length of the car", xlab="Quartile", ylab="Length", col="blue")

qqnorm(df$length, col="Purple")
qqline(df$length, col="Blue")

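Yes. There are outliers: in a boxplot these are the points drawn beyond the whiskers, more than 1.5 × IQR from the quartiles.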
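The outlier claim can be checked numerically as well as visually. The sketch below applies the 1.5 × IQR rule to a small vector of made-up lengths (`x` is hypothetical, not taken from the .csv file) and compares the result with `boxplot.stats()`, which is what `boxplot()` uses to decide which points to draw beyond the whiskers.

```r
# Hypothetical lengths for illustration only; swap in df$length for the real data.
x <- c(150, 160, 165, 170, 172, 175, 180, 185, 230)

# Quartiles and interquartile range
q <- quantile(x, c(0.25, 0.75))
iqr <- unname(q[2] - q[1])

# Fences: values outside [lower, upper] are flagged as outliers
lower <- unname(q[1]) - 1.5 * iqr
upper <- unname(q[2]) + 1.5 * iqr
outliers <- x[x < lower | x > upper]
print(outliers)

# boxplot.stats() applies the same rule, using hinges instead of quantiles
print(boxplot.stats(x)$out)
```

On the real data, `boxplot.stats(df$length)$out` would list the flagged lengths. Hinges and quantiles can differ slightly, so the two approaches may disagree for values that sit right on a fence.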

Question Number:2 (b) This scatterplot displays the relationship between the length of the cars and the price of the cars.

cr<-read.csv("Automobile_data.csv")
plot(cr$length, cr$price)
## Warning in xy.coords(x, y, xlabel, ylabel, log): NAs introduced by coercion
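The warning suggests that the price column was read as text, so some entries could not be converted to numbers. The sketch below shows one way to handle this, assuming missing prices are marked with "?" (an assumption, not confirmed by the output above). The data frame `cr_demo` and its values are hypothetical.

```r
# Hypothetical toy data: price stored as text, with "?" assumed to mark a missing value
cr_demo <- data.frame(
  length = c(168.8, 171.2, 176.6, 192.7),
  price  = c("13495", "16500", "?", "17450"),
  stringsAsFactors = FALSE
)

# Turn the placeholder into NA first, then convert; this avoids the coercion warning
cr_demo$price[cr_demo$price == "?"] <- NA
cr_demo$price <- as.numeric(cr_demo$price)

# Correlation coefficient, ignoring rows with a missing price
r <- cor(cr_demo$length, cr_demo$price, use = "complete.obs")
print(r)
```

If the placeholder really is "?", passing `na.strings = "?"` to `read.csv()` would do the same conversion while the file is read, and `plot(cr$length, cr$price)` would then run without the warning.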

Question Number:2(c) The table below shows the categorical variable make along with the number of cars of each make in the .csv file:

table(df$make)
## 
##   alfa-romero          audi           bmw     chevrolet         dodge 
##             3             7             8             3             9 
##         honda         isuzu        jaguar         mazda mercedes-benz 
##            13             4             3            17             8 
##       mercury    mitsubishi        nissan        peugot      plymouth 
##             1            13            18            11             7 
##       porsche       renault          saab        subaru        toyota 
##             5             2             6            12            32 
##    volkswagen         volvo 
##            12            11

This is the relative frequency table of the car makes in the .csv file.

table(df$make)/length(df$make)
## 
##   alfa-romero          audi           bmw     chevrolet         dodge 
##   0.014634146   0.034146341   0.039024390   0.014634146   0.043902439 
##         honda         isuzu        jaguar         mazda mercedes-benz 
##   0.063414634   0.019512195   0.014634146   0.082926829   0.039024390 
##       mercury    mitsubishi        nissan        peugot      plymouth 
##   0.004878049   0.063414634   0.087804878   0.053658537   0.034146341 
##       porsche       renault          saab        subaru        toyota 
##   0.024390244   0.009756098   0.029268293   0.058536585   0.156097561 
##    volkswagen         volvo 
##   0.058536585   0.053658537
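Dividing the counts by the total is one way to get relative frequencies. Base R's `prop.table()` gives the same result, as this sketch on a small made-up vector shows (`makes` is hypothetical data, not the .csv file).

```r
# Hypothetical makes for illustration only
makes <- c("audi", "bmw", "audi", "toyota", "toyota", "toyota", "bmw", "toyota")

counts <- table(makes)

# Two equivalent ways to compute relative frequencies
rel_manual <- counts / length(makes)
rel_freq <- prop.table(counts)

# Rounding makes the table easier to read
print(round(rel_freq, 3))
```

On the real data this would be `round(prop.table(table(df$make)), 3)`.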

Question Number:2(d) This is a two-way table with the make in the rows and the number of vehicles that run on diesel or gas in the columns.

two_way_table <-table(df$make, df$fuel.type)
two_way_table
##                
##                 diesel gas
##   alfa-romero        0   3
##   audi               0   7
##   bmw                0   8
##   chevrolet          0   3
##   dodge              0   9
##   honda              0  13
##   isuzu              0   4
##   jaguar             0   3
##   mazda              2  15
##   mercedes-benz      4   4
##   mercury            0   1
##   mitsubishi         0  13
##   nissan             1  17
##   peugot             5   6
##   plymouth           0   7
##   porsche            0   5
##   renault            0   2
##   saab               0   6
##   subaru             0  12
##   toyota             3  29
##   volkswagen         4   8
##   volvo              1  10

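The table shows that gas-powered cars dominate: 185 of the 205 cars run on gas and only 20 run on diesel. Diesel cars appear in just 7 of the 22 makes (mazda, mercedes-benz, nissan, peugot, toyota, volkswagen and volvo), while the other 15 makes have gas cars only. So fuel type is related to make for only a few manufacturers, possibly because most manufacturers favor gas-powered cars, although the table alone cannot show the reason.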
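Raw counts are hard to compare across makes of different sizes, so row proportions can make the relationship easier to see. The sketch below uses `prop.table()` with `margin = 1` on a small made-up data frame (`cars_demo` is hypothetical, not the .csv file).

```r
# Hypothetical toy data for illustration only
cars_demo <- data.frame(
  make      = c("mazda", "mazda", "mazda", "mazda", "peugot", "peugot", "honda", "honda"),
  fuel.type = c("diesel", "gas", "gas", "gas", "diesel", "gas", "gas", "gas")
)

tab <- table(cars_demo$make, cars_demo$fuel.type)

# margin = 1 divides each row by its row total, giving the fuel split within each make
row_share <- prop.table(tab, margin = 1)
print(round(row_share, 2))
```

On the real data, `prop.table(two_way_table, margin = 1)` would give the share of diesel and gas cars within each make.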

Question Number:2(e) Side-by-side boxplots for a categorical and a quantitative variable. Here, I have chosen engine size, which is a quantitative variable, and make, which is a categorical variable.

boxplot(df$engine.size ~ df$make, col="yellow", main="Engine size by make", xlab="Make", ylab="Engine size")
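A numeric summary by group can accompany the side-by-side boxplots. The sketch below uses `tapply()` to compute the median engine size per make on a small made-up data frame (`engine_demo` and its values are hypothetical, not the .csv file).

```r
# Hypothetical toy data for illustration only
engine_demo <- data.frame(
  make        = c("bmw", "bmw", "bmw", "honda", "honda", "honda"),
  engine.size = c(108, 164, 209, 79, 92, 110)
)

# Median engine size within each make (the line drawn inside each box)
med <- tapply(engine_demo$engine.size, engine_demo$make, median)

# Largest medians first
print(sort(med, decreasing = TRUE))
```

On the real data this would be `tapply(df$engine.size, df$make, median)`.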