# Part 1: mtcars dataset

hist(mtcars$mpg)

summary(mtcars$mpg)
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##   10.40   15.43   19.20   20.09   22.80   33.90
plot(mtcars$mpg,mtcars$hp)

# Formula interface: one box of mpg per cylinder count
boxplot(mpg ~ cyl, data = mtcars)

dfLow <- mtcars[(mtcars$mpg<=mean(mtcars$mpg)), ]
dfLow
##                      mpg cyl  disp  hp drat    wt  qsec vs am gear carb
## Hornet Sportabout   18.7   8 360.0 175 3.15 3.440 17.02  0  0    3    2
## Valiant             18.1   6 225.0 105 2.76 3.460 20.22  1  0    3    1
## Duster 360          14.3   8 360.0 245 3.21 3.570 15.84  0  0    3    4
## Merc 280            19.2   6 167.6 123 3.92 3.440 18.30  1  0    4    4
## Merc 280C           17.8   6 167.6 123 3.92 3.440 18.90  1  0    4    4
## Merc 450SE          16.4   8 275.8 180 3.07 4.070 17.40  0  0    3    3
## Merc 450SL          17.3   8 275.8 180 3.07 3.730 17.60  0  0    3    3
## Merc 450SLC         15.2   8 275.8 180 3.07 3.780 18.00  0  0    3    3
## Cadillac Fleetwood  10.4   8 472.0 205 2.93 5.250 17.98  0  0    3    4
## Lincoln Continental 10.4   8 460.0 215 3.00 5.424 17.82  0  0    3    4
## Chrysler Imperial   14.7   8 440.0 230 3.23 5.345 17.42  0  0    3    4
## Dodge Challenger    15.5   8 318.0 150 2.76 3.520 16.87  0  0    3    2
## AMC Javelin         15.2   8 304.0 150 3.15 3.435 17.30  0  0    3    2
## Camaro Z28          13.3   8 350.0 245 3.73 3.840 15.41  0  0    3    4
## Pontiac Firebird    19.2   8 400.0 175 3.08 3.845 17.05  0  0    3    2
## Ford Pantera L      15.8   8 351.0 264 4.22 3.170 14.50  0  1    5    4
## Ferrari Dino        19.7   6 145.0 175 3.62 2.770 15.50  0  1    5    6
## Maserati Bora       15.0   8 301.0 335 3.54 3.570 14.60  0  1    5    8
dfHigh <- mtcars[(mtcars$mpg>mean(mtcars$mpg)), ]
dfHigh
##                 mpg cyl  disp  hp drat    wt  qsec vs am gear carb
## Mazda RX4      21.0   6 160.0 110 3.90 2.620 16.46  0  1    4    4
## Mazda RX4 Wag  21.0   6 160.0 110 3.90 2.875 17.02  0  1    4    4
## Datsun 710     22.8   4 108.0  93 3.85 2.320 18.61  1  1    4    1
## Hornet 4 Drive 21.4   6 258.0 110 3.08 3.215 19.44  1  0    3    1
## Merc 240D      24.4   4 146.7  62 3.69 3.190 20.00  1  0    4    2
## Merc 230       22.8   4 140.8  95 3.92 3.150 22.90  1  0    4    2
## Fiat 128       32.4   4  78.7  66 4.08 2.200 19.47  1  1    4    1
## Honda Civic    30.4   4  75.7  52 4.93 1.615 18.52  1  1    4    2
## Toyota Corolla 33.9   4  71.1  65 4.22 1.835 19.90  1  1    4    1
## Toyota Corona  21.5   4 120.1  97 3.70 2.465 20.01  1  0    3    1
## Fiat X1-9      27.3   4  79.0  66 4.08 1.935 18.90  1  1    4    1
## Porsche 914-2  26.0   4 120.3  91 4.43 2.140 16.70  0  1    5    2
## Lotus Europa   30.4   4  95.1 113 3.77 1.513 16.90  1  1    5    2
## Volvo 142E     21.4   4 121.0 109 4.11 2.780 18.60  1  1    4    2
mtcars.stacked <- rbind(dfLow, dfHigh)
mtcars.stacked
##                      mpg cyl  disp  hp drat    wt  qsec vs am gear carb
## Hornet Sportabout   18.7   8 360.0 175 3.15 3.440 17.02  0  0    3    2
## Valiant             18.1   6 225.0 105 2.76 3.460 20.22  1  0    3    1
## Duster 360          14.3   8 360.0 245 3.21 3.570 15.84  0  0    3    4
## Merc 280            19.2   6 167.6 123 3.92 3.440 18.30  1  0    4    4
## Merc 280C           17.8   6 167.6 123 3.92 3.440 18.90  1  0    4    4
## Merc 450SE          16.4   8 275.8 180 3.07 4.070 17.40  0  0    3    3
## Merc 450SL          17.3   8 275.8 180 3.07 3.730 17.60  0  0    3    3
## Merc 450SLC         15.2   8 275.8 180 3.07 3.780 18.00  0  0    3    3
## Cadillac Fleetwood  10.4   8 472.0 205 2.93 5.250 17.98  0  0    3    4
## Lincoln Continental 10.4   8 460.0 215 3.00 5.424 17.82  0  0    3    4
## Chrysler Imperial   14.7   8 440.0 230 3.23 5.345 17.42  0  0    3    4
## Dodge Challenger    15.5   8 318.0 150 2.76 3.520 16.87  0  0    3    2
## AMC Javelin         15.2   8 304.0 150 3.15 3.435 17.30  0  0    3    2
## Camaro Z28          13.3   8 350.0 245 3.73 3.840 15.41  0  0    3    4
## Pontiac Firebird    19.2   8 400.0 175 3.08 3.845 17.05  0  0    3    2
## Ford Pantera L      15.8   8 351.0 264 4.22 3.170 14.50  0  1    5    4
## Ferrari Dino        19.7   6 145.0 175 3.62 2.770 15.50  0  1    5    6
## Maserati Bora       15.0   8 301.0 335 3.54 3.570 14.60  0  1    5    8
## Mazda RX4           21.0   6 160.0 110 3.90 2.620 16.46  0  1    4    4
## Mazda RX4 Wag       21.0   6 160.0 110 3.90 2.875 17.02  0  1    4    4
## Datsun 710          22.8   4 108.0  93 3.85 2.320 18.61  1  1    4    1
## Hornet 4 Drive      21.4   6 258.0 110 3.08 3.215 19.44  1  0    3    1
## Merc 240D           24.4   4 146.7  62 3.69 3.190 20.00  1  0    4    2
## Merc 230            22.8   4 140.8  95 3.92 3.150 22.90  1  0    4    2
## Fiat 128            32.4   4  78.7  66 4.08 2.200 19.47  1  1    4    1
## Honda Civic         30.4   4  75.7  52 4.93 1.615 18.52  1  1    4    2
## Toyota Corolla      33.9   4  71.1  65 4.22 1.835 19.90  1  1    4    1
## Toyota Corona       21.5   4 120.1  97 3.70 2.465 20.01  1  0    3    1
## Fiat X1-9           27.3   4  79.0  66 4.08 1.935 18.90  1  1    4    1
## Porsche 914-2       26.0   4 120.3  91 4.43 2.140 16.70  0  1    5    2
## Lotus Europa        30.4   4  95.1 113 3.77 1.513 16.90  1  1    5    2
## Volvo 142E          21.4   4 121.0 109 4.11 2.780 18.60  1  1    4    2

1.1. Conclusion about histogram:

The most common mpg range for cars in this dataset is 15 to 20 mpg, as this bin has the highest frequency. The distribution appears to be right-skewed: most cars fall below 25 mpg, and a thinner tail of cars extends toward higher values. There is considerable variability in fuel efficiency, with values spread across several bins from below 15 mpg to over 30 mpg. Only a few cars are extremely fuel-efficient (above 30 mpg), suggesting that very high fuel efficiency is uncommon in this dataset.
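
The bin frequencies behind this reading can be checked without drawing the plot. The sketch below assumes the default breaks that `hist()` picks for this variable; `bins` is a name introduced here only for illustration.

```r
# Recompute the histogram bins without drawing the plot.
h <- hist(mtcars$mpg, plot = FALSE)

# One row per bin: lower edge, upper edge, number of cars.
bins <- data.frame(
  from  = head(h$breaks, -1),
  to    = tail(h$breaks, -1),
  count = h$counts
)
print(bins)

# The bin with the most cars.
bins[which.max(bins$count), ]
```

With the default breaks, the bins are 5 mpg wide and the 15 to 20 bin holds 12 of the 32 cars.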

1.2. Conclusion by comparing mean and median:

The mean mpg is 20.09, and the median mpg is 19.20, so the mean is slightly higher than the median. When the mean is higher than the median, it typically indicates that the distribution is skewed to the right, with outliers or an extended tail at the higher end of the mpg values. A few cars with very high mpg pull the mean upward, while the median is largely unaffected by them. Because the mean is only slightly higher than the median, the high mpg values (such as the maximum of 33.90) affect the mean but do not shift it dramatically away from the median.
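
As a rough numerical check of the direction of skew, the gap between mean and median can be computed next to a skewness coefficient. The sketch below uses a simple moment-based formula, which is only one of several common definitions and is an assumption of this example; `mpg` and `skew` are illustrative names.

```r
mpg <- mtcars$mpg

# Positive difference: the mean sits above the median.
mean(mpg) - median(mpg)

# Moment-based sample skewness (one of several common definitions).
# A value above 0 points to a longer right tail.
skew <- mean((mpg - mean(mpg))^3) / sd(mpg)^3
skew
```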

1.3. Conclusion about the relationship between mpg and hp:

There is a visible negative correlation between mpg and horsepower: as horsepower increases, mpg tends to decrease, which suggests that cars with higher horsepower are generally less fuel-efficient. Cars with lower horsepower (less than about 150 hp) show a wide range of fuel efficiency, from about 18 to over 30 mpg. At 150 hp and above the range narrows, and no car reaches 20 mpg. A few points stand apart from the main cloud: the two least efficient cars (Cadillac Fleetwood and Lincoln Continental, 10.4 mpg at 205 and 215 hp) and the Maserati Bora, which has by far the highest horsepower (335 hp) at 15.0 mpg. These may represent specific car models with unique characteristics.
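
The strength of the trend and the mpg range on each side of 150 hp can be checked directly. This is a minimal sketch; the 150 hp cutoff is taken from the text above, and `r`, `low_hp`, and `high_hp` are illustrative names.

```r
# Pearson correlation between horsepower and fuel efficiency.
r <- cor(mtcars$hp, mtcars$mpg)
r

# mpg range on either side of the 150 hp cutoff.
low_hp  <- mtcars$mpg[mtcars$hp < 150]
high_hp <- mtcars$mpg[mtcars$hp >= 150]
range(low_hp)
range(high_hp)
```

On the built-in data the correlation is about -0.78. Cars under 150 hp span 17.8 to 33.9 mpg, and cars at 150 hp or more span 10.4 to 19.7 mpg.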

1.4. Conclusion about box plot between mpg and cyl:
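
A conclusion for this plot needs mpg grouped by cylinder count. One possible sketch, assuming the formula interface of `boxplot()` and using group medians as the comparison (`med_by_cyl` is an illustrative name):

```r
# Box plot of mpg grouped by number of cylinders (formula interface).
boxplot(mpg ~ cyl, data = mtcars,
        xlab = "Number of cylinders", ylab = "Miles per gallon")

# Median mpg per cylinder group, the thick line in each box.
med_by_cyl <- tapply(mtcars$mpg, mtcars$cyl, median)
med_by_cyl
```

On the built-in data the medians are 26.0, 19.7, and 15.2 mpg for 4, 6, and 8 cylinders.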

# Part 2: Iris dataset

plot(iris$Sepal.Length, iris$Sepal.Width)

plot(iris$Species, iris$Petal.Length)

2.1. Data type of each column in the data set:
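
The column types can be read straight from the data frame. A minimal sketch, assuming the built-in `iris` data (`col_types` is an illustrative name):

```r
# Class of every column in the built-in iris data frame.
col_types <- sapply(iris, class)
col_types

# Compact overview: type, first values, and factor levels.
str(iris)
```

The four measurement columns are numeric, and `Species` is a factor with three levels.
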
2.2. Conclusion about plot between Sepal Width and Sepal Length:

The scatter plot shows only a weak relationship between sepal length and sepal width. Pooled across all three species the correlation is slightly negative (about -0.12), even though within each species longer sepals tend to be wider. Data points concentrate around the middle range of sepal lengths (5.5 to 6.5) and sepal widths (2.5 to 3.5), indicating that these are common sizes for the irises in the dataset. Most flowers cluster within this middle range for both dimensions, so the sepal dimensions, while variable, do not show extreme differences across observations.
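
How strong the relationship is, and in which direction, can be checked numerically for the pooled data and for each species separately. A minimal sketch (`pooled` and `by_species` are illustrative names):

```r
# Pooled correlation across all 150 flowers.
pooled <- cor(iris$Sepal.Length, iris$Sepal.Width)
pooled

# Correlation within each species.
by_species <- sapply(split(iris, iris$Species), function(d) {
  cor(d$Sepal.Length, d$Sepal.Width)
})
by_species
```

Comparing the two results shows whether the pooled trend is driven by differences between species rather than by a relationship within them.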

2.3. Conclusion about plot between Petal Length and Species:

Setosa: This species has the smallest variation in petal length, with a tight interquartile range and a median close to 1.5. There’s a notable outlier on the lower side.

Versicolor: Versicolor displays a wider range and a higher median, around 4.35, with a single outlier on the lower side.

Virginica: Shows the highest median, around 5.55, with a moderate spread similar to Versicolor but without noticeable outliers.

Conclusion: Setosa tends to have much shorter petals than Versicolor and Virginica. The latter two have overlapping ranges, but Virginica tends to have a higher median.
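
The medians and outliers read off the box plot can be checked against the data. A minimal sketch, assuming the default 1.5 × IQR rule used by `boxplot.stats()` (`med_petal` is an illustrative name):

```r
# Median petal length for each species.
med_petal <- tapply(iris$Petal.Length, iris$Species, median)
med_petal

# Outliers flagged by the default box plot rule, per species.
lapply(split(iris$Petal.Length, iris$Species),
       function(x) boxplot.stats(x)$out)
```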