Question 1

data(mtcars)
?mtcars
summary(mtcars)
str(mtcars)

There are no obvious errors or discrepancies, but I noticed a couple of interesting aspects of the data. First, horsepower (hp) has a large range (min = 52, max = 335), with potentially some outliers on the high end. Another variable with a noticeable spread is weight (wt): the minimum is 1,513 lbs and the maximum is 5,424 lbs. The most curious variable is cylinders (cyl). It takes only 3 values (4, 6, and 8), like a categorical variable, but R stores it as numeric.
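The point about cylinders can be checked directly. The following is a minimal sketch: it inspects how R stores cyl and, as one possible approach, converts it to a factor in a copy of the data (the name cars is hypothetical and not part of the assignment).

```r
# cyl is stored as a numeric column
cyl_class <- class(mtcars$cyl)
cyl_class

# ...but it only takes a handful of distinct values
cyl_counts <- table(mtcars$cyl)
cyl_counts

# Treat it as categorical in a copy, leaving mtcars itself unchanged
cars <- mtcars
cars$cyl <- factor(cars$cyl)
levels(cars$cyl)
```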

Question 2

library(ggplot2)
ggplot(mtcars, aes(x = wt, y = mpg)) +
  geom_point() +
  geom_smooth(method = "lm", se = FALSE) +
  labs(title = "MPG vs Weight", x = "Weight (1000 lbs)", y = "Miles per Gallon")

The scatterplot shows a clear negative linear relationship between vehicle weight and miles per gallon. As weight increases, fuel efficiency decreases. The points follow a fairly strong downward trend, indicating that weight is an important predictor of fuel efficiency.
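To put a number on the trend seen in the plot, one option is to compute the Pearson correlation and fit the same straight line that geom_smooth(method = "lm") draws. This is only a sketch; the object names r_wt_mpg and fit are my own.

```r
# Pearson correlation between weight and mpg
# (negative when mpg falls as weight rises)
r_wt_mpg <- cor(mtcars$wt, mtcars$mpg)
r_wt_mpg

# Simple linear regression of mpg on weight; the wt coefficient is the
# estimated change in mpg for each additional 1000 lbs
fit <- lm(mpg ~ wt, data = mtcars)
coef(fit)
```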

Question 3

ggplot(mtcars, aes(x = factor(cyl), y = mpg)) +
  geom_boxplot() +
  labs(title = "MPG by Number of Cylinders", x = "Cylinders", y = "Miles per Gallon")

The boxplot shows a clear relationship between the number of cylinders in an engine and fuel efficiency. Vehicles with 4 cylinders have the highest median mpg, followed by 6-cylinder vehicles, and 8-cylinder vehicles have the lowest. This indicates a strong negative association between engine size and fuel economy: more cylinders means fewer miles per gallon.
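The medians read off the boxplot can also be computed directly. A minimal sketch, using the hypothetical name median_mpg_by_cyl:

```r
# Median mpg within each cylinder group (groups are ordered 4, 6, 8)
median_mpg_by_cyl <- tapply(mtcars$mpg, mtcars$cyl, median)
median_mpg_by_cyl
```

If the medians fall as the cylinder count rises, that matches the ordering described above.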

Question 4

cor_matrix <- cor(mtcars)
cor_matrix
sort(cor_matrix[, "mpg"])

No missing values
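One way to verify the absence of missing values is to count the NA entries in each column. This is a sketch, with na_per_column as a hypothetical name:

```r
# Number of NA entries in each column; all zeros means nothing is missing
na_per_column <- colSums(is.na(mtcars))
na_per_column

# Single TRUE/FALSE answer for the whole data frame
anyNA(mtcars)
```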

Question 6

boxplot(mtcars, main = "Boxplots of mtcars Variables")

There are noticeable outliers. The first is in the horsepower variable (hp) at 335 hp, a high-end outlier. Another variable with a potential outlier is displacement (disp), with a high value around 470, likely representing a large-engine vehicle.
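Because the variables are on very different scales, outliers can be hard to read off a single combined plot. As a cross-check, base R's boxplot.stats() reports the values that a boxplot draws as individual points (those beyond 1.5 times the box length from the hinges). This is a sketch; the name flagged is my own.

```r
# For each variable, collect the values a boxplot would draw as outliers
flagged <- lapply(mtcars, function(x) boxplot.stats(x)$out)

# Show only the variables with at least one flagged value
flagged[lengths(flagged) > 0]
```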

Question 7

min(mtcars$hp)
max(mtcars$hp)
mtcars$hp_rs <- (mtcars$hp - min(mtcars$hp)) / (max(mtcars$hp) - min(mtcars$hp))
min(mtcars$hp_rs)
max(mtcars$hp_rs)

I applied range standardization to the variable hp using the formula (x − min) / (max − min). The new variable hp_rs was added to the mtcars data frame. The minimum value of hp_rs is 0 and the maximum value is 1, confirming that the transformation was applied correctly.
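The same formula can be wrapped in a small reusable function. This is a sketch rather than part of the assignment: range_standardize and hp_rs_check are hypothetical names, and the function assumes the input is not constant (otherwise max − min is zero).

```r
# Rescale a numeric vector to [0, 1] using (x - min) / (max - min)
range_standardize <- function(x) {
  rng <- range(x, na.rm = TRUE)  # rng[1] is the min, rng[2] is the max
  (x - rng[1]) / (rng[2] - rng[1])
}

hp_rs_check <- range_standardize(mtcars$hp)
range(hp_rs_check)
```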

Question 8

lower <- quantile(mtcars$wt, 0.05)
upper <- quantile(mtcars$wt, 0.95)

lower
upper

mtcars$wt_win <- ifelse(mtcars$wt < lower, lower, ifelse(mtcars$wt > upper, upper, mtcars$wt))
min(mtcars$wt_win)
max(mtcars$wt_win)

I winsorized the variable wt at the 5th and 95th percentiles by replacing values below the 5th percentile with the 5th percentile value and values above the 95th percentile with the 95th percentile value. The new variable wt_win was created in the mtcars dataset. The minimum of wt_win equals the 5th percentile value, and the maximum equals the 95th percentile value, confirming that the extreme values were capped.
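An equivalent way to write the capping step uses pmax() and pmin() instead of nested ifelse() calls. This is a sketch under the same 5th/95th percentile choice; winsorize and wt_win_check are hypothetical names.

```r
# Cap values below the lower quantile and above the upper quantile
winsorize <- function(x, probs = c(0.05, 0.95)) {
  limits <- quantile(x, probs = probs, na.rm = TRUE, names = FALSE)
  # pmax raises low values to the lower limit, pmin lowers high values
  pmin(pmax(x, limits[1]), limits[2])
}

wt_win_check <- winsorize(mtcars$wt)
range(wt_win_check)

# How many observations were changed by the capping
sum(wt_win_check != mtcars$wt)
```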