library(ggplot2)
Warning: package 'ggplot2' was built under R version 4.2.3
ggplot(mtcars, aes(factor(carb), fill = factor(cyl))) + geom_bar()
ggplot(mtcars, aes(x = mpg, y = hp)) +
  geom_point() +
  xlab("Miles per Gallon") +
  ylab("Horsepower") +
  theme_minimal()
model <- lm(mpg ~ hp, data = mtcars)
summary(model)

Call:
lm(formula = mpg ~ hp, data = mtcars)

Residuals:
    Min      1Q  Median      3Q     Max
-5.7121 -2.1122 -0.8854  1.5819  8.2360

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept) 30.09886    1.63392  18.421  < 2e-16 ***
hp          -0.06823    0.01012  -6.742 1.79e-07 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 3.863 on 30 degrees of freedom
Multiple R-squared:  0.6024, Adjusted R-squared:  0.5892
F-statistic: 45.46 on 1 and 30 DF,  p-value: 1.788e-07

Based on this model summary, the coefficient for hp is statistically significant (p = 1.79e-07, well below 0.001), indicating a significant linear association between horsepower and miles per gallon. The negative coefficient (-0.06823) means that each additional unit of horsepower is associated with a decrease of about 0.068 mpg on average. The model explains roughly 60% of the variance in mpg (multiple R-squared = 0.6024).
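To make the slope interpretation concrete, the following is a minimal sketch that refits the same model, pulls out the hp coefficient with its 95% confidence interval, and compares predictions for two cars that differ by 100 hp. The values 100 and 200 hp and the names `new_cars`, `pred`, and `gap` are illustrative choices, not part of the original analysis.

```r
# Refit the simple regression from above (mtcars ships with base R).
model <- lm(mpg ~ hp, data = mtcars)

# Slope for hp and its 95% confidence interval.
slope <- coef(model)["hp"]
slope_ci <- confint(model, "hp", level = 0.95)

# Predicted mpg for two hypothetical cars that differ by 100 hp
# (illustrative values, chosen to lie inside the observed hp range).
new_cars <- data.frame(hp = c(100, 200))
pred <- predict(model, newdata = new_cars)

# Because the model is linear, the gap equals 100 times the slope.
gap <- unname(pred[2] - pred[1])

print(slope)
print(slope_ci)
print(pred)
print(gap)
```

Reading the effect per 100 hp rather than per single unit often gives a more intuitive sense of scale, since a one-unit change in horsepower is tiny relative to the spread in the data.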
par(mfrow = c(1,2))
plot(model)
library(performance)
Warning: package 'performance' was built under R version 4.2.3
library(see)
Warning: package 'see' was built under R version 4.2.3
library(patchwork)
Warning: package 'patchwork' was built under R version 4.2.3
theme_set(theme_classic(base_size = 2))
check_model(model)
Not enough model terms in the conditional part of the model to check for
multicollinearity.
check_outliers(model)
1 outlier detected: case 31.
- Based on the following method and threshold: cook (0.709).
- For variable: (Whole model).
check_normality(model)
Warning: Non-normality of residuals detected (p = 0.022).
check_distribution(model)
# Distribution of Model Family

Predicted Distribution of Residuals

 Distribution Probability
       normal         53%
       cauchy         38%
      tweedie          6%

Predicted Distribution of Response

  Distribution Probability
       tweedie         44%
           chi         28%
 beta-binomial         16%

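The outlier and normality checks reported above can be approximated with base R alone. The sketch below computes Cook's distance and a Shapiro-Wilk test on the standardized residuals. Two assumptions are worth flagging: the cutoff is taken to be the median of an F distribution with p and n - p degrees of freedom (which reproduces the 0.709 printed above, but is not confirmed here as the package's exact rule), and the Shapiro-Wilk input may differ from what `check_normality()` uses internally, so the p-value need not match exactly.

```r
# Same simple regression as above.
model <- lm(mpg ~ hp, data = mtcars)

n <- nrow(mtcars)
p <- length(coef(model))          # number of estimated coefficients (2)

# Cook's distance: influence of each observation on the fitted coefficients.
cooks <- cooks.distance(model)

# Assumed cutoff: median of F(p, n - p). This matches the printed 0.709.
threshold <- qf(0.5, df1 = p, df2 = n - p)

# Observation with the largest influence, and all cases above the cutoff.
most_influential <- names(which.max(cooks))
flagged <- names(cooks)[cooks > threshold]

# Shapiro-Wilk test on standardized residuals; a small p-value suggests
# the residuals deviate from normality.
sw <- shapiro.test(rstandard(model))

print(round(threshold, 3))
print(most_influential)
print(flagged)
print(sw)
```

With only 32 observations, a single high-leverage car can noticeably shift the fitted line, so it is worth refitting without the flagged case to see how much the slope changes before drawing conclusions.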
model_1 <- lm(mpg ~ hp + drat, data = mtcars)
summary(model_1)

Call:
lm(formula = mpg ~ hp + drat, data = mtcars)

Residuals:
    Min      1Q  Median      3Q     Max
-5.0369 -2.3487 -0.6034  1.1897  7.7500

Coefficients:
             Estimate Std. Error t value Pr(>|t|)
(Intercept) 10.789861   5.077752   2.125 0.042238 *
hp          -0.051787   0.009293  -5.573 5.17e-06 ***
drat         4.698158   1.191633   3.943 0.000467 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 3.17 on 29 degrees of freedom
Multiple R-squared:  0.7412, Adjusted R-squared:  0.7233
F-statistic: 41.52 on 2 and 29 DF,  p-value: 3.081e-09

Residuals: These are the differences between the observed mpg values and the predicted values from the model. The summary provides statistics such as the minimum, 1st quartile, median, 3rd quartile, and maximum values of the residuals.
Coefficients: The coefficients section presents the estimates, standard errors, t-values, and p-values for each predictor variable, as well as the intercept. Here’s the interpretation of the coefficients:
Intercept: The estimated intercept is 10.789861, the expected value of mpg when both hp and drat are zero. No car in the data has zero horsepower, so this is an extrapolation rather than a practically meaningful quantity. The p-value (0.042238) indicates that the intercept differs significantly from zero at the 0.05 level.
hp: The estimated coefficient is -0.051787, indicating that, holding drat constant, each one-unit increase in horsepower (hp) is associated with an average decrease of 0.051787 in predicted mpg. The low p-value (5.17e-06) indicates that this coefficient is statistically significant.
drat: The estimated coefficient is 4.698158, indicating that, holding hp constant, each one-unit increase in the rear axle ratio (drat) is associated with an average increase of 4.698158 in predicted mpg. The low p-value (0.000467) indicates that this coefficient is statistically significant.
Residual standard error: This is the estimated standard deviation of the residuals (3.17 on 29 degrees of freedom). It indicates the typical distance between the observed mpg values and the values predicted by the model.
Multiple R-squared and Adjusted R-squared: These values measure the goodness of fit of the model. The multiple R-squared is the proportion of variance in the response variable (mpg) explained by the predictor variables (hp and drat). The adjusted R-squared penalizes this value for the number of predictors relative to the sample size. Here the multiple R-squared is 0.7412, so approximately 74.12% of the variability in mpg is explained by the predictors, up from 0.6024 in the hp-only model. The adjusted R-squared is 0.7233.
F-statistic and p-value: The F-statistic tests the overall significance of the model, comparing the variance explained by the model to the residual variance. The low p-value (3.081e-09) indicates that the model as a whole is statistically significant, suggesting that at least one of the predictor variables is significantly associated with the mpg values.
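The quantities described above can be recomputed from the residuals using the standard textbook formulas, which makes it easier to see what each one measures. The sketch below is a cross-check rather than the way `summary()` is implemented internally. The reference car (hp = 150, drat of 3 versus 4) and the variable names are illustrative assumptions.

```r
model_1 <- lm(mpg ~ hp + drat, data = mtcars)

n <- nrow(mtcars)                      # 32 observations
p <- length(coef(model_1)) - 1         # number of predictors (2)
df_resid <- n - p - 1                  # residual degrees of freedom (29)

# Residual and total sums of squares.
rss <- sum(residuals(model_1)^2)
tss <- sum((mtcars$mpg - mean(mtcars$mpg))^2)

# Residual standard error: estimated standard deviation of the residuals.
rse <- sqrt(rss / df_resid)

# Multiple and adjusted R-squared.
r2 <- 1 - rss / tss
adj_r2 <- 1 - (1 - r2) * (n - 1) / df_resid

# Overall F-statistic and its p-value.
f_stat <- ((tss - rss) / p) / (rss / df_resid)
f_pval <- pf(f_stat, p, df_resid, lower.tail = FALSE)

# "Holding hp constant": two hypothetical cars with the same hp whose
# drat differs by one unit. The prediction gap equals the drat coefficient.
same_hp <- data.frame(hp = c(150, 150), drat = c(3, 4))
drat_effect <- unname(diff(predict(model_1, newdata = same_hp)))

print(round(c(rse = rse, r2 = r2, adj_r2 = adj_r2, f_stat = f_stat), 4))
print(f_pval)
print(drat_effect)
```

The adjusted R-squared is always at most the multiple R-squared. The gap between the two grows as predictors are added without a matching improvement in fit.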
check_outliers(model_1)
OK: No outliers detected.
- Based on the following method and threshold: cook (0.808).
- For variable: (Whole model)
check_collinearity(model_1)
# Check for Multicollinearity

Low Correlation

 Term  VIF   VIF 95% CI Increased SE Tolerance Tolerance 95% CI
   hp 1.25 [1.04, 2.42]         1.12      0.80     [0.41, 0.96]
 drat 1.25 [1.04, 2.42]         1.12      0.80     [0.41, 0.96]

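With exactly two predictors, the variance inflation factor reduces to a function of their pairwise correlation, which explains why hp and drat share the same VIF in the table above. The sketch below recomputes the point estimates by hand. It does not reproduce the confidence intervals, and the variable names are illustrative.

```r
# Pairwise correlation between the two predictors.
r <- cor(mtcars$hp, mtcars$drat)

# With two predictors, the R-squared from regressing one on the other
# is r^2, so both predictors get the same VIF.
vif <- 1 / (1 - r^2)

# Tolerance is the reciprocal of the VIF.
tolerance <- 1 / vif

# Factor by which the coefficient's standard error is inflated
# relative to uncorrelated predictors.
increased_se <- sqrt(vif)

print(round(c(r = r, vif = vif, tolerance = tolerance,
              increased_se = increased_se), 2))
```

A VIF of about 1.25 means the variance of each coefficient estimate is about 25% larger than it would be if hp and drat were uncorrelated, which is why the check reports low correlation.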
check_distribution(model_1)
# Distribution of Model Family

Predicted Distribution of Residuals

 Distribution Probability
       normal         50%
       cauchy         41%
            F          3%

Predicted Distribution of Response

  Distribution Probability
       tweedie         44%
           chi         28%
 beta-binomial         16%

check_normality(model_1)
Warning: Non-normality of residuals detected (p = 0.024).
check_predictions(model_1)
check_posterior_predictions(model_1)
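The idea behind a predictive check can be sketched in base R: simulate new response vectors from the fitted model and compare their distribution with the observed mpg values. This is a rough approximation that assumes `simulate()` on the fitted `lm` is an acceptable stand-in. It is not claimed to match what `check_predictions()` does internally, and the seed and the number of simulations (50) are arbitrary choices.

```r
model_1 <- lm(mpg ~ hp + drat, data = mtcars)

# Simulate 50 replicated response vectors from the fitted model.
# simulate() returns a data frame with one column per replicate.
set.seed(123)
sims <- simulate(model_1, nsim = 50)

# Overlay the density of each simulated data set (grey) on the
# density of the observed response (black).
plot(density(mtcars$mpg), lwd = 2,
     main = "Observed vs. simulated mpg", xlab = "mpg")
for (j in seq_len(ncol(sims))) {
  lines(density(sims[[j]]), col = "grey70")
}
lines(density(mtcars$mpg), lwd = 2)
```

If the black curve falls well outside the band of grey curves, the model is generating data that does not resemble the observations, which may point to a missing predictor or an unsuitable error distribution.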