I.

1.

# Load necessary libraries
library(datasets)
library(stargazer)
## 
## Please cite as:
##  Hlavac, Marek (2022). stargazer: Well-Formatted Regression and Summary Statistics Tables.
##  R package version 5.2.3. https://CRAN.R-project.org/package=stargazer
# Load the mtcars dataset
data(mtcars)

# View summary statistics of the dataset
summary(mtcars)
##       mpg             cyl             disp             hp       
##  Min.   :10.40   Min.   :4.000   Min.   : 71.1   Min.   : 52.0  
##  1st Qu.:15.43   1st Qu.:4.000   1st Qu.:120.8   1st Qu.: 96.5  
##  Median :19.20   Median :6.000   Median :196.3   Median :123.0  
##  Mean   :20.09   Mean   :6.188   Mean   :230.7   Mean   :146.7  
##  3rd Qu.:22.80   3rd Qu.:8.000   3rd Qu.:326.0   3rd Qu.:180.0  
##  Max.   :33.90   Max.   :8.000   Max.   :472.0   Max.   :335.0  
##       drat             wt             qsec             vs        
##  Min.   :2.760   Min.   :1.513   Min.   :14.50   Min.   :0.0000  
##  1st Qu.:3.080   1st Qu.:2.581   1st Qu.:16.89   1st Qu.:0.0000  
##  Median :3.695   Median :3.325   Median :17.71   Median :0.0000  
##  Mean   :3.597   Mean   :3.217   Mean   :17.85   Mean   :0.4375  
##  3rd Qu.:3.920   3rd Qu.:3.610   3rd Qu.:18.90   3rd Qu.:1.0000  
##  Max.   :4.930   Max.   :5.424   Max.   :22.90   Max.   :1.0000  
##        am              gear            carb      
##  Min.   :0.0000   Min.   :3.000   Min.   :1.000  
##  1st Qu.:0.0000   1st Qu.:3.000   1st Qu.:2.000  
##  Median :0.0000   Median :4.000   Median :2.000  
##  Mean   :0.4062   Mean   :3.688   Mean   :2.812  
##  3rd Qu.:1.0000   3rd Qu.:4.000   3rd Qu.:4.000  
##  Max.   :1.0000   Max.   :5.000   Max.   :8.000
data <- mtcars
# Run the regression model
model <- lm(mpg ~ hp + wt, data = data)

# View summary statistics of the model
summary(model)
## 
## Call:
## lm(formula = mpg ~ hp + wt, data = data)
## 
## Residuals:
##    Min     1Q Median     3Q    Max 
## -3.941 -1.600 -0.182  1.050  5.854 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept) 37.22727    1.59879  23.285  < 2e-16 ***
## hp          -0.03177    0.00903  -3.519  0.00145 ** 
## wt          -3.87783    0.63273  -6.129 1.12e-06 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 2.593 on 29 degrees of freedom
## Multiple R-squared:  0.8268, Adjusted R-squared:  0.8148 
## F-statistic: 69.21 on 2 and 29 DF,  p-value: 9.109e-12
# Presenting the regression result with stargazer
stargazer(model, type = "text")
## 
## ===============================================
##                         Dependent variable:    
##                     ---------------------------
##                                 mpg            
## -----------------------------------------------
## hp                           -0.032***         
##                               (0.009)          
##                                                
## wt                           -3.878***         
##                               (0.633)          
##                                                
## Constant                     37.227***         
##                               (1.599)          
##                                                
## -----------------------------------------------
## Observations                    32             
## R2                             0.827           
## Adjusted R2                    0.815           
## Residual Std. Error       2.593 (df = 29)      
## F Statistic           69.211*** (df = 2; 29)   
## ===============================================
## Note:               *p<0.1; **p<0.05; ***p<0.01

2.

  1. SIGN: The coefficient on hp is negative (-0.032): as horsepower increases, miles per gallon (mpg) decrease, which is expected since more powerful engines often consume more fuel. The coefficient on wt is also negative (-3.878): heavier cars tend to have lower fuel efficiency, which aligns with general expectations.

  2. MAGNITUDE: The magnitude of a coefficient gives the size of the estimated effect. The coefficient of -0.032 on hp implies that each additional horsepower lowers mpg by about 0.032, holding weight constant. The coefficient of -3.878 on wt implies that each additional unit of weight (1,000 lbs in mtcars) lowers mpg by about 3.9, holding horsepower constant. The practical significance of these magnitudes depends on the context and the units of measurement.

  3. STATISTICAL SIGNIFICANCE: The p-values for hp (0.00145) and wt (1.12e-06) are both below 0.01, so both coefficients are statistically significant at the 1% level.
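
The three readings above can be checked directly against the fitted object. The sketch below refits the same model so that it runs on its own; the name `fit` and the 100-horsepower rescaling are illustrative choices, not part of the original answer.

```r
# Refit the model from part 1 so the snippet is self-contained
fit <- lm(mpg ~ hp + wt, data = mtcars)

# Coefficient table: estimate, standard error, t value, p-value
coefs <- coef(summary(fit))

# SIGN: -1 means a negative slope
signs <- sign(coefs[c("hp", "wt"), "Estimate"])

# MAGNITUDE: rescale hp to a 100-horsepower change (illustrative choice of units)
effect_100hp <- 100 * coefs["hp", "Estimate"]

# STATISTICAL SIGNIFICANCE: p-values and 95% confidence intervals
p_values <- coefs[c("hp", "wt"), "Pr(>|t|)"]
ci <- confint(fit, level = 0.95)

print(signs)
print(effect_100hp)
print(p_values)
print(ci)
```

Rescaling can make a small-looking coefficient easier to judge: a 100-horsepower increase corresponds to roughly 3.2 fewer mpg at a fixed weight.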

3.

A Residual Std. Error of 2.593 indicates that the typical deviation of the observed mpg values from those predicted by the model is approximately 2.593 miles per gallon. This is small relative to the range of mpg (10.40 to 33.90) and its IQR (15.43 to 22.80), and it amounts to about 13% of the mean mpg (20.09). The model therefore predicts the dependent variable reasonably well, although not precisely.
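
The comparison between the RSE and the spread of mpg can be made explicit. This is a minimal sketch that refits the model under the hypothetical name `fit`; the choice of the mean, IQR, and standard deviation as yardsticks is an assumption about what counts as "small".

```r
# Refit the model from part 1 so the snippet is self-contained
fit <- lm(mpg ~ hp + wt, data = mtcars)

# Residual standard error, extracted and recomputed by hand
rse <- sigma(fit)
rse_manual <- sqrt(sum(residuals(fit)^2) / df.residual(fit))

# RSE relative to three measures of the scale of mpg
rel_mean <- rse / mean(mtcars$mpg)  # share of the average mpg
rel_iqr  <- rse / IQR(mtcars$mpg)   # share of the interquartile range
rel_sd   <- rse / sd(mtcars$mpg)    # share of the unconditional spread

print(c(rse = rse, rse_manual = rse_manual))
print(c(rel_mean = rel_mean, rel_iqr = rel_iqr, rel_sd = rel_sd))
```

A ratio to `sd(mtcars$mpg)` well below 1 means the model's errors are clearly smaller than those of a model that only predicts the mean.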

4.

# Draw the standard diagnostic plots for the fitted model
plot(model)

The Gauss-Markov assumptions include the following conditions:

  1. The relationship between the independent variables and the dependent variable should be linear.
  2. There should be no autocorrelation in the residuals.
  3. The variance of error terms should be constant across all levels of the independent variables.
  4. The independent variables should not be perfectly correlated with each other.

The assumptions do not all hold: the second condition (no autocorrelation in the residuals) is not satisfied.
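
The four conditions can also be examined numerically rather than only by eye. The following is a rough sketch using base R only; the specific checks (added squared terms for linearity, a hand-computed Durbin-Watson statistic, a correlation of absolute residuals with fitted values, and a variance inflation factor) are assumptions about how one might test each condition, not the method used in the answer above.

```r
# Refit the model from part 1 so the snippet is self-contained
fit <- lm(mpg ~ hp + wt, data = mtcars)
e <- residuals(fit)

# 1. Linearity: do squared terms improve the fit? A small p-value hints at curvature.
fit_sq <- update(fit, . ~ . + I(hp^2) + I(wt^2))
p_curvature <- anova(fit, fit_sq)[2, "Pr(>F)"]

# 2. Autocorrelation: Durbin-Watson statistic computed by hand.
#    It lies between 0 and 4; values near 2 suggest no first-order autocorrelation.
#    The statistic depends on the row order of the data.
dw <- sum(diff(e)^2) / sum(e^2)

# 3. Constant variance: a rough check, not a formal test.
#    A correlation far from 0 suggests the spread changes with the fitted value.
spread_cor <- cor(abs(e), fitted(fit))

# 4. No perfect collinearity: with two regressors the VIF is 1 / (1 - r^2).
r_hp_wt <- cor(mtcars$hp, mtcars$wt)
vif <- 1 / (1 - r_hp_wt^2)

print(c(p_curvature = p_curvature, dw = dw,
        spread_cor = spread_cor, r_hp_wt = r_hp_wt, vif = vif))
```

These numbers complement the diagnostic plots; none of them alone settles whether an assumption holds.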

5.

Under the Gauss-Markov assumptions, the OLS estimators of the coefficients in a linear regression model are the best linear unbiased estimators (BLUE): they have the lowest variance among all linear unbiased estimators.
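
A small simulation can illustrate the "lowest variance" claim. The sketch below compares the OLS slope with one other linear unbiased estimator, the slope through the first and last observations. The data-generating process, the seed, and the comparison estimator are all made up for illustration.

```r
set.seed(1)  # arbitrary seed, only for reproducibility

x <- 1:20         # fixed regressor values
n_rep <- 2000     # number of simulated samples
true_slope <- 2   # hypothetical true coefficient

ols <- numeric(n_rep)
endpoint <- numeric(n_rep)

for (i in seq_len(n_rep)) {
  # Homoskedastic, uncorrelated errors, as the Gauss-Markov assumptions require
  y <- 1 + true_slope * x + rnorm(length(x), sd = 3)

  # OLS slope
  ols[i] <- coef(lm(y ~ x))[["x"]]

  # Alternative estimator: slope of the line through the two end points.
  # It is linear in y and unbiased, but ignores the other 18 observations.
  endpoint[i] <- (y[20] - y[1]) / (x[20] - x[1])
}

# Both averages should be close to the true slope (unbiasedness)
print(c(mean_ols = mean(ols), mean_endpoint = mean(endpoint)))

# The OLS variance should be the smaller of the two (efficiency)
print(c(var_ols = var(ols), var_endpoint = var(endpoint)))
```

Both estimators center on the true slope, but the OLS estimates are more tightly clustered, which is what "best" means in BLUE.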

II.

  1. The first reason is to handle skewed data. Many real-world datasets contain variables that are not normally distributed but heavily skewed, often right-skewed.
  2. The second reason is interpretability: the regression coefficients become easier to read, which simplifies the description of otherwise complex relationships.
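
Both reasons can be illustrated on mtcars, assuming the transformation in question is the natural logarithm (the answer above does not name it). The helper `skewness()` and the log-log model are hypothetical illustrations, not part of the original analysis.

```r
# Sample skewness (third standardized moment), written out by hand
skewness <- function(v) {
  d <- v - mean(v)
  mean(d^3) / mean(d^2)^1.5
}

# Reason 1: the log pulls in the long right tail of hp
skew_raw <- skewness(mtcars$hp)
skew_log <- skewness(log(mtcars$hp))

# Reason 2: in a log-log model the slope reads as an elasticity,
# i.e. the approximate percent change in mpg for a 1% change in hp
fit_log <- lm(log(mpg) ~ log(hp), data = mtcars)
elasticity <- coef(fit_log)[["log(hp)"]]

print(c(skew_raw = skew_raw, skew_log = skew_log))
print(elasticity)
```

A skewness closer to zero after the transformation indicates a more symmetric distribution.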