library(dplyr)
library(dslabs)
library(ISLR2)
library(matlib)
library(wooldridge)

Chapter 2

C9: COUNTY MURDERS

data("countymurders")
data1 <- countymurders
attach(data1)
data1 <- data1 %>% filter(year=="1996")
head(data1)
##   arrests countyid  density  popul perc1019 perc2029 percblack percmale
## 1       8     1001 67.21535  40061 15.89077 13.17491 20.975510 48.70073
## 2       6     1003 77.05643 123023 13.93886 11.63929 13.496660 48.83233
## 3       1     1005 29.91548  26475 15.06327 13.69972 46.190750 49.15203
## 4       0     1009 67.20457  43392 14.17542 12.99318  1.415007 48.97446
## 5       1     1011 17.89899  11188 14.98927 14.13121 72.756520 49.91956
## 6       2     1013 27.71148  21530 15.68509 11.25871 41.384110 46.81839
##   rpcincmaint rpcpersinc rpcunemins year murders  murdrate arrestrate statefips
## 1     192.038  11852.760     26.796 1996       7 1.7473350  1.9969550         1
## 2     139.084  13583.020     28.710 1996       6 0.4877137  0.4877137         1
## 3     405.768  10760.510     63.162 1996       1 0.3777148  0.3777148         1
## 4     184.382  11094.820     21.692 1996       2 0.4609145  0.0000000         1
## 5     485.518   8349.506     63.162 1996       0 0.0000000  0.8938148         1
## 6     357.918   9947.058     54.868 1996       2 0.9289364  0.9289364         1
##   countyfips execs    lpopul execrate
## 1          1     0 10.598160        0
## 2          3     0 11.720130        0
## 3          5     0 10.183960        0
## 4          9     0 10.678030        0
## 5         11     0  9.322598        0
## 6         13     0  9.977202        0
nrow(data1[data1$murders==0,])
## [1] 1051
nrow(data1[data1$execs>=1,])
## [1] 31
max(execs)
## [1] 7
  1. In 1996, there are 1051 counties having no murders, 31 counties have at least one execution, and the maximum number for execution is 7.
model_c9 <- lm(murders ~ execs)
summary(model_c9)
## 
## Call:
## lm(formula = murders ~ execs)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -202.23   -6.84   -5.84   -3.84 1937.16 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept)   6.8382     0.2418   28.28   <2e-16 ***
## execs        65.4650     2.1463   30.50   <2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 46.64 on 37347 degrees of freedom
## Multiple R-squared:  0.02431,    Adjusted R-squared:  0.02428 
## F-statistic: 930.4 on 1 and 37347 DF,  p-value: < 2.2e-16
  1. murders = 6.8382 + 65.4650*execs N = 2197, R-squared = 0.0243
  2. Coefficient slope B1 = 65.4650, interpret that one unit increasing in execution will lead to 65.465 unit increasing in number of murders.
  3. Based on the fomular above, the smallest number of murders is equal to the constrant B0 (6.8382) when the value of execution = 0.
  4. A simple regression model is not well-suited for analyzing the deterrent effect of capital punishment based on county murder data in 1996 due to issues such as endogeneity, omitted variable bias, heterogeneity across counties, and the potential for nonlinear relationships. Advanced econometric techniques, accounting for these complexities, are more appropriate for a nuanced understanding of the relationship between capital punishment and murder rates.

CHAPTER 3

c5

  1. No, because study = 168 - sleep - work - leisure.
  2. According to assumption MLR 3, in the sample (and therefore in the population), none of the independent variables is constant and there are no exact relationships among the independent variables. Since the sum of all independent variables is 168, there is an exect linear relationship between them. This violates the assumption MLR.3
  3. Since student age is under 18 year-old, they are almost not working yet, therefore, we can eliminate the working variable out of the formula since its effect is logically insignificant.

C10 : With y is dependent variable, x1, x2, x3 are independent variables, we have:

*y = a0 + a1x1 + e1*
*y = b0 + b1x1 + b2x2 + b3x3 + e2*
  1. We have the a1 = b1 + cov[(x1,x2); (x1; x3)]. There for, if x1 is highly related to x2 and x3, we will have the value of a1 is much greater than b1
  2. If x1 is uncorrelated with x2, x3 => the covariance between (x1,x2) and between (x1,x3) is insignificant, then a1 and b1 are the same.
  3. b1 will be smaller
  4. Both of those value are the same

C8: DISCRIM

data("discrim")
data2 <- discrim
attach(discrim)
mean(prpblck, na.rm = T)
## [1] 0.1134864
mean(income, na.rm = T)
## [1] 47053.78
sd(prpblck, na.rm = T)
## [1] 0.1824165
sd(income, na.rm = T)
## [1] 13179.29
  1. The averages for “prpblck” and “income” are 0.113 and 47,053, respectively. The standard deviations are likewise, 0.1824 and 13,179.29, respectively. It is apparent that prpblck represents a proportion of the black population, while income is represented in dollar terms.
model_c8 <- lm(psoda~prpblck+income, data = discrim)
summary(model_c8)
## 
## Call:
## lm(formula = psoda ~ prpblck + income, data = discrim)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -0.29401 -0.05242  0.00333  0.04231  0.44322 
## 
## Coefficients:
##              Estimate Std. Error t value Pr(>|t|)    
## (Intercept) 9.563e-01  1.899e-02  50.354  < 2e-16 ***
## prpblck     1.150e-01  2.600e-02   4.423 1.26e-05 ***
## income      1.603e-06  3.618e-07   4.430 1.22e-05 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.08611 on 398 degrees of freedom
##   (9 observations deleted due to missingness)
## Multiple R-squared:  0.06422,    Adjusted R-squared:  0.05952 
## F-statistic: 13.66 on 2 and 398 DF,  p-value: 1.835e-06
  1. The resulting regression is psoda.hat = (0.956) + (0.115)prpblck + (0.0000016). The optimal sample size is 399 observations (indicated by the 398 degrees of freedom and 9 missing observations) and the adjusted R^2 is 0.595. The coefficient on prpblck indicates that, all things being equal, if prpblck increases by 10% the price of soda will increase by approximately 1.2 cents, which is not economically significant.
model_c8_1 <- lm(psoda~prpblck, data = discrim)
summary(model_c8_1)
## 
## Call:
## lm(formula = psoda ~ prpblck, data = discrim)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -0.30884 -0.05963  0.01135  0.03206  0.44840 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept)  1.03740    0.00519  199.87  < 2e-16 ***
## prpblck      0.06493    0.02396    2.71  0.00702 ** 
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.0881 on 399 degrees of freedom
##   (9 observations deleted due to missingness)
## Multiple R-squared:  0.01808,    Adjusted R-squared:  0.01561 
## F-statistic: 7.345 on 1 and 399 DF,  p-value: 0.007015
  1. The estimate of the coefficient on prpblack with the simple regression is 0.065. This is lower than the prior estimate, and therefore shows that the discrimination effect decreases when income is excluded.
model_c8_2 <- lm(log(psoda)~prpblck+log(income), data = discrim)
summary(model_c8_2)
## 
## Call:
## lm(formula = log(psoda) ~ prpblck + log(income), data = discrim)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -0.33563 -0.04695  0.00658  0.04334  0.35413 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept) -0.79377    0.17943  -4.424 1.25e-05 ***
## prpblck      0.12158    0.02575   4.722 3.24e-06 ***
## log(income)  0.07651    0.01660   4.610 5.43e-06 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.0821 on 398 degrees of freedom
##   (9 observations deleted due to missingness)
## Multiple R-squared:  0.06809,    Adjusted R-squared:  0.06341 
## F-statistic: 14.54 on 2 and 398 DF,  p-value: 8.039e-07
paste( (0.2*100)*0.122, "percent increase")
## [1] "2.44 percent increase"
model_c8_3 <- lm(log(psoda)~prpblck+log(income)+prppov)
model_c8_3
## 
## Call:
## lm(formula = log(psoda) ~ prpblck + log(income) + prppov)
## 
## Coefficients:
## (Intercept)      prpblck  log(income)       prppov  
##    -1.46333      0.07281      0.13696      0.38036

Adding prppov causes the prpblck coefficient to fall to 0.0738.

cor(log(discrim$income), discrim$prppov, method = "pearson", use = "complete.obs")
## [1] -0.838467
  1. The correlation is approximately -0.838. This makes sense, because one would expect that declines in income would result in higher poverty rates.

CHAPTER 4

C3: RDINTENS

data("rdchem")
attach(rdchem)
paste(log(1.1)*0.321*100, "percentages change in rdintens ")
## [1] "3.05945677171883 percentages change in rdintens "
  1. If log(sales) increases for 1 unit, rdintens increases for 0.321 unit. If sales increases for 10%, rdintens will increases for 3%, not a big change.
model_c3 <- lm(rdintens ~ log(sales) + profmarg)
summary(model_c3)
## 
## Call:
## lm(formula = rdintens ~ log(sales) + profmarg)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -2.3016 -1.2707 -0.6895  0.8785  6.0369 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)
## (Intercept)  0.47225    1.67606   0.282    0.780
## log(sales)   0.32135    0.21557   1.491    0.147
## profmarg     0.05004    0.04578   1.093    0.283
## 
## Residual standard error: 1.839 on 29 degrees of freedom
## Multiple R-squared:  0.09847,    Adjusted R-squared:  0.0363 
## F-statistic: 1.584 on 2 and 29 DF,  p-value: 0.2224
  1. The 5% critical value for a one-tailedtest, with df = 32 − 3 = 29, is obtained from Table G.2 as 1.699; so we cannot reject H0 at the 5% level. But the 10% critical value is 1.311; since the t statistic is abovethis value, we reject H0 in favor of H1 at the 10% level.

  2. One percentage unit change in profmarg interprets 0.05 percentage unit change in rdintens. Economically that is quite significant, as given a10 % increase in profit margin then they will increase expenditures on R& D by 0.5 percentage point.

###C8: 401 KSUBS

data("k401ksubs")
data4 <- k401ksubs %>% filter(fsize==1)
dim(data4)
## [1] 2017   11
attach(data4)
  1. There are 2017 single-person households in the dataframe
model_c4 <- lm(nettfa ~ inc+ age)
summary(model_c4)
## 
## Call:
## lm(formula = nettfa ~ inc + age)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -179.95  -14.16   -3.42    6.03 1113.94 
## 
## Coefficients:
##              Estimate Std. Error t value Pr(>|t|)    
## (Intercept) -43.03981    4.08039 -10.548   <2e-16 ***
## inc           0.79932    0.05973  13.382   <2e-16 ***
## age           0.84266    0.09202   9.158   <2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 44.68 on 2014 degrees of freedom
## Multiple R-squared:  0.1193, Adjusted R-squared:  0.1185 
## F-statistic: 136.5 on 2 and 2014 DF,  p-value: < 2.2e-16
min(age)
## [1] 25
min(inc)
## [1] 10.008
  1. nettfa = -43.04 + 0.799inc + 0.842age (4.08) (.0597) (.0920) n = 2017, R^2 = .1185 One unit increasing in inc interprets .07993 unit increasing in nettfa One unit increasing in age interprets .84266 unit increasing in nettfa The model is significant, based on when age goes up, the income goes up and lead to the net financial wealth also go up.
income <- lm(inc ~ age)
income
## 
## Call:
## lm(formula = inc ~ age)
## 
## Coefficients:
## (Intercept)          age  
##    27.08278      0.06017
cov(log(inc), age)
## [1] 0.1907963
  1. The correlation between age and income is very high, which leads to the perfect colinearity

CHAPTER 5

5: ECONMATH

  1. The probability that a score exceeds 100, according to a normal distribution, wouldn’t be zero. This contradicts the assumption of a normal distribution where the probability of a specific point value in a continuous distribution is technically zero. The normal distribution’s nature is continuous and infinite, so the probability of any single, exact value is infinitesimally small.
  2. In the left tail of a histogram, the normal distribution might not fit well. The normal distribution assumes perfect symmetry and tails that extend infinitely, while real data might have asymmetry or limited tail extensions. Therefore, in the left tail, the normal distribution might not represent the actual data distribution accurately, especially if there’s skewness or outliers present

C1

data("wage1")
attach(wage1)
model_wage <- lm( wage ~ educ + exper + tenure )
summary(model_wage)
## 
## Call:
## lm(formula = wage ~ educ + exper + tenure)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -7.6068 -1.7747 -0.6279  1.1969 14.6536 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept) -2.87273    0.72896  -3.941 9.22e-05 ***
## educ         0.59897    0.05128  11.679  < 2e-16 ***
## exper        0.02234    0.01206   1.853   0.0645 .  
## tenure       0.16927    0.02164   7.820 2.93e-14 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 3.084 on 522 degrees of freedom
## Multiple R-squared:  0.3064, Adjusted R-squared:  0.3024 
## F-statistic: 76.87 on 3 and 522 DF,  p-value: < 2.2e-16
hist(model_wage$residuals)

  1. wage = −2.87 + .599 educ + .022 exper + .169 tenure (0.73) (.051) (.012) (.022) n = 526, R^2 = .306, σ = 3.085. The 526 residual histogram, with i = 1, 2,…, 526, is shown above. The formula in the Stata manual suggests using 27 bins for 526 observations, which is what the histogram does. The normal distribution that fits the histogram the best is also presented for comparison.
model_lg_wage <- lm(log(wage)  ~ educ + exper + tenure)
summary(model_lg_wage)
## 
## Call:
## lm(formula = log(wage) ~ educ + exper + tenure)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -2.05802 -0.29645 -0.03265  0.28788  1.42809 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept) 0.284360   0.104190   2.729  0.00656 ** 
## educ        0.092029   0.007330  12.555  < 2e-16 ***
## exper       0.004121   0.001723   2.391  0.01714 *  
## tenure      0.022067   0.003094   7.133 3.29e-12 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.4409 on 522 degrees of freedom
## Multiple R-squared:  0.316,  Adjusted R-squared:  0.3121 
## F-statistic: 80.39 on 3 and 522 DF,  p-value: < 2.2e-16
hist(model_lg_wage$residuals)

  1. log(wage) = .284 + .092 educ + .0041 exper + .022 tenure (.104) (.007) (.0017) (.003) n = 526, R2 = .316, σ = .441. The histogram for the residuals from this equation, with the best-fitting normal distribution overlaid, is given above.

  2. The residuals derived from the log(wage) regression showcase a closer resemblance to a normal distribution. Specifically, when comparing the histograms between part (i) and part (ii), the histogram in part (ii) aligns more closely with the expected normal density. Besides, the histogram illustrating wage residuals displays a noticeable left skew. In the wage regression analysis, there exist certain exceptionally large residuals, approximately valued at 15, positioned nearly five estimated standard deviations (σ = 3.085) away from the mean of the residuals, which inherently is zero. However, these residuals far from zero seem to pose less of an issue in the log(wage) regression analysis.

CHAPTER 6

3: RDCHEM

model_rd6 <- lm(rdintens ~ sales + I(sales^2))
summary(model_rd6)
## 
## Call:
## lm(formula = rdintens ~ sales + I(sales^2))
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -2.1418 -1.3630 -0.2257  1.0688  5.5808 
## 
## Coefficients:
##               Estimate Std. Error t value Pr(>|t|)    
## (Intercept)  2.613e+00  4.294e-01   6.084 1.27e-06 ***
## sales        3.006e-04  1.393e-04   2.158   0.0394 *  
## I(sales^2)  -6.946e-09  3.726e-09  -1.864   0.0725 .  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 1.788 on 29 degrees of freedom
## Multiple R-squared:  0.1484, Adjusted R-squared:  0.08969 
## F-statistic: 2.527 on 2 and 29 DF,  p-value: 0.09733
  1. The marginal effect of sales on rdintens becomes negative when sales exceed approximately $214.3 million and also when sales surpass -0.4742839 .
  2. It should keep the quadratic term in the model for a few reasons:
  1. Non-linearity: The quadratic term can capture non-linear relationships between sales and rdintens that a linear model might miss.
  2. Improved Fit: It might improve the model’s fit to the data, especially if there’s evidence of curvature in the relationship.
salesbil <- sales/1000
model_salebil <- lm(rdintens~ salesbil+ I(salesbil^2))
summary(model_salebil)
## 
## Call:
## lm(formula = rdintens ~ salesbil + I(salesbil^2))
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -2.1418 -1.3630 -0.2257  1.0688  5.5808 
## 
## Coefficients:
##                Estimate Std. Error t value Pr(>|t|)    
## (Intercept)    2.612512   0.429442   6.084 1.27e-06 ***
## salesbil       0.300571   0.139295   2.158   0.0394 *  
## I(salesbil^2) -0.006946   0.003726  -1.864   0.0725 .  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 1.788 on 29 degrees of freedom
## Multiple R-squared:  0.1484, Adjusted R-squared:  0.08969 
## F-statistic: 2.527 on 2 and 29 DF,  p-value: 0.09733
  1. rdintens = 2.612 + 0.3006 salesbil + (-.007) salesbil^2 (.429) (.139) (.003) n = 32, R^2 = 0.089, σ = 1.788
  2. The second model would be prefered when we deduct a lot of decimals in the coefficient result. #### 10: MEAPSINGLE
  3. The equation, typically like math4 = 24.49 + 9.01 lexppp + 422 free + 752, is more relevant than the second because it directly estimates the relationship between math test performance and expenditures per student. This equation explicitly measures how changes in per-student spending impact math performance. With the provided equation, a 10% increase in per-student spending correlates with a 0.901 (9.01 in the equation) increase in math test performance.
  4. Introducing ‘read4’ might affect coefficients and statistical significance beyond ‘blexppp.’ It’s crucial to assess how this addition influences the model, potentially altering coefficients and significance.
  5. For someone with basic regression understanding, preferring the equation with a smaller adjusted R-squared can be explained by favoring simplicity. A smaller adjusted R-squared indicates fewer variables explaining the variation, simplifying the model while maintaining predictive power

C3: WAGE2

  1. Holding exper fixed, we have Δlog(wage)/Δeduc = β1 +β3 exper
  2. H0: β3 = 0. If we think that education and experience interact positively – so that people with more experience are more productive when given another year of education – then β3 > 0 is the appropriate alternative.
data("wage2")
model_wage2 <- lm(log(wage) ~ educ + exper + educ*exper, data = wage2)
summary(model_wage2)
## 
## Call:
## lm(formula = log(wage) ~ educ + exper + educ * exper, data = wage2)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -1.88558 -0.24553  0.03558  0.26171  1.28836 
## 
## Coefficients:
##              Estimate Std. Error t value Pr(>|t|)    
## (Intercept)  5.949455   0.240826  24.704   <2e-16 ***
## educ         0.044050   0.017391   2.533   0.0115 *  
## exper       -0.021496   0.019978  -1.076   0.2822    
## educ:exper   0.003203   0.001529   2.095   0.0365 *  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.3923 on 931 degrees of freedom
## Multiple R-squared:  0.1349, Adjusted R-squared:  0.1321 
## F-statistic: 48.41 on 3 and 931 DF,  p-value: < 2.2e-16
t <- 0.003203/0.001529
t
## [1] 2.094833
  1. The t statistic on the interaction term is about 2.13,which gives a p-value below .02 against H1: β3 > 0. Therefore, we reject H0: β3 = 0 against H1: β3 > 0 at the 2% level.

C12:

min(data4$age)
## [1] 25
data4 %>% filter(age==25) %>% count()
##    n
## 1 99
  1. The youngest age is 25, and there are 99 people of this age in the sample with fsize = 1
model_c12 <- lm(nettfa ~ inc + age +I(age^2), data = data4)
summary(model_c12)
## 
## Call:
## lm(formula = nettfa ~ inc + age + I(age^2), data = data4)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -179.36  -13.58   -2.97    5.67 1116.45 
## 
## Coefficients:
##              Estimate Std. Error t value Pr(>|t|)    
## (Intercept) -1.204212  15.280667  -0.079  0.93719    
## inc          0.824816   0.060298  13.679  < 2e-16 ***
## age         -1.321815   0.767496  -1.722  0.08518 .  
## I(age^2)     0.025562   0.008999   2.841  0.00455 ** 
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 44.6 on 2013 degrees of freedom
## Multiple R-squared:  0.1229, Adjusted R-squared:  0.1216 
## F-statistic: 93.99 on 3 and 2013 DF,  p-value: < 2.2e-16
  1. One literal interpretation is that β 2is the increase in nettfa when age increases by one year, holding fixed inc and age^2. But in this application, the partial effect starting at age = 0 is not interesting; the sample represents single people at least 25 years old.

  2. nettfa = −1.20 + .825 inc − 1.322 age + .0256 age2
    (15.28) (.060) (0.767) (.0090)
    n = 2,017, R2 = .1229 Initially, the negative coefficient on age may seem counterintuitive. The estimated relationship is a U-shape, but, to make sense of it, we need to find the turning point in the quadratic.

model_c12 <- lm(nettfa ~ inc + age + I((age-25)^2), data = data4)
summary(model_c12)
## 
## Call:
## lm(formula = nettfa ~ inc + age + I((age - 25)^2), data = data4)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -179.36  -13.58   -2.97    5.67 1116.45 
## 
## Coefficients:
##                   Estimate Std. Error t value Pr(>|t|)    
## (Intercept)     -17.180714   9.973145  -1.723  0.08510 .  
## inc               0.824816   0.060298  13.679  < 2e-16 ***
## age              -0.043695   0.325270  -0.134  0.89315    
## I((age - 25)^2)   0.025562   0.008999   2.841  0.00455 ** 
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 44.6 on 2013 degrees of freedom
## Multiple R-squared:  0.1229, Adjusted R-squared:  0.1216 
## F-statistic: 93.99 on 3 and 2013 DF,  p-value: < 2.2e-16

The result are : nettfa = −17.20 + .825 inc − .0437 age + .0256 (age −25)^2 (9.97) (.060) (.767) (.0090)
n = 2,017, R2 = .1229 Therefore, the estimated partial effect starting at age = 25 is only −.044 and very statistically insignificant (t = −.13). The two-sided p-value is about .89

  1. We exclude the variable age out of the model
model_c12 <- lm(nettfa ~ inc + I((age-25)^2), data = data4)
summary(model_c12)
## 
## Call:
## lm(formula = nettfa ~ inc + I((age - 25)^2), data = data4)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -179.37  -13.61   -3.01    5.63 1116.34 
## 
## Coefficients:
##                   Estimate Std. Error t value Pr(>|t|)    
## (Intercept)     -18.488105   2.177584  -8.490   <2e-16 ***
## inc               0.823571   0.059567  13.826   <2e-16 ***
## I((age - 25)^2)   0.024403   0.002541   9.605   <2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 44.59 on 2014 degrees of freedom
## Multiple R-squared:  0.1229, Adjusted R-squared:  0.122 
## F-statistic:   141 on 2 and 2014 DF,  p-value: < 2.2e-16

The adjusted R-squared is slightly higher when we drop age. But the real reason for dropping age is that its t statistic is quite small, and the model without it has a straightforward interpretation

# Generate x values
X_nettf<- seq(0, 50, length.out = 50)

# Compute corresponding y values using the equation
y_age <- 6.219 + 0.024403 * X_nettf^2

# Plot the curve
plot(X_nettf+25, y_age, type = "l", col = "yellow", lwd = 2, xlab = "Age", ylab = "Net Wealth", main = "Plot of relationship between nettfa and age")

The slope of the relationship between nnettfa and age is clearly increasing. That is, there is an increasing marginal effect. The model is constructed so that the slope is zero at age = 25; from there, the slope increases.