Call:
lm(formula = log_c ~ log_y + log_pl + log_pk + log_pf, data = df)
Residuals:
Min 1Q Median 3Q Max
-0.49179 -0.12897 -0.00924 0.09748 0.86660
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) -8.54875 1.39458 -6.130 0.000000020478882 ***
log_y 0.82410 0.01309 62.981 < 0.0000000000000002 ***
log_pl 0.17789 0.14256 1.248 0.215
log_pk 0.20286 0.14457 1.403 0.164
log_pf 0.69325 0.08071 8.590 0.000000000000182 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 0.212 on 94 degrees of freedom
Multiple R-squared: 0.9798, Adjusted R-squared: 0.979
F-statistic: 1141 on 4 and 94 DF, p-value: < 0.00000000000000022
Q1.B
Consider the production function (F is fuel):
\[Y=AL^{\alpha_L}K^{\alpha_K}F^{\alpha_F}e^u\] We are told that \[\sum_j\alpha_j = r = 1/\beta_Y\]
Q1.B.1
Does the production function display constant returns to scale? It does if and only if \(r=1\), i.e. \(\beta_Y=1\).
Reminder: the procedure is similar to the single-variable case. We test \[H_0:\beta_Y = 1\] using the t-statistic \[t = \frac{\hat{\beta}_Y-1}{\hat\sigma_{\hat{\beta}_Y}}\]
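Test:
# Manually: pull the coefficient and its standard error from the summary
beta_y = summary(lm(log_c ~ log_y + log_pl + log_pk + log_pf, data = df))$coefficients[2,1]
sigma_beta_y = summary(lm(log_c ~ log_y + log_pl + log_pk + log_pf, data = df))$coefficients[2,2]
t = (beta_y - 1)/sigma_beta_y
t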
# Very small, so we reject H0.
# How to get a p-value / significance?
degrees_freedom = summary(lm(log_c ~ log_y + log_pl + log_pk + log_pf, data = df))$df[2]
pt(t, degrees_freedom) # lower-tail p-value; the two-sided p-value is twice this
[1] 0.000000000000000000000006486226
# very small p-value, reject H0 that beta_y equals 1.
# How to test automatically?
mod = lm(log_c ~ log_y + log_pl + log_pk + log_pf, data = df)
test = car::linearHypothesis(model = mod, "log_y = 1")
test
Linear hypothesis test
Hypothesis:
log_y = 1
Model 1: restricted model
Model 2: log_c ~ log_y + log_pl + log_pk + log_pf
Res.Df RSS Df Sum of Sq F Pr(>F)
1 95 12.3427
2 94 4.2235 1 8.1192 180.71 < 0.00000000000000022 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
# got an F-statistic; can recover |t| as its square root
# (here beta_y < 1, so the t-statistic itself is about -13.44)
sqrt(test$F[2])
[1] 13.44271
Q1.B.2
Is \(Cost(Y,P_L,P_K,P_F)\) homogeneous of degree 1 in the prices? This holds if and only if \(\sum_{i\in\{L,K,F\}}\beta_i = 1\), so the hypothesis is \[H_0: \beta_L+\beta_K+\beta_F-1 = 0\] which we test with \[t = \frac{\hat{\beta}_L+\hat{\beta}_K+\hat{\beta}_F - 1}{\sqrt{Var(\hat{\beta}_L+\hat{\beta}_K+\hat{\beta}_F)}}\] where the denominator expands as \[Var(\hat{\beta}_L+\hat{\beta}_K+\hat{\beta}_F) = Var(\hat{\beta}_L) + Var(\hat{\beta}_K) + Var(\hat{\beta}_F) + 2[Cov(\hat{\beta}_L,\hat{\beta}_K) + Cov(\hat{\beta}_L,\hat{\beta}_F) + Cov(\hat{\beta}_K,\hat{\beta}_F)]\]
## begin with manual calculation
# coefficient estimates
beta_L = mod$coefficients[[3]] # double brackets return the bare value, without the name
beta_K = mod$coefficients[[4]]
beta_F = mod$coefficients[[5]]
# variance-covariance matrix of the estimates (drop the intercept row and column)
cov_mat <- vcov(mod)[-1, -1]
cov_mat
sigma_LL = cov_mat[2,2]
sigma_KK = cov_mat[3,3]
sigma_FF = cov_mat[4,4]
sigma_LK = cov_mat[2,3]
sigma_LF = cov_mat[2,4]
sigma_KF = cov_mat[3,4]
# now calculate the t-statistic for the hypothesis
t = (beta_L + beta_K + beta_F - 1) / sqrt(sigma_LL + sigma_KK + sigma_FF + 2*(sigma_LK + sigma_LF + sigma_KF))
t
[1] 0.3581413
## Now for the automatic version: compute an F-statistic and take its square root
test = car::linearHypothesis(model = mod, "log_pl + log_pk + log_pf = 1")
test
Linear hypothesis test
Hypothesis:
log_pl + log_pk + log_pf = 1
Model 1: restricted model
Model 2: log_c ~ log_y + log_pl + log_pk + log_pf
Res.Df RSS Df Sum of Sq F Pr(>F)
1 95 4.2292
2 94 4.2235 1 0.005763 0.1283 0.721
sqrt(test$F[2])
[1] 0.3581413
Q1.B.2 - F-test
Can write the hypothesis as \[\beta_F = 1-\beta_L-\beta_K\]
Insert into the model to get the restricted model (denote with \(R\)) \[\begin{align}
\log(Cost) &= \log(B) + \beta_Y\log(Y) + \beta_L\log(P_L) + \beta_K\log(P_K) + (1-\beta_L-\beta_K)\log(P_F) + \omega \\
&= \log(B) + \beta_Y\log(Y) + \beta_L[\log(P_L)-\log(P_F)] + \beta_K[\log(P_K)-\log(P_F)] + 1\log(P_F) + \omega
\end{align}\]
Calculate for the restricted model (denote with \(R\)) and the un-restricted model (denote with \(U\)) the sums of squared errors \[ESS^R = \sum (\hat{\omega}^R)^2\quad\&\quad ESS^U = \sum (\hat{\omega}^U)^2\] and the corresponding F-statistic \[F=\frac{(ESS^R-ESS^U)/d}{ESS^U/(n-k-1)}\sim F_{d,n-k-1}\] where
\(d\) is the number of constraints, and
\(k\) is the number of regressors, so \(n-k-1\) is the residual degrees of freedom.
Calculate the F-statistic for this hypothesis. To estimate the restricted model, move \(\log(P_F)\) to the left-hand side and regress \(\log(C)-\log(P_F)\) on \(\log(Y)\) and the price differences.
# un-restricted model and its ess
mod_u = mod
ess_u = sum(mod_u$residuals^2)
# now estimate the restricted model
df = df %>%
  mutate(
    log_c_minus_log_pf  = log_c - log_pf,
    log_pl_minus_log_pf = log_pl - log_pf,
    log_pk_minus_log_pf = log_pk - log_pf
  )
mod_r = lm(log_c_minus_log_pf ~ log_y + log_pl_minus_log_pf + log_pk_minus_log_pf, data = df)
ess_r = sum(mod_r$residuals^2)
# parameters
d = 1                             # 1 constraint
n = nrow(df)                      # number of observations
k = length(mod$coefficients) - 1  # number of regressors: all coefficients minus the intercept
F_stat = ((ess_r - ess_u)/d) / (ess_u/(n - k - 1))
F_stat
[1] 0.1282652
Got a very low F-statistic. Can plot this out on the distribution graph
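f_dist = data.frame(dist = rf(n = n, df1 = d, df2 = (n - k - 1))) # simulated draws from F(d, n-k-1)
ggplot(data = f_dist, aes(x = dist)) +
  geom_density(color = "blue") +                                      # density of the simulated draws
  geom_vline(xintercept = F_stat, color = "red", linetype = "dashed") + # our test statistic
  theme_classic() +
  scale_x_continuous(limits = c(0, 10)) +
  labs(x = "Value", y = "Density")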
Or ask what statistical significance level, or p-value, this corresponds to:
1-pf(F_stat, d, (n-k-1))
[1] 0.7210405
That is, we cannot reject \(H_0\) at conventional statistical levels.
Looking back at the linearHypothesis command output above, we can see the same F-statistic.
Q1.C
Now we have the following hypothesis: \[H_0: \sum_{i\in\{L,K,F\}}\beta_i = 1 \quad \& \quad \beta_Y = 1\] It follows from before that the restricted model is \[\begin{align}
\log(Cost) &= \log(B) + 1\log(Y) + \beta_L\log(P_L) + \beta_K\log(P_K) + (1-\beta_L-\beta_K)\log(P_F) + \omega \\
&= \log(B) + 1\log(Y) + \beta_L[\log(P_L)-\log(P_F)] + \beta_K[\log(P_K)-\log(P_F)] + 1\log(P_F) + \omega
\end{align}\]
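Calculate the F-statistic, imposing both constraints at once:
car::linearHypothesis(model = mod, c("log_pl + log_pk + log_pf = 1", "log_y = 1"))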
Note that these are different hypotheses! Jointly, the coefficients explain the model well; but since they are correlated, individually they are not significant.
Q3
We are asked to perform a Chow test.
Reminder: say we have a model and two sub-populations.
We estimate the model in each sub-population separately.
We get two sets of estimators.
Now we want to ask: are these sets equivalent?
This is the Chow test.
We begin with a formal statement of the hypothesis.
Let there be two groups, \(P\) and \(G\).
Estimate the production function \(y = b_0 + b_zz+b_ll+b_kk+u\), separately by group.
Denote by \(b_j^g\) the estimator of \(b_j\) when estimated using group \(g\).
This produces two sets of estimators: \(\{\hat{b}_0^P,\hat{b}_z^P,\hat{b}_l^P,\hat{b}_k^P\}\) and \(\{\hat{b}_0^G,\hat{b}_z^G,\hat{b}_l^G,\hat{b}_k^G\}\)
We want to test whether \(P\) and \(G\) share the same production function, so the hypothesis is \[H_0:b_i^P=b_i^G,\quad \forall i\in\{0,z,l,k\}\] The test for this is the following F-test: \[F=\frac{[ESS_R-(ESS_P+ESS_G)]/(k+1)}{(ESS_P+ESS_G)/(N_P+N_G-2k-2)}\sim F_{k+1,\,N_P+N_G-2k-2}\] where \(ESS_R\) comes from the pooled (restricted) regression. Let's calculate this from the table in the problem set.
# start with tss
tss_p = 308
tss_g = 240
tss_r = 650
# continue with R^2
r_sqr_p = .75
r_sqr_g = .66
r_sqr_r = .7
Note that \(R^2 = 1-ESS/TSS\leftrightarrow ESS = TSS\times(1-R^2)\)
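# so calculate ess
ess_p = tss_p * (1 - r_sqr_p)
ess_g = tss_g * (1 - r_sqr_g)
ess_r = tss_r * (1 - r_sqr_r)
# input other parameters
n_p = 38
n_g = 18
k = 3
# calculate the F-statistic
F_stat = ((ess_r - (ess_g + ess_p))/(k + 1)) / ((ess_p + ess_g)/(n_g + n_p - 2*k - 2))
F_stat # approx. 2.75
# calculate the p-value
1 - pf(F_stat, (k + 1), (n_g + n_p - 2*k - 2)) # approx. 0.04
So we can reject \(H_0\) at the 5% level: \(P\) and \(G\) do not share the same production function.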
Q4
Assume \(F=4\) and \(Var(\hat{\beta}_2)=1\).
The un-restricted model is \[Y = \beta_0 + \beta_1X_1 + \beta_2X_2 + u\] and the restricted model is \[Y - X_2 = \beta_0 + \beta_1X_1 + u\] This is equivalent to \[Y = \beta_0 + \beta_1X_1 + 1\times X_2 + u\]
Which is equivalent to the hypothesis: \[H_0:\beta_2 = 1\].
Since \(d=1\) (one constraint), \(F=t^2\), so \[t=\frac{\hat{\beta}_2-1}{\sqrt{Var(\hat{\beta}_2)}}=\hat{\beta}_2-1=\pm\sqrt{4}=\pm2\] Hence \(\hat{\beta}_2\in\{-1,3\}\).
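As a quick numeric check of this algebra (a minimal sketch using the values assumed in the question):
# Q4 check: recover beta_2_hat from F and Var(beta_2_hat)
F_stat = 4; var_beta2 = 1
t = c(-1, 1) * sqrt(F_stat)         # t = +/- sqrt(F), since d = 1
beta2_hat = 1 + t * sqrt(var_beta2) # invert t = (beta2_hat - 1)/se(beta2_hat)
beta2_hat                           # -1  3
Q5.6
Do age and coll matter in the log-wage model lwage ~ points + exper + expersq + age + coll? Looking at separate t-tests, no: we cannot reject that each coefficient alone is zero. Are they jointly different from zero? \[H_0:\beta_{age}=\beta_{coll} = 0\]
mod = lm(lwage ~ points + exper + expersq + age + coll, data = df)
linearHypothesis(model = mod, c("age = 0", "coll = 0"))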
Linear hypothesis test
Hypothesis:
age = 0
coll = 0
Model 1: restricted model
Model 2: lwage ~ points + exper + expersq + age + coll
Res.Df RSS Df Sum of Sq F Pr(>F)
1 265 107.59
2 263 106.63 2 0.96416 1.1891 0.3061
The p-value of the F-statistic for the joint hypothesis is 0.31, meaning we could only reject \(H_0\) at significance levels above 31%.
So no: age and coll are not jointly significant once experience (and its square) and points scored are taken into account.