Section-1 Creating Our First Model

【1.1】 Enter the model R2 (the “Multiple R-squared” value):

Ans: 0.7509
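# cc is assumed to be the climate data frame, already loaded into the session.
# Training set: observations through 2006. Testing set: observations after 2006.
Before2006 = subset(cc, cc$Year <= 2006)
After2006 = subset(cc, cc$Year > 2006)
# Full model: all eight predictors, fit on the training set only.
model1 = lm(Temp ~ MEI + CO2 + CH4 + N2O + CFC.11 + CFC.12 + TSI + Aerosols, data = Before2006)
summary(model1)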

Call:
lm(formula = Temp ~ MEI + CO2 + CH4 + N2O + CFC.11 + CFC.12 + 
    TSI + Aerosols, data = Before2006)

Residuals:
     Min       1Q   Median       3Q      Max 
-0.25888 -0.05913 -0.00082  0.05649  0.32433 

Coefficients:
              Estimate Std. Error t value Pr(>|t|)    
(Intercept) -1.246e+02  1.989e+01  -6.265 1.43e-09 ***
MEI          6.421e-02  6.470e-03   9.923  < 2e-16 ***
CO2          6.457e-03  2.285e-03   2.826  0.00505 ** 
CH4          1.240e-04  5.158e-04   0.240  0.81015    
N2O         -1.653e-02  8.565e-03  -1.930  0.05467 .  
CFC.11      -6.631e-03  1.626e-03  -4.078 5.96e-05 ***
CFC.12       3.808e-03  1.014e-03   3.757  0.00021 ***
TSI          9.314e-02  1.475e-02   6.313 1.10e-09 ***
Aerosols    -1.538e+00  2.133e-01  -7.210 5.41e-12 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 0.09171 on 275 degrees of freedom
Multiple R-squared:  0.7509,    Adjusted R-squared:  0.7436 
F-statistic: 103.6 on 8 and 275 DF,  p-value: < 2.2e-16
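
As a rough illustration of where the "Multiple R-squared" figure comes from, the sketch below recomputes it as 1 - SSE/SST and compares it with the value that summary() reports. It uses R's built-in mtcars data as a stand-in for the climate data, and the names fit, sse, sst, r2_manual, and r2_summary are illustrative only.

```r
# Stand-in data: the built-in mtcars data set, not the climate data used above.
fit <- lm(mpg ~ wt + hp, data = mtcars)

# Training R-squared is 1 - SSE/SST, with SST measured around the training mean.
sse <- sum(residuals(fit)^2)
sst <- sum((mtcars$mpg - mean(mtcars$mpg))^2)
r2_manual <- 1 - sse / sst

# The value that summary() prints as "Multiple R-squared".
r2_summary <- summary(fit)$r.squared
print(c(manual = r2_manual, summary = r2_summary))
```

The two numbers should agree up to floating-point error.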

【1.2】 Which variables are significant in the model? We will consider a variable significant only if the p-value is below 0.05. (Select all that apply.)

Ans: MEI, CO2, CFC.11, CFC.12, TSI, and Aerosols.
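
Instead of reading the stars off the summary table, the significant variables can also be pulled out programmatically. The following is a minimal sketch on the built-in mtcars data (a stand-in, not the climate data); the names fit, coefs, pvals, and significant are illustrative.

```r
# Stand-in model on built-in data; the same steps apply to any lm object.
fit <- lm(mpg ~ wt + hp + qsec, data = mtcars)

# coef(summary(fit)) is a matrix with one row per term and a "Pr(>|t|)" column.
coefs <- coef(summary(fit))
pvals <- coefs[-1, "Pr(>|t|)"]   # drop the intercept row

# Keep only the terms whose p-value is below the 0.05 threshold.
significant <- names(pvals)[pvals < 0.05]
print(significant)
```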




Section-2 Understanding the Model

【2.1】 Which of the following is the simplest correct explanation for this contradiction?

Ans: All of the gas concentration variables reflect human development - N2O and CFC.11 are correlated with other variables in the data set.
cor(Before2006$N2O, Before2006$CFC.11)
[1] 0.5224773
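
To see why correlated predictors make individual coefficients unreliable, the simulation below may help. It is only a sketch on made-up data: x2 is constructed as a near copy of x1, and all names and numeric settings (seed, sample size, noise levels) are arbitrary assumptions.

```r
set.seed(1)                       # arbitrary seed, for reproducibility
n  <- 200
x1 <- rnorm(n)
x2 <- x1 + rnorm(n, sd = 0.1)     # x2 is almost a copy of x1
y  <- 2 * x1 + rnorm(n)           # only x1 truly drives y

alone    <- lm(y ~ x1)            # x1 on its own
together <- lm(y ~ x1 + x2)       # x1 alongside its near duplicate

# Standard error of the x1 coefficient in each model.
se_alone    <- coef(summary(alone))["x1", "Std. Error"]
se_together <- coef(summary(together))["x1", "Std. Error"]
print(c(cor = cor(x1, x2), se_alone = se_alone, se_together = se_together))
```

With the near duplicate in the model, the standard error of x1 grows substantially, so its estimate can swing widely and even change sign, much as N2O and CFC.11 do in the full model.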

【2.2】 Compute the correlations between all the variables in the training set. Which of the following independent variables is N2O highly correlated with (absolute correlation greater than 0.7)? Select all that apply.

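Ans: CO2, CH4, and CFC.12. (Year and Temp also exceed 0.7 in absolute correlation with N2O, but Year is not a predictor in the model and Temp is the dependent variable.)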
cor(Before2006)
                Year         Month           MEI         CO2         CH4
Year      1.00000000 -0.0279419602 -0.0369876842  0.98274939  0.91565945
Month    -0.02794196  1.0000000000  0.0008846905 -0.10673246  0.01856866
MEI      -0.03698768  0.0008846905  1.0000000000 -0.04114717 -0.03341930
CO2       0.98274939 -0.1067324607 -0.0411471651  1.00000000  0.87727963
CH4       0.91565945  0.0185686624 -0.0334193014  0.87727963  1.00000000
N2O       0.99384523  0.0136315303 -0.0508197755  0.97671982  0.89983864
CFC.11    0.56910643 -0.0131112236  0.0690004387  0.51405975  0.77990402
CFC.12    0.89701166  0.0006751102  0.0082855443  0.85268963  0.96361625
TSI       0.17030201 -0.0346061935 -0.1544919227  0.17742893  0.24552844
Aerosols -0.34524670  0.0148895406  0.3402377871 -0.35615480 -0.26780919
Temp      0.78679714 -0.0998567411  0.1724707512  0.78852921  0.70325502
                 N2O      CFC.11        CFC.12         TSI    Aerosols
Year      0.99384523  0.56910643  0.8970116635  0.17030201 -0.34524670
Month     0.01363153 -0.01311122  0.0006751102 -0.03460619  0.01488954
MEI      -0.05081978  0.06900044  0.0082855443 -0.15449192  0.34023779
CO2       0.97671982  0.51405975  0.8526896272  0.17742893 -0.35615480
CH4       0.89983864  0.77990402  0.9636162478  0.24552844 -0.26780919
N2O       1.00000000  0.52247732  0.8679307757  0.19975668 -0.33705457
CFC.11    0.52247732  1.00000000  0.8689851828  0.27204596 -0.04392120
CFC.12    0.86793078  0.86898518  1.0000000000  0.25530281 -0.22513124
TSI       0.19975668  0.27204596  0.2553028138  1.00000000  0.05211651
Aerosols -0.33705457 -0.04392120 -0.2251312440  0.05211651  1.00000000
Temp      0.77863893  0.40771029  0.6875575483  0.24338269 -0.38491375
                Temp
Year      0.78679714
Month    -0.09985674
MEI       0.17247075
CO2       0.78852921
CH4       0.70325502
N2O       0.77863893
CFC.11    0.40771029
CFC.12    0.68755755
TSI       0.24338269
Aerosols -0.38491375
Temp      1.00000000

【2.3】 Which of the following independent variables is CFC.11 highly correlated with? Select all that apply.

Ans: CH4 and CFC.12.
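
Scanning a large correlation matrix by eye is error-prone, so the lookup can be scripted. This sketch uses the built-in mtcars data as a stand-in, with disp playing the role of N2O or CFC.11; the names cm, target, r, and high are illustrative.

```r
# Correlation matrix of a stand-in data set (all columns are numeric).
cm <- cor(mtcars)

# Correlations of one variable with every other variable.
target <- "disp"
r <- cm[target, colnames(cm) != target]

# Variables whose absolute correlation with the target exceeds 0.7.
high <- names(r)[abs(r) > 0.7]
print(round(r[high], 3))
```

For the climate data the same steps would use cor(Before2006) and target set to "N2O" or "CFC.11".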




Section-3 Simplifying the Model

【3.1】 Enter the coefficient of N2O in this reduced model:

Ans: 0.02532
model2 = lm(Temp ~ MEI + N2O + TSI + Aerosols, data = Before2006)
summary(model2)

Call:
lm(formula = Temp ~ MEI + N2O + TSI + Aerosols, data = Before2006)

Residuals:
     Min       1Q   Median       3Q      Max 
-0.27916 -0.05975 -0.00595  0.05672  0.34195 

Coefficients:
              Estimate Std. Error t value Pr(>|t|)    
(Intercept) -1.162e+02  2.022e+01  -5.747 2.37e-08 ***
MEI          6.419e-02  6.652e-03   9.649  < 2e-16 ***
N2O          2.532e-02  1.311e-03  19.307  < 2e-16 ***
TSI          7.949e-02  1.487e-02   5.344 1.89e-07 ***
Aerosols    -1.702e+00  2.180e-01  -7.806 1.19e-13 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 0.09547 on 279 degrees of freedom
Multiple R-squared:  0.7261,    Adjusted R-squared:  0.7222 
F-statistic: 184.9 on 4 and 279 DF,  p-value: < 2.2e-16

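【3.2】 (How does this compare to the coefficient in the previous model with all of the variables?)

Ans: The sign flips. The N2O coefficient is -0.01653 in the full model (p = 0.055) and 0.02532 in the reduced model (p < 2e-16), which is consistent with N2O being highly correlated with the variables that were dropped.

Enter the model R2:

Ans: 0.7261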
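
The drop in R2 from 0.7509 to 0.7261 is expected: removing predictors from a least-squares model cannot raise the training R2. A small sketch of that comparison, again on the built-in mtcars data with illustrative model names full and reduced:

```r
# Nested models on stand-in data: reduced uses a subset of full's predictors.
full    <- lm(mpg ~ wt + hp + qsec + drat, data = mtcars)
reduced <- lm(mpg ~ wt + hp, data = mtcars)

# Side-by-side R-squared and adjusted R-squared.
comparison <- data.frame(
  model  = c("full", "reduced"),
  r2     = c(summary(full)$r.squared, summary(reduced)$r.squared),
  adj_r2 = c(summary(full)$adj.r.squared, summary(reduced)$adj.r.squared)
)
print(comparison)
```

Adjusted R-squared penalizes extra terms, so it is usually the fairer number when the models differ in size.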




Section-4 Automatically Building the Model

【4.1】 Enter the R2 value of the model produced by the step function:

Ans: 0.7508
Bestmodel = step(model1)
Start:  AIC=-1348.16
Temp ~ MEI + CO2 + CH4 + N2O + CFC.11 + CFC.12 + TSI + Aerosols

           Df Sum of Sq    RSS     AIC
- CH4       1   0.00049 2.3135 -1350.1
<none>                  2.3130 -1348.2
- N2O       1   0.03132 2.3443 -1346.3
- CO2       1   0.06719 2.3802 -1342.0
- CFC.12    1   0.11874 2.4318 -1335.9
- CFC.11    1   0.13986 2.4529 -1333.5
- TSI       1   0.33516 2.6482 -1311.7
- Aerosols  1   0.43727 2.7503 -1301.0
- MEI       1   0.82823 3.1412 -1263.2

Step:  AIC=-1350.1
Temp ~ MEI + CO2 + N2O + CFC.11 + CFC.12 + TSI + Aerosols

           Df Sum of Sq    RSS     AIC
<none>                  2.3135 -1350.1
- N2O       1   0.03133 2.3448 -1348.3
- CO2       1   0.06672 2.3802 -1344.0
- CFC.12    1   0.13023 2.4437 -1336.5
- CFC.11    1   0.13938 2.4529 -1335.5
- TSI       1   0.33500 2.6485 -1313.7
- Aerosols  1   0.43987 2.7534 -1302.7
- MEI       1   0.83118 3.1447 -1264.9
summary(Bestmodel)

Call:
lm(formula = Temp ~ MEI + CO2 + N2O + CFC.11 + CFC.12 + TSI + 
    Aerosols, data = Before2006)

Residuals:
     Min       1Q   Median       3Q      Max 
-0.25770 -0.05994 -0.00104  0.05588  0.32203 

Coefficients:
              Estimate Std. Error t value Pr(>|t|)    
(Intercept) -1.245e+02  1.985e+01  -6.273 1.37e-09 ***
MEI          6.407e-02  6.434e-03   9.958  < 2e-16 ***
CO2          6.402e-03  2.269e-03   2.821 0.005129 ** 
N2O         -1.602e-02  8.287e-03  -1.933 0.054234 .  
CFC.11      -6.609e-03  1.621e-03  -4.078 5.95e-05 ***
CFC.12       3.868e-03  9.812e-04   3.942 0.000103 ***
TSI          9.312e-02  1.473e-02   6.322 1.04e-09 ***
Aerosols    -1.540e+00  2.126e-01  -7.244 4.36e-12 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 0.09155 on 276 degrees of freedom
Multiple R-squared:  0.7508,    Adjusted R-squared:  0.7445 
F-statistic: 118.8 on 7 and 276 DF,  p-value: < 2.2e-16

【4.2】 Which of the following variable(s) were eliminated from the full model by the step function? Select all that apply.

Ans: CH4
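
The AIC values in the step output can be checked by hand. As far as I understand it, step() scores linear models with n * log(RSS / n) + 2 * k, where k is the number of estimated coefficients; the sketch below plugs in the RSS values printed above. The variable names are illustrative, and n = 284 is inferred from the 275 residual degrees of freedom plus 9 coefficients in the full model.

```r
n <- 284            # 275 residual df + 9 coefficients in the full model

# Full model: RSS from the <none> row of the first step, 9 coefficients.
rss_full <- 2.3130
aic_full <- n * log(rss_full / n) + 2 * 9

# After dropping CH4: RSS from the "- CH4" row, 8 coefficients.
rss_drop <- 2.3135
aic_drop <- n * log(rss_drop / n) + 2 * 8

print(round(c(full = aic_full, without_CH4 = aic_drop), 2))
```

Dropping CH4 barely changes the RSS but saves one parameter, so the AIC improves from about -1348.2 to about -1350.1. Every further removal would make the AIC worse, which is why the search stops there.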




Section-5 Testing on Unseen Data

【5.1】 Using the model produced from the step function, calculate temperature predictions for the testing data set, using the predict function.

Enter the testing set R2:

Ans: 0.6286051
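predict1 = predict(Bestmodel, newdata = After2006)
# SSE: squared error of the model's predictions on the testing set
SSE = sum((After2006$Temp - predict1)^2)
# SST: squared error of the baseline, which predicts the training-set mean
SST = sum((After2006$Temp - mean(Before2006$Temp))^2)
R2 = 1 - SSE/SST
print(R2)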
[1] 0.6286051
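
The same calculation can be wrapped in a small helper so it is reusable across models. This is a sketch only: oos_r2 is a hypothetical function name, and the example runs on an arbitrary split of the built-in mtcars data, not on the climate data.

```r
# Out-of-sample R-squared, using the training-set mean as the baseline.
oos_r2 <- function(model, train, test, response) {
  pred <- predict(model, newdata = test)
  sse  <- sum((test[[response]] - pred)^2)
  sst  <- sum((test[[response]] - mean(train[[response]]))^2)
  1 - sse / sst
}

# Stand-in split: first 22 rows for training, remaining 10 for testing.
train <- mtcars[1:22, ]
test  <- mtcars[23:32, ]
fit   <- lm(mpg ~ wt + hp, data = train)

r2_test <- oos_r2(fit, train, test, "mpg")
print(r2_test)
```

For the notebook above, the equivalent call would be oos_r2(Bestmodel, Before2006, After2006, "Temp"). Unlike the training R2, this value can be negative when the model does worse than the baseline.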