There have been many studies documenting that the average global temperature has been increasing over the last century. The consequences of a continued rise in global temperature will be dire. Rising sea levels and an increased frequency of extreme weather events will affect billions of people.
In this problem, we will attempt to study the relationship between average global temperature and several other factors.
The file climate_change.csv contains climate data from May 1983 to December 2008. The available variables include:
Year: the observation year.
Month: the observation month.
Temp: the difference in degrees Celsius between the average global temperature in that period and a reference value. This data comes from the Climatic Research Unit at the University of East Anglia.
CO2, N2O, CH4, CFC.11, CFC.12: atmospheric concentrations of carbon dioxide (CO2), nitrous oxide (N2O), methane (CH4), trichlorofluoromethane (CCl3F; commonly referred to as CFC-11) and dichlorodifluoromethane (CCl2F2; commonly referred to as CFC-12), respectively. This data comes from the ESRL/NOAA Global Monitoring Division.
CO2, N2O and CH4 are expressed in ppmv (parts per million by volume -- i.e., 397 ppmv of CO2 means that CO2 constitutes 397 millionths of the total volume of the atmosphere)
CFC.11 and CFC.12 are expressed in ppbv (parts per billion by volume).
Aerosols: the mean stratospheric aerosol optical depth at 550 nm. This variable is linked to volcanoes, as volcanic eruptions result in new particles being added to the atmosphere, which affect how much of the sun's energy is reflected back into space. This data is from the Godard Institute for Space Studies at NASA.
TSI: the total solar irradiance (TSI) in W/m2 (the rate at which the sun's energy is deposited per unit area). Due to sunspots and other solar phenomena, the amount of energy that is given off by the sun varies substantially with time. This data is from the SOLARIS-HEPPA project website.
MEI: multivariate El Nino Southern Oscillation index (MEI), a measure of the strength of the El Nino/La Nina-Southern Oscillation (a weather effect in the Pacific Ocean that affects global temperatures). This data comes from the ESRL/NOAA Physical Sciences Division.
clim_change <- read.csv("climate_change.csv")
str(clim_change)
## 'data.frame': 308 obs. of 11 variables:
## $ Year : int 1983 1983 1983 1983 1983 1983 1983 1983 1984 1984 ...
## $ Month : int 5 6 7 8 9 10 11 12 1 2 ...
## $ MEI : num 2.556 2.167 1.741 1.13 0.428 ...
## $ CO2 : num 346 346 344 342 340 ...
## $ CH4 : num 1639 1634 1633 1631 1648 ...
## $ N2O : num 304 304 304 304 304 ...
## $ CFC.11 : num 191 192 193 194 194 ...
## $ CFC.12 : num 350 352 354 356 357 ...
## $ TSI : num 1366 1366 1366 1366 1366 ...
## $ Aerosols: num 0.0863 0.0794 0.0731 0.0673 0.0619 0.0569 0.0524 0.0486 0.0451 0.0416 ...
## $ Temp : num 0.109 0.118 0.137 0.176 0.149 0.093 0.232 0.078 0.089 0.013 ...
Let’s make them separated. We are interested in how changes in these variables affect future temperatures, as well as how well these variables explain temperature changes so far. To do this, first read the dataset climate_change.csv into R.
Then, split the data into a training set, consisting of all the observations up to and including 2006, and a testing set consisting of the remaining years (hint: use subset). A training set refers to the data that will be used to build the model (this is the data we give to the lm() function), and a testing set refers to the data we will use to test our predictive ability.
Next, build a linear regression model to predict the dependent variable Temp, using MEI, CO2, CH4, N2O, CFC.11, CFC.12, TSI, and Aerosols as independent variables (Year and Month should NOT be used in the model). Use the training set to build the model.
Enter the model R2 (the “Multiple R-squared” value):
train_cc <- subset(clim_change, Year <=2006)
test_cc <- subset(clim_change, Year > 2006)
RegTemp <- lm(Temp ~ MEI + CO2 + CH4 + N2O + CFC.11 + CFC.12 + TSI + Aerosols,
data = train_cc)
summary(RegTemp)
##
## Call:
## lm(formula = Temp ~ MEI + CO2 + CH4 + N2O + CFC.11 + CFC.12 +
## TSI + Aerosols, data = train_cc)
##
## Residuals:
## Min 1Q Median 3Q Max
## -0.25888 -0.05913 -0.00082 0.05649 0.32433
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) -1.246e+02 1.989e+01 -6.265 1.43e-09 ***
## MEI 6.421e-02 6.470e-03 9.923 < 2e-16 ***
## CO2 6.457e-03 2.285e-03 2.826 0.00505 **
## CH4 1.240e-04 5.158e-04 0.240 0.81015
## N2O -1.653e-02 8.565e-03 -1.930 0.05467 .
## CFC.11 -6.631e-03 1.626e-03 -4.078 5.96e-05 ***
## CFC.12 3.808e-03 1.014e-03 3.757 0.00021 ***
## TSI 9.314e-02 1.475e-02 6.313 1.10e-09 ***
## Aerosols -1.538e+00 2.133e-01 -7.210 5.41e-12 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.09171 on 275 degrees of freedom
## Multiple R-squared: 0.7509, Adjusted R-squared: 0.7436
## F-statistic: 103.6 on 8 and 275 DF, p-value: < 2.2e-16
Current scientific opinion is that nitrous oxide and CFC-11 are greenhouse gases: gases that are able to trap heat from the sun and contribute to the heating of the Earth. However, the regression coefficients of both the N2O and CFC-11 variables are negative, indicating that increasing atmospheric concentrations of either of these two compounds is associated with lower global temperatures.
Which of the following is the simplest correct explanation for this contradiction? - All variables are correlated with the others :)
Compute the correlations between all the variables in the training set. Which of the following independent variables is N2O highly correlated with (absolute correlation greater than 0.7)? Select all that apply.
cor(train_cc)
## Year Month MEI CO2 CH4
## Year 1.00000000 -0.0279419602 -0.0369876842 0.98274939 0.91565945
## Month -0.02794196 1.0000000000 0.0008846905 -0.10673246 0.01856866
## MEI -0.03698768 0.0008846905 1.0000000000 -0.04114717 -0.03341930
## CO2 0.98274939 -0.1067324607 -0.0411471651 1.00000000 0.87727963
## CH4 0.91565945 0.0185686624 -0.0334193014 0.87727963 1.00000000
## N2O 0.99384523 0.0136315303 -0.0508197755 0.97671982 0.89983864
## CFC.11 0.56910643 -0.0131112236 0.0690004387 0.51405975 0.77990402
## CFC.12 0.89701166 0.0006751102 0.0082855443 0.85268963 0.96361625
## TSI 0.17030201 -0.0346061935 -0.1544919227 0.17742893 0.24552844
## Aerosols -0.34524670 0.0148895406 0.3402377871 -0.35615480 -0.26780919
## Temp 0.78679714 -0.0998567411 0.1724707512 0.78852921 0.70325502
## N2O CFC.11 CFC.12 TSI Aerosols
## Year 0.99384523 0.56910643 0.8970116635 0.17030201 -0.34524670
## Month 0.01363153 -0.01311122 0.0006751102 -0.03460619 0.01488954
## MEI -0.05081978 0.06900044 0.0082855443 -0.15449192 0.34023779
## CO2 0.97671982 0.51405975 0.8526896272 0.17742893 -0.35615480
## CH4 0.89983864 0.77990402 0.9636162478 0.24552844 -0.26780919
## N2O 1.00000000 0.52247732 0.8679307757 0.19975668 -0.33705457
## CFC.11 0.52247732 1.00000000 0.8689851828 0.27204596 -0.04392120
## CFC.12 0.86793078 0.86898518 1.0000000000 0.25530281 -0.22513124
## TSI 0.19975668 0.27204596 0.2553028138 1.00000000 0.05211651
## Aerosols -0.33705457 -0.04392120 -0.2251312440 0.05211651 1.00000000
## Temp 0.77863893 0.40771029 0.6875575483 0.24338269 -0.38491375
## Temp
## Year 0.78679714
## Month -0.09985674
## MEI 0.17247075
## CO2 0.78852921
## CH4 0.70325502
## N2O 0.77863893
## CFC.11 0.40771029
## CFC.12 0.68755755
## TSI 0.24338269
## Aerosols -0.38491375
## Temp 1.00000000
Given that the correlations are so high, let us focus on the N2O variable and build a model with only MEI, TSI, Aerosols and N2O as independent variables. Remember to use the training set to build the model.
Enter the coefficient of N2O in this reduced model: unanswered
(How does this compare to the coefficient in the previous model with all of the variables?)
Enter the model R2:
RegN2O <- lm(Temp ~ MEI + TSI + Aerosols + N2O, data = train_cc)
summary(RegN2O)
##
## Call:
## lm(formula = Temp ~ MEI + TSI + Aerosols + N2O, data = train_cc)
##
## Residuals:
## Min 1Q Median 3Q Max
## -0.27916 -0.05975 -0.00595 0.05672 0.34195
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) -1.162e+02 2.022e+01 -5.747 2.37e-08 ***
## MEI 6.419e-02 6.652e-03 9.649 < 2e-16 ***
## TSI 7.949e-02 1.487e-02 5.344 1.89e-07 ***
## Aerosols -1.702e+00 2.180e-01 -7.806 1.19e-13 ***
## N2O 2.532e-02 1.311e-03 19.307 < 2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.09547 on 279 degrees of freedom
## Multiple R-squared: 0.7261, Adjusted R-squared: 0.7222
## F-statistic: 184.9 on 4 and 279 DF, p-value: < 2.2e-16
We have many variables in this problem, and as we have seen above, dropping some from the model does not decrease model quality. R provides a function, step, that will automate the procedure of trying different combinations of variables to find a good compromise of model simplicity and R2. This trade-off is formalized by the Akaike information criterion (AIC) - it can be informally thought of as the quality of the model with a penalty for the number of variables in the model.
The step function has one argument - the name of the initial model. It returns a simplified model. Use the step function in R to derive a new model, with the full model as the initial model (HINT: If your initial full model was called “climateLM”, you could create a new model with the step function by typing step(climateLM). Be sure to save your new model to a variable name so that you can look at the summary. For more information about the step function, type ?step in your R console.)
Enter the R2 value of the model produced by the step function:
Reg_wStep <- step(RegTemp)
## Start: AIC=-1348.16
## Temp ~ MEI + CO2 + CH4 + N2O + CFC.11 + CFC.12 + TSI + Aerosols
##
## Df Sum of Sq RSS AIC
## - CH4 1 0.00049 2.3135 -1350.1
## <none> 2.3130 -1348.2
## - N2O 1 0.03132 2.3443 -1346.3
## - CO2 1 0.06719 2.3802 -1342.0
## - CFC.12 1 0.11874 2.4318 -1335.9
## - CFC.11 1 0.13986 2.4529 -1333.5
## - TSI 1 0.33516 2.6482 -1311.7
## - Aerosols 1 0.43727 2.7503 -1301.0
## - MEI 1 0.82823 3.1412 -1263.2
##
## Step: AIC=-1350.1
## Temp ~ MEI + CO2 + N2O + CFC.11 + CFC.12 + TSI + Aerosols
##
## Df Sum of Sq RSS AIC
## <none> 2.3135 -1350.1
## - N2O 1 0.03133 2.3448 -1348.3
## - CO2 1 0.06672 2.3802 -1344.0
## - CFC.12 1 0.13023 2.4437 -1336.5
## - CFC.11 1 0.13938 2.4529 -1335.5
## - TSI 1 0.33500 2.6485 -1313.7
## - Aerosols 1 0.43987 2.7534 -1302.7
## - MEI 1 0.83118 3.1447 -1264.9
summary(Reg_wStep)
##
## Call:
## lm(formula = Temp ~ MEI + CO2 + N2O + CFC.11 + CFC.12 + TSI +
## Aerosols, data = train_cc)
##
## Residuals:
## Min 1Q Median 3Q Max
## -0.25770 -0.05994 -0.00104 0.05588 0.32203
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) -1.245e+02 1.985e+01 -6.273 1.37e-09 ***
## MEI 6.407e-02 6.434e-03 9.958 < 2e-16 ***
## CO2 6.402e-03 2.269e-03 2.821 0.005129 **
## N2O -1.602e-02 8.287e-03 -1.933 0.054234 .
## CFC.11 -6.609e-03 1.621e-03 -4.078 5.95e-05 ***
## CFC.12 3.868e-03 9.812e-04 3.942 0.000103 ***
## TSI 9.312e-02 1.473e-02 6.322 1.04e-09 ***
## Aerosols -1.540e+00 2.126e-01 -7.244 4.36e-12 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.09155 on 276 degrees of freedom
## Multiple R-squared: 0.7508, Adjusted R-squared: 0.7445
## F-statistic: 118.8 on 7 and 276 DF, p-value: < 2.2e-16
We have developed an understanding of how well we can fit a linear regression to the training data, but does the model quality hold when applied to unseen data?
Using the model produced from the step function, calculate temperature predictions for the testing data set, using the predict function.
Enter the testing set R2:
Pred_new <- predict(Reg_wStep, newdata = test_cc)
SSE <- sum((test_cc$Temp - Pred_new)^2)
SST <- sum((test_cc$Temp - mean(train_cc$Temp))^2)
R_sq <- 1 - SSE/SST
R_sq
## [1] 0.6286051
The Programme for International Student Assessment (PISA) is a test given every three years to 15-year-old students from around the world to evaluate their performance in mathematics, reading, and science. This test provides a quantitative way to compare the performance of students from different parts of the world. In this homework assignment, we will predict the reading scores of students from the United States of America on the 2009 PISA exam.
The datasets pisa2009train.csv and pisa2009test.csv contain information about the demographics and schools for American students taking the exam, derived from 2009 PISA Public-Use Data Files distributed by the United States National Center for Education Statistics (NCES). While the datasets are not supposed to contain identifying information about students taking the test, by using the data you are bound by the NCES data use agreement, which prohibits any attempt to determine the identity of any student in the datasets.
Each row in the datasets pisa2009train.csv and pisa2009test.csv represents one student taking the exam. The datasets have the following variables:
grade: The grade in school of the student (most 15-year-olds in America are in 10th grade)
male: Whether the student is male (1/0)
raceeth: The race/ethnicity composite of the student
preschool: Whether the student attended preschool (1/0)
expectBachelors: Whether the student expects to obtain a bachelor’s degree (1/0)
motherHS: Whether the student’s mother completed high school (1/0)
motherBachelors: Whether the student’s mother obtained a bachelor’s degree (1/0)
motherWork: Whether the student’s mother has part-time or full-time work (1/0)
fatherHS: Whether the student’s father completed high school (1/0)
fatherBachelors: Whether the student’s father obtained a bachelor’s degree (1/0)
fatherWork: Whether the student’s father has part-time or full-time work (1/0)
selfBornUS: Whether the student was born in the United States of America (1/0)
motherBornUS: Whether the student’s mother was born in the United States of America (1/0)
fatherBornUS: Whether the student’s father was born in the United States of America (1/0)
englishAtHome: Whether the student speaks English at home (1/0)
computerForSchoolwork: Whether the student has access to a computer for schoolwork (1/0)
read30MinsADay: Whether the student reads for pleasure for 30 minutes/day (1/0)
minutesPerWeekEnglish: The number of minutes per week the student spend in English class
studentsInEnglish: The number of students in this student’s English class at school
schoolHasLibrary: Whether this student’s school has a library (1/0)
publicSchool: Whether this student attends a public school (1/0)
urban: Whether this student’s school is in an urban area (1/0)
schoolSize: The number of students in this student’s school
readingScore: The student’s reading score, on a 1000-point scale
Load the training and testing sets using the read.csv() function, and save them as variables with the names pisaTrain and pisaTest.
How many students are there in the training set?
pisaTrain <- read.csv("pisa2009train.csv")
pisaTest <- read.csv("pisa2009test.csv")
nrow(pisaTrain)
## [1] 3663
Using tapply() on pisaTrain, what is the average reading test score of males? Check this link for tapply: http://makemeanalyst.com/r-programming/loop-functions/tapply/ 1 = male 0 = female
str(pisaTrain)
## 'data.frame': 3663 obs. of 24 variables:
## $ grade : int 11 11 9 10 10 10 10 10 9 10 ...
## $ male : int 1 1 1 0 1 1 0 0 0 1 ...
## $ raceeth : chr NA "White" "White" "Black" ...
## $ preschool : int NA 0 1 1 1 1 0 1 1 1 ...
## $ expectBachelors : int 0 0 1 1 0 1 1 1 0 1 ...
## $ motherHS : int NA 1 1 0 1 NA 1 1 1 1 ...
## $ motherBachelors : int NA 1 1 0 0 NA 0 0 NA 1 ...
## $ motherWork : int 1 1 1 1 1 1 1 0 1 1 ...
## $ fatherHS : int NA 1 1 1 1 1 NA 1 0 0 ...
## $ fatherBachelors : int NA 0 NA 0 0 0 NA 0 NA 0 ...
## $ fatherWork : int 1 1 1 1 0 1 NA 1 1 1 ...
## $ selfBornUS : int 1 1 1 1 1 1 0 1 1 1 ...
## $ motherBornUS : int 0 1 1 1 1 1 1 1 1 1 ...
## $ fatherBornUS : int 0 1 1 1 0 1 NA 1 1 1 ...
## $ englishAtHome : int 0 1 1 1 1 1 1 1 1 1 ...
## $ computerForSchoolwork: int 1 1 1 1 1 1 1 1 1 1 ...
## $ read30MinsADay : int 0 1 0 1 1 0 0 1 0 0 ...
## $ minutesPerWeekEnglish: int 225 450 250 200 250 300 250 300 378 294 ...
## $ studentsInEnglish : int NA 25 28 23 35 20 28 30 20 24 ...
## $ schoolHasLibrary : int 1 1 1 1 1 1 1 1 0 1 ...
## $ publicSchool : int 1 1 1 1 1 1 1 1 1 1 ...
## $ urban : int 1 0 0 1 1 0 1 0 1 0 ...
## $ schoolSize : int 673 1173 1233 2640 1095 227 2080 1913 502 899 ...
## $ readingScore : num 476 575 555 458 614 ...
tapply(pisaTrain$readingScore, pisaTrain$male, FUN = mean)
## 0 1
## 512.9406 483.5325
summary(pisaTrain)
## grade male raceeth preschool
## Min. : 8.00 Min. :0.0000 Length:3663 Min. :0.0000
## 1st Qu.:10.00 1st Qu.:0.0000 Class :character 1st Qu.:0.0000
## Median :10.00 Median :1.0000 Mode :character Median :1.0000
## Mean :10.09 Mean :0.5111 Mean :0.7228
## 3rd Qu.:10.00 3rd Qu.:1.0000 3rd Qu.:1.0000
## Max. :12.00 Max. :1.0000 Max. :1.0000
## NA's :56
## expectBachelors motherHS motherBachelors motherWork
## Min. :0.0000 Min. :0.00 Min. :0.0000 Min. :0.0000
## 1st Qu.:1.0000 1st Qu.:1.00 1st Qu.:0.0000 1st Qu.:0.0000
## Median :1.0000 Median :1.00 Median :0.0000 Median :1.0000
## Mean :0.7859 Mean :0.88 Mean :0.3481 Mean :0.7345
## 3rd Qu.:1.0000 3rd Qu.:1.00 3rd Qu.:1.0000 3rd Qu.:1.0000
## Max. :1.0000 Max. :1.00 Max. :1.0000 Max. :1.0000
## NA's :62 NA's :97 NA's :397 NA's :93
## fatherHS fatherBachelors fatherWork selfBornUS
## Min. :0.0000 Min. :0.0000 Min. :0.0000 Min. :0.0000
## 1st Qu.:1.0000 1st Qu.:0.0000 1st Qu.:1.0000 1st Qu.:1.0000
## Median :1.0000 Median :0.0000 Median :1.0000 Median :1.0000
## Mean :0.8593 Mean :0.3319 Mean :0.8531 Mean :0.9313
## 3rd Qu.:1.0000 3rd Qu.:1.0000 3rd Qu.:1.0000 3rd Qu.:1.0000
## Max. :1.0000 Max. :1.0000 Max. :1.0000 Max. :1.0000
## NA's :245 NA's :569 NA's :233 NA's :69
## motherBornUS fatherBornUS englishAtHome computerForSchoolwork
## Min. :0.0000 Min. :0.0000 Min. :0.0000 Min. :0.0000
## 1st Qu.:1.0000 1st Qu.:1.0000 1st Qu.:1.0000 1st Qu.:1.0000
## Median :1.0000 Median :1.0000 Median :1.0000 Median :1.0000
## Mean :0.7725 Mean :0.7668 Mean :0.8717 Mean :0.8994
## 3rd Qu.:1.0000 3rd Qu.:1.0000 3rd Qu.:1.0000 3rd Qu.:1.0000
## Max. :1.0000 Max. :1.0000 Max. :1.0000 Max. :1.0000
## NA's :71 NA's :113 NA's :71 NA's :65
## read30MinsADay minutesPerWeekEnglish studentsInEnglish schoolHasLibrary
## Min. :0.0000 Min. : 0.0 Min. : 1.0 Min. :0.0000
## 1st Qu.:0.0000 1st Qu.: 225.0 1st Qu.:20.0 1st Qu.:1.0000
## Median :0.0000 Median : 250.0 Median :25.0 Median :1.0000
## Mean :0.2899 Mean : 266.2 Mean :24.5 Mean :0.9676
## 3rd Qu.:1.0000 3rd Qu.: 300.0 3rd Qu.:30.0 3rd Qu.:1.0000
## Max. :1.0000 Max. :2400.0 Max. :75.0 Max. :1.0000
## NA's :34 NA's :186 NA's :249 NA's :143
## publicSchool urban schoolSize readingScore
## Min. :0.0000 Min. :0.0000 Min. : 100 Min. :168.6
## 1st Qu.:1.0000 1st Qu.:0.0000 1st Qu.: 712 1st Qu.:431.7
## Median :1.0000 Median :0.0000 Median :1212 Median :499.7
## Mean :0.9339 Mean :0.3849 Mean :1369 Mean :497.9
## 3rd Qu.:1.0000 3rd Qu.:1.0000 3rd Qu.:1900 3rd Qu.:566.2
## Max. :1.0000 Max. :1.0000 Max. :6694 Max. :746.0
## NA's :162
Linear regression discards observations with missing data, so we will remove all such observations from the training and testing sets. Later in the course, we will learn about imputation, which deals with missing data by filling in missing values with plausible information.
Type the following commands into your R console to remove observations with any missing value from pisaTrain and pisaTest:
pisaTrain = na.omit(pisaTrain)
pisaTest = na.omit(pisaTest)
How many observations are now in the training set?
pisaTrain <- na.omit(pisaTrain)
pisaTest <- na.omit(pisaTest)
Factor variables are variables that take on a discrete set of values, like the “Region” variable in the WHO dataset from the second lecture of Unit 1. This is an unordered factor because there isn’t any natural ordering between the levels. An ordered factor has a natural ordering between the levels (an example would be the classifications “large,” “medium,” and “small”).
Which of the following variables is an unordered factor with at least 3 levels? (Select all that apply.)
str(pisaTrain)
## 'data.frame': 2414 obs. of 24 variables:
## $ grade : int 11 10 10 10 10 10 10 10 11 9 ...
## $ male : int 1 0 1 0 1 0 0 0 1 1 ...
## $ raceeth : chr "White" "Black" "Hispanic" "White" ...
## $ preschool : int 0 1 1 1 1 1 1 1 1 1 ...
## $ expectBachelors : int 0 1 0 1 1 1 1 0 1 1 ...
## $ motherHS : int 1 0 1 1 1 1 1 0 1 1 ...
## $ motherBachelors : int 1 0 0 0 1 0 0 0 0 1 ...
## $ motherWork : int 1 1 1 0 1 1 1 0 0 1 ...
## $ fatherHS : int 1 1 1 1 0 1 1 0 1 1 ...
## $ fatherBachelors : int 0 0 0 0 0 0 1 0 1 1 ...
## $ fatherWork : int 1 1 0 1 1 0 1 1 1 1 ...
## $ selfBornUS : int 1 1 1 1 1 0 1 0 1 1 ...
## $ motherBornUS : int 1 1 1 1 1 0 1 0 1 1 ...
## $ fatherBornUS : int 1 1 0 1 1 0 1 0 1 1 ...
## $ englishAtHome : int 1 1 1 1 1 0 1 0 1 1 ...
## $ computerForSchoolwork: int 1 1 1 1 1 0 1 1 1 1 ...
## $ read30MinsADay : int 1 1 1 1 0 1 1 1 0 0 ...
## $ minutesPerWeekEnglish: int 450 200 250 300 294 232 225 270 275 225 ...
## $ studentsInEnglish : int 25 23 35 30 24 14 20 25 30 15 ...
## $ schoolHasLibrary : int 1 1 1 1 1 1 1 1 1 1 ...
## $ publicSchool : int 1 1 1 1 1 1 1 1 1 0 ...
## $ urban : int 0 1 1 0 0 0 0 1 1 1 ...
## $ schoolSize : int 1173 2640 1095 1913 899 1733 149 1400 1988 915 ...
## $ readingScore : num 575 458 614 439 466 ...
## - attr(*, "na.action")= 'omit' Named int [1:1249] 1 3 6 7 9 11 13 21 29 30 ...
## ..- attr(*, "names")= chr [1:1249] "1" "3" "6" "7" ...
pisaTrain$raceeth <- as.factor(pisaTrain$raceeth)
class(pisaTrain$raceeth)
## [1] "factor"
str(pisaTrain)
## 'data.frame': 2414 obs. of 24 variables:
## $ grade : int 11 10 10 10 10 10 10 10 11 9 ...
## $ male : int 1 0 1 0 1 0 0 0 1 1 ...
## $ raceeth : Factor w/ 7 levels "American Indian/Alaska Native",..: 7 3 4 7 5 4 7 4 7 7 ...
## $ preschool : int 0 1 1 1 1 1 1 1 1 1 ...
## $ expectBachelors : int 0 1 0 1 1 1 1 0 1 1 ...
## $ motherHS : int 1 0 1 1 1 1 1 0 1 1 ...
## $ motherBachelors : int 1 0 0 0 1 0 0 0 0 1 ...
## $ motherWork : int 1 1 1 0 1 1 1 0 0 1 ...
## $ fatherHS : int 1 1 1 1 0 1 1 0 1 1 ...
## $ fatherBachelors : int 0 0 0 0 0 0 1 0 1 1 ...
## $ fatherWork : int 1 1 0 1 1 0 1 1 1 1 ...
## $ selfBornUS : int 1 1 1 1 1 0 1 0 1 1 ...
## $ motherBornUS : int 1 1 1 1 1 0 1 0 1 1 ...
## $ fatherBornUS : int 1 1 0 1 1 0 1 0 1 1 ...
## $ englishAtHome : int 1 1 1 1 1 0 1 0 1 1 ...
## $ computerForSchoolwork: int 1 1 1 1 1 0 1 1 1 1 ...
## $ read30MinsADay : int 1 1 1 1 0 1 1 1 0 0 ...
## $ minutesPerWeekEnglish: int 450 200 250 300 294 232 225 270 275 225 ...
## $ studentsInEnglish : int 25 23 35 30 24 14 20 25 30 15 ...
## $ schoolHasLibrary : int 1 1 1 1 1 1 1 1 1 1 ...
## $ publicSchool : int 1 1 1 1 1 1 1 1 1 0 ...
## $ urban : int 0 1 1 0 0 0 0 1 1 1 ...
## $ schoolSize : int 1173 2640 1095 1913 899 1733 149 1400 1988 915 ...
## $ readingScore : num 575 458 614 439 466 ...
## - attr(*, "na.action")= 'omit' Named int [1:1249] 1 3 6 7 9 11 13 21 29 30 ...
## ..- attr(*, "names")= chr [1:1249] "1" "3" "6" "7" ...
summary(pisaTrain)
## grade male raceeth
## Min. : 8.00 Min. :0.0000 American Indian/Alaska Native : 20
## 1st Qu.:10.00 1st Qu.:0.0000 Asian : 95
## Median :10.00 Median :1.0000 Black : 228
## Mean :10.13 Mean :0.5012 Hispanic : 500
## 3rd Qu.:10.00 3rd Qu.:1.0000 More than one race : 81
## Max. :12.00 Max. :1.0000 Native Hawaiian/Other Pacific Islander: 20
## White :1470
## preschool expectBachelors motherHS motherBachelors
## Min. :0.0000 Min. :0.0000 Min. :0.000 Min. :0.0000
## 1st Qu.:0.0000 1st Qu.:1.0000 1st Qu.:1.000 1st Qu.:0.0000
## Median :1.0000 Median :1.0000 Median :1.000 Median :0.0000
## Mean :0.7274 Mean :0.8343 Mean :0.896 Mean :0.3637
## 3rd Qu.:1.0000 3rd Qu.:1.0000 3rd Qu.:1.000 3rd Qu.:1.0000
## Max. :1.0000 Max. :1.0000 Max. :1.000 Max. :1.0000
##
## motherWork fatherHS fatherBachelors fatherWork
## Min. :0.0000 Min. :0.0000 Min. :0.0000 Min. :0.0000
## 1st Qu.:0.0000 1st Qu.:1.0000 1st Qu.:0.0000 1st Qu.:1.0000
## Median :1.0000 Median :1.0000 Median :0.0000 Median :1.0000
## Mean :0.7357 Mean :0.8741 Mean :0.3484 Mean :0.8571
## 3rd Qu.:1.0000 3rd Qu.:1.0000 3rd Qu.:1.0000 3rd Qu.:1.0000
## Max. :1.0000 Max. :1.0000 Max. :1.0000 Max. :1.0000
##
## selfBornUS motherBornUS fatherBornUS englishAtHome
## Min. :0.0000 Min. :0.00 Min. :0.0000 Min. :0.0000
## 1st Qu.:1.0000 1st Qu.:1.00 1st Qu.:1.0000 1st Qu.:1.0000
## Median :1.0000 Median :1.00 Median :1.0000 Median :1.0000
## Mean :0.9362 Mean :0.79 Mean :0.7854 Mean :0.8815
## 3rd Qu.:1.0000 3rd Qu.:1.00 3rd Qu.:1.0000 3rd Qu.:1.0000
## Max. :1.0000 Max. :1.00 Max. :1.0000 Max. :1.0000
##
## computerForSchoolwork read30MinsADay minutesPerWeekEnglish studentsInEnglish
## Min. :0.0000 Min. :0.0000 Min. : 0.0 Min. : 1.00
## 1st Qu.:1.0000 1st Qu.:0.0000 1st Qu.: 225.0 1st Qu.:20.00
## Median :1.0000 Median :0.0000 Median : 250.0 Median :25.00
## Mean :0.9155 Mean :0.3016 Mean : 269.8 Mean :24.56
## 3rd Qu.:1.0000 3rd Qu.:1.0000 3rd Qu.: 300.0 3rd Qu.:30.00
## Max. :1.0000 Max. :1.0000 Max. :1680.0 Max. :75.00
##
## schoolHasLibrary publicSchool urban schoolSize
## Min. :0.0000 Min. :0.0000 Min. :0.0000 Min. : 100
## 1st Qu.:1.0000 1st Qu.:1.0000 1st Qu.:0.0000 1st Qu.: 712
## Median :1.0000 Median :1.0000 Median :0.0000 Median :1233
## Mean :0.9714 Mean :0.9176 Mean :0.3629 Mean :1372
## 3rd Qu.:1.0000 3rd Qu.:1.0000 3rd Qu.:1.0000 3rd Qu.:1900
## Max. :1.0000 Max. :1.0000 Max. :1.0000 Max. :6694
##
## readingScore
## Min. :244.5
## 1st Qu.:455.8
## Median :520.2
## Mean :518.0
## 3rd Qu.:581.4
## Max. :746.0
##
To include unordered factors in a linear regression model, we define one level as the “reference level” and add a binary variable for each of the remaining levels. In this way, a factor with n levels is replaced by n-1 binary variables. The reference level is typically selected to be the most frequently occurring level in the dataset.
As an example, consider the unordered factor variable “color”, with levels “red”, “green”, and “blue”. If “green” were the reference level, then we would add binary variables “colorred” and “colorblue” to a linear regression problem. All red examples would have colorred=1 and colorblue=0. All blue examples would have colorred=0 and colorblue=1. All green examples would have colorred=0 and colorblue=0.
Now, consider the variable “raceeth” in our problem, which has levels “American Indian/Alaska Native”, “Asian”, “Black”, “Hispanic”, “More than one race”, “Native Hawaiian/Other Pacific Islander”, and “White”. Because it is the most common in our population, we will select White as the reference level.
Which binary variables will be included in the regression model? (Select all that apply.)
Because the race variable takes on text values, it was loaded as a factor variable when we read in the dataset with read.csv() – you can see this when you run str(pisaTrain) or str(pisaTest). However, by default R selects the first level alphabetically (“American Indian/Alaska Native”) as the reference level of our factor instead of the most common level (“White”). Set the reference level of the factor by typing the following two lines in your R console:
pisaTrain\(raceeth = relevel(pisaTrain\)raceeth, “White”)
pisaTest\(raceeth = relevel(pisaTest\)raceeth, “White”)
Now, build a linear regression model (call it lmScore) using the training set to predict readingScore using all the remaining variables.
It would be time-consuming to type all the variables, but R provides the shorthand notation “readingScore ~ .” to mean “predict readingScore using all the other variables in the data frame.” The period is used to replace listing out all of the independent variables. As an example, if your dependent variable is called “Y”, your independent variables are called “X1”, “X2”, and “X3”, and your training data set is called “Train”, instead of the regular notation:
LinReg = lm(Y ~ X1 + X2 + X3, data = Train)
You would use the following command to build your model:
LinReg = lm(Y ~ ., data = Train)
What is the Multiple R-squared value of lmScore on the training set?
# RELEVELLING
str(pisaTrain)
## 'data.frame': 2414 obs. of 24 variables:
## $ grade : int 11 10 10 10 10 10 10 10 11 9 ...
## $ male : int 1 0 1 0 1 0 0 0 1 1 ...
## $ raceeth : Factor w/ 7 levels "American Indian/Alaska Native",..: 7 3 4 7 5 4 7 4 7 7 ...
## $ preschool : int 0 1 1 1 1 1 1 1 1 1 ...
## $ expectBachelors : int 0 1 0 1 1 1 1 0 1 1 ...
## $ motherHS : int 1 0 1 1 1 1 1 0 1 1 ...
## $ motherBachelors : int 1 0 0 0 1 0 0 0 0 1 ...
## $ motherWork : int 1 1 1 0 1 1 1 0 0 1 ...
## $ fatherHS : int 1 1 1 1 0 1 1 0 1 1 ...
## $ fatherBachelors : int 0 0 0 0 0 0 1 0 1 1 ...
## $ fatherWork : int 1 1 0 1 1 0 1 1 1 1 ...
## $ selfBornUS : int 1 1 1 1 1 0 1 0 1 1 ...
## $ motherBornUS : int 1 1 1 1 1 0 1 0 1 1 ...
## $ fatherBornUS : int 1 1 0 1 1 0 1 0 1 1 ...
## $ englishAtHome : int 1 1 1 1 1 0 1 0 1 1 ...
## $ computerForSchoolwork: int 1 1 1 1 1 0 1 1 1 1 ...
## $ read30MinsADay : int 1 1 1 1 0 1 1 1 0 0 ...
## $ minutesPerWeekEnglish: int 450 200 250 300 294 232 225 270 275 225 ...
## $ studentsInEnglish : int 25 23 35 30 24 14 20 25 30 15 ...
## $ schoolHasLibrary : int 1 1 1 1 1 1 1 1 1 1 ...
## $ publicSchool : int 1 1 1 1 1 1 1 1 1 0 ...
## $ urban : int 0 1 1 0 0 0 0 1 1 1 ...
## $ schoolSize : int 1173 2640 1095 1913 899 1733 149 1400 1988 915 ...
## $ readingScore : num 575 458 614 439 466 ...
## - attr(*, "na.action")= 'omit' Named int [1:1249] 1 3 6 7 9 11 13 21 29 30 ...
## ..- attr(*, "names")= chr [1:1249] "1" "3" "6" "7" ...
str(pisaTest)
## 'data.frame': 990 obs. of 24 variables:
## $ grade : int 10 10 10 10 11 10 10 10 10 10 ...
## $ male : int 0 0 0 0 0 1 0 1 1 0 ...
## $ raceeth : chr "White" "White" "American Indian/Alaska Native" "White" ...
## $ preschool : int 1 1 1 1 0 1 0 1 1 1 ...
## $ expectBachelors : int 0 1 0 0 0 1 1 0 1 1 ...
## $ motherHS : int 1 1 1 1 1 1 1 1 1 1 ...
## $ motherBachelors : int 1 0 0 0 1 1 0 0 1 0 ...
## $ motherWork : int 1 0 0 1 1 1 0 1 1 1 ...
## $ fatherHS : int 1 1 1 1 1 1 1 1 1 1 ...
## $ fatherBachelors : int 0 1 0 0 1 0 0 0 1 1 ...
## $ fatherWork : int 0 1 0 1 1 1 1 0 1 1 ...
## $ selfBornUS : int 1 1 1 1 1 1 1 1 1 1 ...
## $ motherBornUS : int 1 1 1 1 1 1 1 1 1 1 ...
## $ fatherBornUS : int 1 1 1 1 1 1 1 1 1 1 ...
## $ englishAtHome : int 1 1 1 1 1 1 1 1 1 1 ...
## $ computerForSchoolwork: int 1 1 1 1 1 1 1 1 1 1 ...
## $ read30MinsADay : int 0 0 1 1 1 1 0 0 0 1 ...
## $ minutesPerWeekEnglish: int 240 240 240 270 270 350 350 360 350 360 ...
## $ studentsInEnglish : int 30 30 30 35 30 25 27 28 25 27 ...
## $ schoolHasLibrary : int 1 1 1 1 1 1 1 1 1 1 ...
## $ publicSchool : int 1 1 1 1 1 1 1 1 1 1 ...
## $ urban : int 0 0 0 0 0 0 0 0 0 0 ...
## $ schoolSize : int 808 808 808 808 808 899 899 899 899 899 ...
## $ readingScore : num 355 454 405 665 605 ...
## - attr(*, "na.action")= 'omit' Named int [1:580] 2 3 4 6 12 16 17 19 22 23 ...
## ..- attr(*, "names")= chr [1:580] "2" "3" "4" "6" ...
pisaTrain$raceeth <- relevel(pisaTrain$raceeth, "White")
pisaTest$raceeth <- as.factor(pisaTest$raceeth)
pisaTest$raceeth <- relevel(pisaTest$raceeth, "White")
# https://stackoverflow.com/questions/33180058/coerce-multiple-columns-to-factors-at-once
str(pisaTrain)
## 'data.frame': 2414 obs. of 24 variables:
## $ grade : int 11 10 10 10 10 10 10 10 11 9 ...
## $ male : int 1 0 1 0 1 0 0 0 1 1 ...
## $ raceeth : Factor w/ 7 levels "White","American Indian/Alaska Native",..: 1 4 5 1 6 5 1 5 1 1 ...
## $ preschool : int 0 1 1 1 1 1 1 1 1 1 ...
## $ expectBachelors : int 0 1 0 1 1 1 1 0 1 1 ...
## $ motherHS : int 1 0 1 1 1 1 1 0 1 1 ...
## $ motherBachelors : int 1 0 0 0 1 0 0 0 0 1 ...
## $ motherWork : int 1 1 1 0 1 1 1 0 0 1 ...
## $ fatherHS : int 1 1 1 1 0 1 1 0 1 1 ...
## $ fatherBachelors : int 0 0 0 0 0 0 1 0 1 1 ...
## $ fatherWork : int 1 1 0 1 1 0 1 1 1 1 ...
## $ selfBornUS : int 1 1 1 1 1 0 1 0 1 1 ...
## $ motherBornUS : int 1 1 1 1 1 0 1 0 1 1 ...
## $ fatherBornUS : int 1 1 0 1 1 0 1 0 1 1 ...
## $ englishAtHome : int 1 1 1 1 1 0 1 0 1 1 ...
## $ computerForSchoolwork: int 1 1 1 1 1 0 1 1 1 1 ...
## $ read30MinsADay : int 1 1 1 1 0 1 1 1 0 0 ...
## $ minutesPerWeekEnglish: int 450 200 250 300 294 232 225 270 275 225 ...
## $ studentsInEnglish : int 25 23 35 30 24 14 20 25 30 15 ...
## $ schoolHasLibrary : int 1 1 1 1 1 1 1 1 1 1 ...
## $ publicSchool : int 1 1 1 1 1 1 1 1 1 0 ...
## $ urban : int 0 1 1 0 0 0 0 1 1 1 ...
## $ schoolSize : int 1173 2640 1095 1913 899 1733 149 1400 1988 915 ...
## $ readingScore : num 575 458 614 439 466 ...
## - attr(*, "na.action")= 'omit' Named int [1:1249] 1 3 6 7 9 11 13 21 29 30 ...
## ..- attr(*, "names")= chr [1:1249] "1" "3" "6" "7" ...
str(pisaTest)
## 'data.frame': 990 obs. of 24 variables:
## $ grade : int 10 10 10 10 11 10 10 10 10 10 ...
## $ male : int 0 0 0 0 0 1 0 1 1 0 ...
## $ raceeth : Factor w/ 7 levels "White","American Indian/Alaska Native",..: 1 1 2 1 1 5 1 5 1 5 ...
## $ preschool : int 1 1 1 1 0 1 0 1 1 1 ...
## $ expectBachelors : int 0 1 0 0 0 1 1 0 1 1 ...
## $ motherHS : int 1 1 1 1 1 1 1 1 1 1 ...
## $ motherBachelors : int 1 0 0 0 1 1 0 0 1 0 ...
## $ motherWork : int 1 0 0 1 1 1 0 1 1 1 ...
## $ fatherHS : int 1 1 1 1 1 1 1 1 1 1 ...
## $ fatherBachelors : int 0 1 0 0 1 0 0 0 1 1 ...
## $ fatherWork : int 0 1 0 1 1 1 1 0 1 1 ...
## $ selfBornUS : int 1 1 1 1 1 1 1 1 1 1 ...
## $ motherBornUS : int 1 1 1 1 1 1 1 1 1 1 ...
## $ fatherBornUS : int 1 1 1 1 1 1 1 1 1 1 ...
## $ englishAtHome : int 1 1 1 1 1 1 1 1 1 1 ...
## $ computerForSchoolwork: int 1 1 1 1 1 1 1 1 1 1 ...
## $ read30MinsADay : int 0 0 1 1 1 1 0 0 0 1 ...
## $ minutesPerWeekEnglish: int 240 240 240 270 270 350 350 360 350 360 ...
## $ studentsInEnglish : int 30 30 30 35 30 25 27 28 25 27 ...
## $ schoolHasLibrary : int 1 1 1 1 1 1 1 1 1 1 ...
## $ publicSchool : int 1 1 1 1 1 1 1 1 1 1 ...
## $ urban : int 0 0 0 0 0 0 0 0 0 0 ...
## $ schoolSize : int 808 808 808 808 808 899 899 899 899 899 ...
## $ readingScore : num 355 454 405 665 605 ...
## - attr(*, "na.action")= 'omit' Named int [1:580] 2 3 4 6 12 16 17 19 22 23 ...
## ..- attr(*, "names")= chr [1:580] "2" "3" "4" "6" ...
lmScore <- lm(readingScore ~ ., data = pisaTrain)
summary(lmScore)
##
## Call:
## lm(formula = readingScore ~ ., data = pisaTrain)
##
## Residuals:
## Min 1Q Median 3Q Max
## -247.44 -48.86 1.86 49.77 217.18
##
## Coefficients:
## Estimate Std. Error t value
## (Intercept) 143.766333 33.841226 4.248
## grade 29.542707 2.937399 10.057
## male -14.521653 3.155926 -4.601
## raceethAmerican Indian/Alaska Native -67.277327 16.786935 -4.008
## raceethAsian -4.110325 9.220071 -0.446
## raceethBlack -67.012347 5.460883 -12.271
## raceethHispanic -38.975486 5.177743 -7.528
## raceethMore than one race -16.922522 8.496268 -1.992
## raceethNative Hawaiian/Other Pacific Islander -5.101601 17.005696 -0.300
## preschool -4.463670 3.486055 -1.280
## expectBachelors 55.267080 4.293893 12.871
## motherHS 6.058774 6.091423 0.995
## motherBachelors 12.638068 3.861457 3.273
## motherWork -2.809101 3.521827 -0.798
## fatherHS 4.018214 5.579269 0.720
## fatherBachelors 16.929755 3.995253 4.237
## fatherWork 5.842798 4.395978 1.329
## selfBornUS -3.806278 7.323718 -0.520
## motherBornUS -8.798153 6.587621 -1.336
## fatherBornUS 4.306994 6.263875 0.688
## englishAtHome 8.035685 6.859492 1.171
## computerForSchoolwork 22.500232 5.702562 3.946
## read30MinsADay 34.871924 3.408447 10.231
## minutesPerWeekEnglish 0.012788 0.010712 1.194
## studentsInEnglish -0.286631 0.227819 -1.258
## schoolHasLibrary 12.215085 9.264884 1.318
## publicSchool -16.857475 6.725614 -2.506
## urban -0.110132 3.962724 -0.028
## schoolSize 0.006540 0.002197 2.977
## Pr(>|t|)
## (Intercept) 2.24e-05 ***
## grade < 2e-16 ***
## male 4.42e-06 ***
## raceethAmerican Indian/Alaska Native 6.32e-05 ***
## raceethAsian 0.65578
## raceethBlack < 2e-16 ***
## raceethHispanic 7.29e-14 ***
## raceethMore than one race 0.04651 *
## raceethNative Hawaiian/Other Pacific Islander 0.76421
## preschool 0.20052
## expectBachelors < 2e-16 ***
## motherHS 0.32001
## motherBachelors 0.00108 **
## motherWork 0.42517
## fatherHS 0.47147
## fatherBachelors 2.35e-05 ***
## fatherWork 0.18393
## selfBornUS 0.60331
## motherBornUS 0.18182
## fatherBornUS 0.49178
## englishAtHome 0.24153
## computerForSchoolwork 8.19e-05 ***
## read30MinsADay < 2e-16 ***
## minutesPerWeekEnglish 0.23264
## studentsInEnglish 0.20846
## schoolHasLibrary 0.18749
## publicSchool 0.01226 *
## urban 0.97783
## schoolSize 0.00294 **
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 73.81 on 2385 degrees of freedom
## Multiple R-squared: 0.3251, Adjusted R-squared: 0.3172
## F-statistic: 41.04 on 28 and 2385 DF, p-value: < 2.2e-16
What is the training-set root-mean squared error (RMSE) of lmScore?
lmScore$residuals
## 2 4 5 8 10
## 36.31787211 -37.51605148 177.01187510 -126.77397761 -34.73973862
## 12 14 15 16 17
## 10.14267892 75.75542736 -55.89032825 -139.54354721 119.08321504
## 18 19 20 22 23
## -66.52046990 8.63531209 -7.52648922 23.72319105 0.88260460
## 24 25 26 27 28
## -45.62525483 -89.85661256 83.02843828 -12.84770452 84.60587680
## 32 34 35 37 38
## 65.36940977 -6.22218538 48.70770989 18.48115249 -3.31047121
## 39 41 43 45 46
## 21.54795278 -55.07285292 69.24180067 -79.50116079 7.75050492
## 47 48 50 51 52
## 34.51041974 55.29728852 44.12451045 -13.36162980 -93.33446606
## 55 56 60 63 64
## 96.87617639 58.06520171 66.43937544 -70.43294270 -53.36549600
## 65 67 69 72 73
## -3.48872921 12.76479675 -159.90209303 -35.72666271 31.57567105
## 74 75 76 78 81
## 54.14567243 104.09705410 8.28203027 161.94638490 -38.50267426
## 84 86 87 88 91
## -8.66301052 -22.54719902 112.47110662 16.84323140 68.90706521
## 92 93 94 96 98
## 5.13611687 52.64118258 -10.06174103 -49.15935243 -32.52087660
## 100 101 105 106 108
## 83.66733324 -146.53431499 -241.48934454 4.91665336 56.46303695
## 109 113 115 116 117
## 40.58349883 -37.24847895 17.58375527 19.31208030 -113.94238772
## 118 120 121 123 125
## -8.88604135 25.97646262 -172.40067398 98.76252815 -5.46353639
## 126 128 130 131 132
## 84.09914317 -210.82840924 -105.66286358 132.71596754 -15.84528431
## 133 134 136 137 140
## -0.23922325 114.98960886 -178.47462583 52.65959796 -64.38743490
## 141 143 145 147 151
## 70.78621159 167.43499719 -91.33335425 -5.91170689 48.89601250
## 152 153 154 157 158
## -17.11816783 -18.96561327 -46.16953240 34.15578771 54.83678483
## 162 164 165 166 167
## 103.09449182 -29.81480533 10.25581508 -6.93385782 -39.38314279
## 168 169 170 171 173
## -60.23855093 23.26897510 128.39675660 -51.56540430 96.80204021
## 174 176 177 179 180
## -15.18885989 110.59582985 -44.65614946 36.74500074 -2.66591281
## 183 185 186 187 189
## -63.61779935 53.50476370 57.85049005 59.37974111 -105.53628053
## 191 192 196 197 198
## 99.41543189 21.71580603 -36.73259710 146.44040987 48.15744117
## 199 200 201 203 205
## 71.77257697 63.99002031 20.44220312 -41.50839875 5.53275280
## 206 207 208 209 210
## -16.40252659 46.58617286 49.72800711 72.61397207 -115.84887642
## 211 212 213 214 215
## 12.08885624 -70.55087514 -117.48568915 -84.36491766 -102.05302174
## 216 217 218 219 221
## 104.04202915 61.13522649 -40.07783292 -94.75282613 22.93543839
## 222 223 224 225 226
## 20.82751419 -78.89645624 -11.91477756 27.21161063 -5.22077464
## 228 229 230 232 234
## 3.88042725 -5.97244869 -7.89630439 131.80517545 -0.15651413
## 238 239 240 241 242
## -2.15171386 64.57694025 48.69876527 -161.94749820 -2.69858880
## 243 246 247 248 251
## 104.85689846 117.51589027 44.83397030 32.28799238 10.86060865
## 252 253 254 257 258
## -46.07642644 4.62171923 68.45015961 24.15220085 -206.54553019
## 259 260 262 265 266
## 5.16563130 -7.03015107 -4.86134293 -43.62964637 -56.22659202
## 268 269 270 271 273
## -185.00046642 181.40886306 -169.98059658 -12.52240177 17.00012757
## 275 276 277 278 280
## -58.30516981 -20.60534708 -56.08922505 -72.97686849 77.86558472
## 281 282 283 284 285
## -30.83528337 -99.16004608 40.78722846 -8.11743705 -33.51096294
## 286 288 289 291 292
## -20.11294587 51.04820974 198.11285741 -64.60004095 121.23464326
## 293 294 295 296 297
## 91.17824732 -30.04303693 -88.47977316 -126.99286783 41.47434080
## 298 299 301 302 303
## 31.90595983 119.30487969 30.85423324 151.35591415 14.32081854
## 305 308 312 313 315
## -123.10230609 -67.58786172 -33.95346598 7.27829704 12.63180748
## 319 321 322 325 327
## -60.99938804 75.00158344 -49.22303388 25.48693868 -63.19726081
## 328 329 330 333 335
## 41.55264981 11.51201901 -158.31810830 10.72047890 -75.63750718
## 337 339 340 341 342
## -66.70511899 115.74211886 -27.45700658 95.06555909 80.74264358
## 343 344 345 347 348
## -23.92279318 -82.41413942 2.55723957 -10.13887695 -21.23597431
## 349 350 351 353 354
## 15.43692523 -55.61705939 -50.82242797 -21.01844889 -16.96045847
## 355 356 358 363 366
## 151.58619932 16.31255619 130.85575629 -32.52943092 -30.25800763
## 367 368 369 370 375
## -33.39223977 -27.22553462 -72.06448412 5.85837006 133.65672848
## 376 379 383 386 387
## -13.42550032 30.24790027 36.60859827 18.58508451 -86.85900574
## 389 390 392 393 395
## 52.58823229 39.16618221 124.62458310 75.92767981 31.83691965
## 396 397 398 399 400
## 88.47701063 -113.38358367 67.09618588 -27.14416007 -113.51944602
## 403 404 405 407 408
## 168.58384505 16.85439908 99.29272741 -10.00289021 165.89791685
## 409 411 412 413 416
## 46.71335791 -50.99941091 -37.05676255 54.21406591 -3.30544639
## 418 419 421 423 424
## 41.19076020 -78.32231517 7.14111113 -80.82197293 2.66612334
## 425 426 427 428 431
## 21.55935394 -118.41546642 78.46411310 59.74757277 21.57915589
## 432 433 434 435 436
## -165.32273716 131.02678718 -39.83912026 20.64022158 -0.13674453
## 438 439 443 444 445
## 85.35453697 11.07076419 93.22164419 -30.11185615 43.70636921
## 446 447 448 449 452
## -12.72781591 -126.97543026 26.51014581 23.22077808 61.03806470
## 453 454 455 457 458
## -13.63343340 -70.61356114 29.27218719 19.20161639 3.60367353
## 459 461 462 463 464
## -62.29082338 5.21128297 -39.65706262 -20.90843407 82.17748637
## 465 466 467 469 470
## 22.52826968 38.93478890 -93.29352567 23.76664599 -41.31253300
## 474 475 476 478 479
## 29.70960038 125.76420536 -63.48949768 34.44634498 -25.40178019
## 481 482 483 485 486
## 59.15622035 -16.88642487 118.51626665 -68.11912613 62.44471870
## 487 488 489 491 493
## 109.22039792 -172.58851612 -16.73079663 27.37296601 -21.97287197
## 494 496 498 500 501
## 21.08406726 -13.71048238 94.33458074 73.81514050 -18.93671564
## 502 504 506 507 508
## -152.86085311 31.05221852 47.39558011 -59.60838642 -76.29150805
## 510 511 512 517 518
## 17.50178209 -27.43596883 -60.57104012 -166.43372210 -57.24232083
## 519 521 524 525 526
## -116.49500057 24.92419042 10.06699354 40.53386084 -19.68627186
## 530 531 532 534 536
## 67.56799094 152.82850981 0.89489809 -94.11119133 36.89681625
## 537 540 541 543 544
## -35.98195163 -115.60724146 0.20210622 23.35474138 -48.87174604
## 545 546 547 548 549
## 19.96912779 23.96163298 66.49065496 32.56051636 -9.72132600
## 550 552 554 555 556
## -129.10609851 -1.39372632 58.47489771 14.40994318 99.67624266
## 557 558 559 561 562
## 149.97717920 -45.86713150 142.60286673 58.52247155 -80.01646666
## 563 564 565 566 567
## 17.52715072 -27.20846079 -26.46207047 14.60388632 48.33068167
## 569 571 572 574 575
## -76.20562093 -29.69490773 -119.90601619 0.10457971 50.11671945
## 576 579 580 582 587
## -44.78434284 20.66737591 45.47362107 3.95806635 37.07351642
## 589 590 592 593 594
## -5.40396185 -119.44990748 -99.81781711 -75.36791811 -42.76076949
## 597 599 600 601 602
## 51.53458985 47.42856062 -50.60243893 105.66662253 -15.18253926
## 603 604 605 607 608
## -3.78389326 -8.52299669 -131.31217222 -17.31328815 -51.21514185
## 612 615 616 617 621
## -92.51079042 -125.07549425 -50.59382922 51.27872325 -54.12715803
## 622 623 625 626 627
## -57.78279472 86.93285080 -105.20826215 62.79579233 -31.43112759
## 628 629 633 634 635
## 120.15112497 5.14200131 -34.39397345 14.47960676 37.18634559
## 637 638 640 644 645
## -82.74569906 -160.46927997 -6.94487755 -11.12093705 -42.68022834
## 646 647 648 650 651
## 74.57449983 -21.92423887 -28.29176088 36.14612045 -39.32356863
## 652 655 656 658 659
## 58.22965251 -53.86099072 -63.16834601 6.32355279 -56.07909516
## 661 662 664 669 671
## 102.55009028 117.23300618 75.85492351 -110.45723532 -1.27601566
## 672 673 674 675 676
## -88.80476583 -80.39648897 4.90177901 -52.85484241 -0.59186978
## 677 678 679 683 685
## -25.49219547 -148.91490797 104.53269604 133.23702825 101.25960447
## 686 687 688 690 691
## 46.50303234 23.34009009 125.28066905 -19.53479975 -52.79990571
## 692 693 696 697 698
## 40.69600837 -10.96791718 44.42569685 97.91429948 70.82130403
## 701 702 703 705 706
## 57.07133501 -17.67384468 -29.90513927 11.62398055 -96.31738843
## 707 708 709 710 711
## -21.83819073 -49.11758685 -22.81172496 106.94528681 -28.99383391
## 713 716 718 721 722
## -186.62795544 -130.70421997 -46.65671538 25.17226119 -167.08604468
## 724 725 726 727 728
## 10.11726317 113.64416257 -78.59313096 66.29640856 -6.63437783
## 730 731 732 734 735
## -38.83736710 -26.78838876 -38.25261691 -75.77579694 85.73088367
## 736 738 739 740 741
## 8.87409239 43.49269767 99.70375594 -16.18747933 70.97910121
## 742 744 745 747 749
## 114.75210892 -31.88860726 44.71518490 -34.12028923 11.88172385
## 750 751 752 753 755
## -19.37215506 2.66968510 107.10241787 1.14647415 7.71757359
## 756 757 758 759 760
## -107.87754041 -113.80326067 41.96318811 81.44587001 34.78988622
## 761 763 764 765 767
## 95.07428644 71.34019304 19.55796351 172.70343581 84.38032356
## 768 769 770 771 772
## -29.68187219 -31.09271886 -84.87625817 -14.83763318 191.09120883
## 773 774 775 778 779
## 42.86620493 130.66903539 -4.64114889 -61.92353900 -64.26665823
## 785 786 788 789 790
## -42.46655112 -38.23384154 -9.90907315 38.16719009 1.63353211
## 792 794 797 798 799
## -29.77057415 -70.72310074 -13.38060038 34.72569012 -85.14674773
## 801 802 803 805 806
## 0.84946547 15.58802747 -56.75740580 44.37281283 14.84856397
## 807 808 809 811 813
## -121.59068851 67.31197287 -19.71728091 -61.31131349 75.66540588
## 815 817 818 819 820
## -11.89230389 -4.25822254 0.36848090 47.34313247 1.70520499
## 823 824 825 826 828
## 8.56982203 -19.35189986 -22.82712861 26.70595759 67.29147063
## 830 831 835 836 839
## -16.10950145 -38.20539380 31.79280022 61.90552792 -123.88649728
## 841 842 843 844 845
## -29.29067902 44.74326926 21.29859506 -31.37777299 114.59894979
## 846 847 849 850 853
## 19.20043539 -79.00483446 -16.88553984 47.69966214 0.18686818
## 855 856 858 859 862
## 185.31782893 -33.64718902 -9.08592538 -121.71426653 124.98611438
## 863 864 866 867 868
## 98.88806891 -47.89726190 19.25044546 24.51994824 -57.74773858
## 869 871 872 874 875
## -37.26549894 -78.69933414 40.50752779 -54.14861478 55.38529315
## 878 880 885 886 887
## -32.55468472 4.34633891 -47.28644846 -1.14981348 -2.81909509
## 890 893 894 895 896
## -105.14539684 2.51278442 15.04865182 -25.04896029 -72.70514304
## 897 898 899 901 903
## -45.19433730 11.67551342 102.62431404 -122.16693233 51.22825666
## 905 906 907 908 910
## 46.78908576 117.74715193 -111.14410937 43.92360067 -42.14949758
## 911 913 914 915 916
## 105.92045191 -60.40019514 -56.64145487 -80.65832369 116.57507762
## 917 918 919 921 923
## 107.40574892 -19.49250832 -77.24476720 -84.55196530 -114.44844857
## 924 925 927 928 929
## 37.31654122 -90.40379677 -3.32977808 18.49989390 -52.40497380
## 930 932 933 935 936
## 22.09188443 -76.63213185 -123.87313117 30.25154014 40.37117700
## 937 938 940 941 942
## -8.20621163 49.78252616 11.94033951 77.11998124 63.81697925
## 947 951 952 953 954
## 91.81283808 18.82096447 -14.78948355 -30.44412259 151.41540930
## 955 956 957 959 960
## 176.63210955 -0.57903653 73.14837829 10.59480869 -107.95053786
## 963 964 965 967 968
## -53.88793804 106.33210561 -107.45068872 94.57317905 -47.69581429
## 970 971 973 974 977
## 9.73335197 -51.78760372 -4.15899437 -82.93363395 -71.14314629
## 978 980 981 982 984
## -29.91488167 85.53817014 115.84868364 86.35934886 3.86637897
## 986 987 989 990 992
## -5.91903508 -52.11301618 43.16588335 -64.16613041 -23.16281099
## 994 995 996 997 999
## 117.29792574 121.71090365 37.91865240 -18.25247388 24.68852348
## 1003 1004 1006 1007 1008
## 31.57499580 -28.50986477 23.95068006 26.50422806 -107.89939936
## 1009 1010 1011 1012 1013
## 12.34636880 -7.43732173 -18.05376435 2.37547272 -43.07494915
## 1015 1016 1017 1018 1020
## 44.03596570 13.14448132 -69.23641812 112.66876954 -66.99120291
## 1022 1025 1026 1028 1029
## -113.38540647 13.85314897 -111.66741178 54.14181957 53.00357098
## 1030 1032 1033 1034 1035
## 1.83842022 49.90235576 21.17762582 98.14697910 60.29191386
## 1036 1037 1038 1040 1041
## 124.06950686 20.68224801 -4.12308307 42.29431765 36.72707468
## 1042 1043 1045 1046 1047
## -73.86911671 146.06745390 114.48883468 82.39698178 51.16416291
## 1051 1052 1054 1056 1058
## -34.94586008 -65.00529779 13.01777662 -181.21402520 -135.78656192
## 1059 1063 1064 1065 1066
## 41.53815204 -55.25798261 34.22000277 151.97155827 31.09952735
## 1067 1068 1069 1070 1071
## -77.80692982 25.19685555 139.88584345 -164.08147615 50.29214675
## 1073 1075 1077 1081 1084
## 65.39495879 -11.16405154 -172.67640592 -126.26264926 92.61406285
## 1086 1087 1088 1089 1090
## 55.97055173 -148.35586401 43.94394924 86.11150823 40.45461308
## 1091 1092 1093 1094 1095
## 51.97079305 -105.85689806 -16.58320887 -79.90203005 62.12093493
## 1096 1097 1098 1099 1100
## 90.15492733 46.60350711 -141.49223485 -128.48348907 -63.23922835
## 1102 1104 1105 1106 1107
## -76.42065472 -47.25340676 32.04514593 -32.31463938 52.97203964
## 1108 1109 1110 1112 1114
## 7.16569118 44.58458017 -128.17470018 -40.51733243 93.34700549
## 1117 1119 1120 1122 1123
## -128.69049102 9.99330647 120.62138005 -69.49795415 47.92494386
## 1124 1125 1126 1127 1128
## -107.39631501 -143.26177936 -44.39915824 56.37808044 41.12164046
## 1132 1133 1135 1136 1138
## -54.36000789 -4.73615844 58.21803552 -5.66130729 83.68540944
## 1142 1143 1144 1145 1146
## -10.36948970 68.94772348 -48.72358427 -22.08495235 -13.76224241
## 1147 1150 1151 1153 1154
## -126.55026113 44.45887345 40.98172507 39.74395062 -67.15116278
## 1157 1158 1162 1163 1164
## -51.88745122 -4.65104672 8.81117096 52.75668495 -34.94982069
## 1165 1166 1168 1170 1172
## 72.04597233 2.60677026 14.52420282 51.37233372 138.48800659
## 1173 1174 1176 1177 1178
## -70.48614131 -116.09631493 -28.23841859 105.97254175 18.68647404
## 1179 1181 1182 1186 1187
## -0.80585825 81.62357828 -26.41234832 -96.92442573 -38.46006329
## 1189 1190 1191 1192 1193
## -10.24779125 51.02963163 51.46062527 -35.69583068 -66.53789983
## 1195 1198 1200 1201 1202
## -0.43721727 6.95919741 -38.52130629 -9.67630406 -67.93344213
## 1204 1205 1210 1211 1213
## 108.99194947 70.59565735 15.99038938 -28.34137109 -32.98035093
## 1215 1216 1217 1218 1219
## 100.58449112 2.16841425 -7.41280064 -103.61114084 7.55855639
## 1221 1223 1225 1226 1228
## -18.07819374 64.63505842 -15.24468966 27.66589412 95.22793294
## 1230 1231 1232 1233 1235
## -53.76798001 74.37561087 31.59032639 11.69625728 -0.55834700
## 1236 1237 1238 1239 1240
## 166.94097815 72.85148586 98.99809728 -2.61680323 -93.83001135
## 1243 1245 1246 1247 1248
## 30.27913876 82.52404484 50.74161880 7.56386084 21.13523232
## 1249 1251 1253 1254 1255
## 22.88563879 28.47765198 -45.91924374 -7.21653157 -29.60743997
## 1258 1259 1260 1261 1262
## 59.74421926 101.39802352 58.45216708 -143.19086149 -63.12731017
## 1263 1265 1267 1268 1269
## 71.90708644 -65.17521511 18.85758805 -11.49922934 -77.68214530
## 1270 1271 1272 1273 1274
## -106.20357374 -76.78856046 -30.00021982 34.74916032 -52.72723518
## 1277 1278 1279 1280 1282
## 101.72830681 119.44100230 -60.50979104 -88.89537162 75.76409589
## 1284 1285 1288 1289 1290
## 54.94605832 -0.12549430 -70.37639019 14.26688378 -54.57050478
## 1291 1292 1293 1294 1295
## 205.06691370 14.87357097 -45.38064118 -57.16468158 -53.95310037
## 1296 1297 1298 1299 1300
## -18.79110076 8.20945067 56.92727021 -52.84757136 35.59369503
## 1301 1303 1304 1305 1307
## -24.37164384 -17.54386281 -58.87299875 146.79753834 65.15084715
## 1308 1311 1312 1314 1315
## -87.63645820 12.44083276 88.67826530 64.83738291 3.16020303
## 1316 1317 1318 1320 1322
## -92.88458719 38.39071164 41.79682795 87.35230846 -66.67920369
## 1324 1325 1326 1327 1328
## -119.72419083 97.88463395 59.89983442 -44.81228804 82.28378507
## 1329 1330 1331 1334 1335
## 36.62046252 1.56450505 42.49340366 117.86957135 -46.95058946
## 1336 1337 1338 1339 1340
## 115.74484162 142.34259310 49.58040452 -119.76940803 -38.14369414
## 1342 1343 1344 1345 1347
## 119.30993192 -11.31823522 -41.18398075 -11.22749718 138.95756217
## 1349 1350 1353 1354 1355
## -58.54791189 13.36432676 -54.82758400 -0.74027036 -29.65411553
## 1356 1357 1358 1359 1361
## -10.11199071 -17.97273507 13.19397722 -63.34591130 -73.26128881
## 1362 1363 1364 1367 1368
## -20.25024108 10.08123558 -7.09205242 -86.04523259 95.62911926
## 1369 1371 1372 1373 1375
## 50.54737572 217.18038996 9.15634388 -3.62995744 132.92962739
## 1376 1378 1380 1381 1382
## -173.60542406 23.04929777 26.27917586 111.87573251 -13.80826466
## 1383 1385 1386 1388 1391
## 37.73661607 111.30293910 -24.43373624 110.82121131 -87.14595592
## 1392 1394 1395 1398 1399
## 26.80556405 3.16079438 -29.92574651 29.21203365 -47.52961329
## 1401 1402 1403 1404 1405
## -13.95430104 93.82849502 33.11848317 -129.99309933 114.52623586
## 1406 1408 1409 1411 1415
## 64.57785919 83.14952991 94.54651481 121.48797672 41.68624340
## 1416 1417 1419 1423 1424
## 85.41535485 103.71476283 1.90518148 -76.65263358 -20.17449623
## 1425 1426 1427 1428 1429
## -131.08378533 -24.89462548 10.24237265 -28.81146923 114.33638404
## 1430 1432 1434 1435 1438
## -24.11927605 24.68010030 -128.92024250 146.72090434 111.50362812
## 1439 1442 1443 1444 1445
## 41.59208684 -44.53906479 86.13938174 74.00323079 13.07834093
## 1446 1447 1450 1451 1452
## 68.97110563 81.61049019 13.75374079 125.56973895 -32.00456117
## 1453 1454 1455 1456 1457
## -59.53309933 -15.37374467 -49.85099817 -6.02880537 -24.10006916
## 1459 1460 1463 1464 1465
## -12.66408608 -79.05568465 18.41987438 -31.91297202 -50.53605773
## 1466 1468 1469 1470 1471
## 71.72461576 -25.08692507 31.17332570 25.60123808 -7.32420904
## 1473 1474 1476 1477 1480
## 64.91500964 -115.65161079 41.43343324 -6.81758213 -64.69969167
## 1481 1483 1485 1488 1491
## 107.17586613 109.34161088 -58.08059870 -88.76875076 34.89324042
## 1492 1493 1494 1495 1497
## 20.95094673 -63.38081393 -16.94874906 70.48403090 57.27411724
## 1501 1502 1503 1504 1505
## -109.70069771 8.89165276 -54.81237606 -123.69819213 36.08082141
## 1506 1507 1509 1510 1511
## -67.35253913 -79.14632421 -159.98711612 -62.59341621 -66.93405137
## 1512 1513 1515 1519 1521
## -114.84121953 73.05051164 -15.44498865 -8.00986570 -87.43748859
## 1522 1524 1525 1526 1527
## 43.72946324 -33.29804827 40.86055288 -174.29622691 -72.18473081
## 1529 1530 1531 1532 1533
## -39.59325455 6.38658636 85.32251917 76.74701983 121.81942183
## 1538 1539 1540 1541 1542
## -74.56121822 -73.89009112 -58.43307382 -69.38454509 -8.15751966
## 1544 1545 1546 1548 1549
## -48.07005197 41.61906012 34.08290161 98.78079339 118.61874990
## 1551 1552 1553 1557 1559
## 75.46783663 52.75899249 -125.06675580 60.83540944 -59.80542657
## 1561 1563 1564 1566 1567
## -9.94720566 132.68851025 56.45077198 60.51350387 106.75210880
## 1568 1570 1573 1574 1575
## 99.34095807 -113.13520270 -14.72417754 -2.89465630 -33.53604576
## 1577 1578 1579 1580 1581
## -45.09963188 16.28021461 -93.63454026 -48.82505431 -16.17129325
## 1583 1584 1585 1587 1594
## -0.19187065 62.85682494 78.75508549 35.17703403 -136.42815156
## 1596 1598 1599 1600 1601
## 39.48320966 24.13110944 -76.34963210 -51.30559209 80.83238470
## 1602 1604 1605 1608 1609
## 32.39885511 -110.45375821 131.18962578 36.67876991 53.44932693
## 1610 1611 1612 1616 1618
## 5.72094192 77.39520984 -19.50544551 -150.32710163 -156.59110451
## 1619 1620 1621 1623 1625
## -22.30990278 14.32881748 105.52129573 -44.29635401 -36.54111030
## 1627 1628 1629 1630 1631
## -131.52724707 -14.57228342 30.27010337 50.18153882 -28.55998393
## 1632 1634 1636 1638 1639
## 28.85907129 -27.80762137 -112.30629111 80.20196910 108.05209430
## 1640 1641 1642 1643 1644
## -48.29490944 -63.83027409 -98.21496499 -25.14082216 -33.70091238
## 1645 1649 1650 1651 1652
## -49.90781326 24.32805846 54.80651696 -78.30417434 45.44740168
## 1656 1657 1659 1660 1662
## -6.26310357 42.75196163 -131.29076261 30.92693464 -84.20431855
## 1663 1664 1665 1666 1667
## -30.24103195 -68.70444141 33.84307619 50.25205574 66.53317361
## 1668 1669 1670 1672 1673
## -10.32885777 16.83573308 -11.38678063 42.50965937 1.88239196
## 1674 1676 1677 1678 1682
## 50.72879236 11.05238947 -71.80247901 -78.56075583 71.28114022
## 1684 1685 1686 1687 1689
## -45.30160767 6.99122356 -116.12789057 -62.05891985 61.04286329
## 1691 1692 1695 1696 1698
## -104.56653727 34.39875830 37.59128763 -139.41635528 51.65086065
## 1699 1701 1702 1707 1708
## 5.63783777 90.56157990 83.64536273 -43.22256056 -89.89296224
## 1714 1715 1716 1717 1718
## 0.01101409 -42.46741975 -56.56279830 25.48497394 -53.94864772
## 1719 1720 1721 1722 1723
## 72.44150458 -33.21189496 31.24364444 31.68737355 16.29690758
## 1724 1725 1727 1728 1729
## 66.34511289 -13.50911751 -137.97589309 -54.52063972 49.96488293
## 1730 1731 1732 1733 1734
## -41.70353674 -110.98919447 18.26500384 -152.14098542 101.81799496
## 1736 1738 1740 1742 1743
## 124.10968228 8.53714536 -2.58564921 -51.47279840 -135.86490792
## 1744 1747 1748 1749 1750
## 9.31963767 56.48006818 -93.48559161 -73.71781557 83.18451175
## 1751 1752 1753 1754 1755
## -19.43722991 0.76702842 40.81708059 -2.95744997 -52.18243023
## 1756 1757 1759 1762 1763
## 44.14615013 -63.24952141 -247.43523893 -18.03552306 73.33407235
## 1764 1766 1767 1768 1769
## 88.64803869 -8.36904399 -72.22132957 -17.64484470 67.59513211
## 1770 1771 1772 1776 1777
## -49.61333757 1.78523026 -15.93887471 -113.97142849 -103.96894412
## 1778 1779 1780 1782 1783
## 36.50438743 -76.88161329 56.68759866 -7.89678134 68.59988190
## 1792 1793 1794 1795 1797
## -66.37280682 101.21751480 56.97424958 71.64124322 93.23454094
## 1799 1801 1802 1803 1804
## -50.73755801 101.01469597 20.74623648 24.54623688 39.80140982
## 1807 1809 1810 1811 1812
## 10.64522569 45.37840659 12.67052841 -28.81863802 -138.24625705
## 1814 1816 1817 1818 1823
## -79.79064251 123.09680393 -20.68575940 22.02130390 22.88957294
## 1824 1825 1826 1827 1828
## -91.32391939 -67.87948011 -26.95373537 -52.95038797 1.20496543
## 1830 1831 1833 1834 1835
## 97.13234652 -9.73134125 74.61780253 -20.32176848 31.65873014
## 1836 1837 1838 1839 1841
## 45.83643235 -11.47853641 -102.51732663 118.61724136 -47.78962950
## 1842 1844 1845 1846 1847
## -10.07866353 -103.57398032 29.09066867 -4.18115290 43.15044083
## 1848 1849 1850 1851 1852
## -36.55009552 20.15772326 -110.77480977 -0.22921545 -68.86496581
## 1854 1855 1856 1857 1858
## -91.35362869 -35.99724606 -155.38904343 32.28121992 -79.06783609
## 1859 1860 1862 1863 1864
## 13.85663968 -32.47884377 -3.03222270 -31.98227477 -51.98700338
## 1867 1868 1869 1871 1872
## 80.54235732 117.08769085 172.72301355 -61.32245243 152.72372799
## 1875 1876 1877 1879 1880
## 30.92129805 26.97879654 52.25932447 19.10290807 38.23304190
## 1881 1882 1883 1884 1885
## 89.59286869 -81.32719536 125.23953079 -82.40121943 -12.85382635
## 1886 1887 1889 1892 1893
## 75.28715598 -46.35377698 -162.06407322 68.09942412 78.38323712
## 1894 1895 1896 1897 1898
## 16.29197620 38.07052132 -11.89427747 7.66069858 10.10895566
## 1899 1900 1901 1904 1905
## 114.70573416 -34.47127263 -96.95116782 6.58925285 -123.49720921
## 1906 1908 1911 1916 1917
## 3.89509147 -32.03157386 23.45833852 135.53613202 -39.30271273
## 1920 1921 1922 1923 1924
## -45.24461256 108.71075931 2.74090720 35.45093889 -31.66934715
## 1926 1927 1930 1932 1934
## 22.90339355 20.22331144 -11.63826989 -48.62665003 -146.99543480
## 1935 1939 1940 1942 1944
## -27.10921412 -1.08136872 -16.07462997 -28.35040593 -140.75719713
## 1946 1948 1949 1950 1951
## 36.79085953 -65.51588086 -46.14480894 -3.93079549 -1.10422877
## 1952 1953 1954 1955 1956
## 31.93288894 28.86327966 -58.50599835 -11.18192498 53.82563636
## 1957 1959 1960 1961 1962
## 37.28327539 73.74490243 63.95092243 24.56253281 -59.80330404
## 1963 1964 1965 1966 1968
## -74.68125639 0.14724106 31.49556516 86.27235753 -34.07620723
## 1969 1973 1974 1975 1976
## -103.25911289 127.77510937 144.90952466 75.88455761 -77.48364431
## 1978 1979 1980 1981 1982
## 23.62183156 -53.45924076 -2.29548368 87.15949954 140.21529955
## 1986 1988 1989 1990 1993
## 21.53191833 -76.37611092 2.62549251 -41.94962580 10.62282736
## 1997 1998 2001 2002 2004
## 3.20234983 -55.12934608 39.53409039 43.14594742 -33.81985608
## 2009 2011 2012 2014 2015
## 40.75649643 -81.28368581 -64.47132124 0.95285042 121.59480712
## 2017 2019 2020 2021 2023
## 15.10293961 3.89550863 11.02832172 -12.15883955 -48.82914516
## 2024 2026 2028 2030 2031
## 17.80296232 -105.78715410 -37.12611782 -67.75791297 -4.81823999
## 2032 2033 2034 2035 2036
## 2.63402802 -21.68348822 -152.44785238 10.77431259 -10.80492742
## 2037 2038 2039 2040 2041
## -24.85414763 -48.32840010 19.37226975 23.31906863 -6.35281314
## 2042 2043 2045 2048 2050
## -147.04380134 0.79664929 -87.84082049 -34.00200727 -51.82671995
## 2051 2053 2054 2056 2057
## -103.93418528 -113.68172009 -52.70784024 46.23181514 59.03482172
## 2058 2059 2061 2062 2063
## 83.51376723 82.09349696 -17.19687867 66.39163327 -36.17882698
## 2066 2067 2068 2070 2071
## -33.86048149 -14.79869890 -195.11933862 12.77211671 -53.59347440
## 2072 2074 2078 2081 2082
## -116.43922907 -20.00043170 22.40766127 -59.43369364 11.17029274
## 2083 2084 2085 2086 2087
## 85.62598836 -55.18622232 -179.43927439 -109.44506569 116.13882978
## 2088 2090 2092 2093 2094
## 91.99851479 17.50799278 15.80766557 -72.98689743 42.70613848
## 2095 2096 2097 2099 2100
## 52.07635641 4.91677395 -62.26701123 -85.18438374 67.45994015
## 2101 2102 2103 2104 2105
## 75.73700925 33.88523285 48.61704938 7.35294747 2.59525413
## 2106 2107 2108 2110 2111
## 1.35484387 56.28003188 -17.58539787 39.41577049 36.16730955
## 2114 2115 2116 2118 2121
## 55.76294176 202.00162053 32.83455621 27.83247416 124.46712672
## 2122 2124 2125 2126 2127
## 4.46163087 32.69053150 27.49658724 -27.00405909 -14.24191843
## 2129 2130 2131 2132 2133
## 24.28097797 -37.88588466 97.22496257 -1.06109473 146.30360739
## 2134 2137 2138 2139 2140
## -75.62576708 73.58291737 13.79119955 -96.17869841 -117.39998282
## 2141 2145 2146 2147 2148
## -80.78059967 -0.57859660 101.14460690 40.21253202 148.04479821
## 2149 2150 2151 2152 2153
## 22.49801743 50.45308046 -106.39801720 70.39374056 30.04186666
## 2154 2156 2157 2158 2159
## 129.86994984 85.21536446 131.20743925 136.62014762 31.44642890
## 2162 2163 2165 2167 2172
## 23.90102119 -6.97847790 51.17806456 -17.41593953 -77.13726642
## 2175 2177 2178 2179 2181
## 8.48763679 -110.55459878 28.51322075 76.53279689 -78.45372612
## 2184 2185 2187 2188 2189
## -36.16116317 -38.28861906 -22.27690672 -56.84348090 35.89025155
## 2191 2193 2194 2195 2196
## -122.73052852 -14.50380120 -88.96226845 -81.45545657 16.69323757
## 2201 2205 2207 2208 2210
## 118.74500430 47.74660120 18.58864953 -17.78632560 -26.38000049
## 2211 2212 2215 2216 2218
## 28.70411221 -10.50812734 -60.63433134 22.63481282 0.83671868
## 2220 2224 2225 2227 2228
## -5.18816701 -128.84793175 43.19608355 -85.88825616 123.96457169
## 2233 2236 2237 2238 2241
## -12.49567665 113.70278980 81.93967758 12.31227642 71.27061932
## 2242 2244 2245 2246 2247
## -20.56086549 71.72576598 90.91647865 2.74750164 7.34515109
## 2248 2250 2251 2252 2253
## 40.66273589 45.58009453 15.69008168 -41.36693273 -77.83129849
## 2254 2255 2256 2258 2260
## -104.41593992 -12.41394447 9.86793103 -101.34135383 -130.79799055
## 2263 2264 2265 2266 2267
## 18.49343426 144.87545745 -132.05203732 84.17109818 -46.23005396
## 2268 2270 2272 2273 2274
## 13.02743325 -161.63534858 92.42858485 -23.61790418 -1.28351779
## 2275 2276 2278 2279 2280
## 99.79512072 25.58426079 -62.85243639 58.11023246 62.85480506
## 2281 2282 2284 2285 2287
## -77.09302011 -101.70858471 46.32426828 106.53447311 -9.52089132
## 2289 2290 2292 2294 2295
## 106.18270688 -108.13889063 128.75643441 13.13630883 55.64635259
## 2296 2300 2301 2302 2303
## 10.79332996 -65.56578966 35.17451182 -37.23912397 71.90387548
## 2304 2305 2306 2309 2310
## -12.37770993 82.36707200 -108.05408798 17.14000162 10.29764885
## 2311 2312 2313 2316 2318
## 4.43402863 -64.74084203 -97.86722395 19.55822620 124.94465534
## 2319 2321 2324 2326 2328
## -126.65478506 18.32582106 -10.16695311 -11.83610626 -145.22802427
## 2330 2331 2332 2333 2334
## -63.35997882 -100.40567855 -138.98485638 -0.92886322 -151.93644957
## 2335 2337 2339 2340 2341
## -25.61285986 -67.39170445 43.07963233 -191.28113523 -27.04539363
## 2342 2343 2344 2346 2348
## -83.36096166 133.46808968 1.25083008 -54.17513316 76.39085837
## 2349 2350 2351 2353 2355
## -217.51820373 -86.60088724 11.88973713 73.97094899 -112.14334369
## 2356 2359 2360 2361 2365
## -42.94358500 -8.93174839 -36.96534172 139.99312864 -8.54792150
## 2366 2367 2369 2371 2372
## 14.49564429 53.89329699 1.35699835 52.08667206 -121.45015874
## 2374 2375 2376 2377 2379
## -40.02869937 -22.31838952 54.46422182 -30.91039531 44.71964382
## 2381 2382 2386 2387 2392
## 6.15437761 1.57382633 -28.06517653 -61.21606473 -20.54049607
## 2394 2398 2399 2400 2401
## 50.59521255 113.81272838 74.31419504 16.68256654 -118.59569069
## 2402 2403 2405 2408 2410
## -69.08293361 125.58362755 -173.88107026 -43.60591447 -18.96336636
## 2411 2413 2414 2415 2416
## 18.07392923 24.65843503 -103.27693971 -1.61910668 -58.06123943
## 2417 2419 2420 2421 2422
## -46.07620595 23.87384476 -50.31699755 55.24505641 -157.42333879
## 2423 2424 2426 2427 2428
## 32.37315803 -53.80976741 -43.46447564 -120.38566969 84.86120458
## 2429 2430 2431 2432 2433
## -9.08774819 102.97407772 76.52428815 -36.52111081 -11.38978994
## 2434 2437 2438 2439 2440
## 83.81734575 36.22665121 -63.56405945 25.43901519 83.08565081
## 2441 2442 2443 2444 2447
## -38.78709652 -14.08212355 -147.83169815 -51.30745465 -40.43931320
## 2448 2449 2451 2453 2454
## 57.53316605 81.06114394 -2.64758213 -12.72839413 -16.20561542
## 2455 2460 2463 2464 2466
## 3.11489753 -218.10861412 -5.22694673 43.99674884 31.76921647
## 2467 2469 2471 2472 2473
## 117.73479821 -81.29154875 52.02642736 102.71739342 -49.01888819
## 2474 2476 2477 2478 2480
## -72.69952908 -1.69228114 84.89814471 24.20451060 37.92383323
## 2483 2484 2485 2486 2488
## -11.51558160 -170.35822553 -7.73696432 75.84676033 -79.35075749
## 2490 2491 2492 2494 2495
## -89.98140773 56.10002502 67.28613031 -1.65097336 31.58811772
## 2496 2497 2499 2500 2501
## -26.45908144 -19.03505694 -52.81484755 161.12396787 38.81628512
## 2502 2504 2505 2506 2510
## -100.89152466 25.65799278 77.55031184 -37.08682767 65.03742606
## 2512 2513 2514 2515 2516
## -32.62047501 -67.77961068 38.57374256 -51.23576337 -18.17603922
## 2517 2521 2522 2524 2527
## 31.51562065 -26.57119635 -12.03839118 18.22568705 72.67882822
## 2528 2530 2531 2532 2533
## -64.04427560 -13.05821651 30.51374734 48.74582620 41.53111096
## 2534 2535 2536 2538 2539
## 64.25655513 41.93942245 24.73359765 183.42345398 -66.40674129
## 2540 2541 2542 2543 2544
## -39.09780098 -76.74460232 -4.93575690 -98.56955926 103.19044850
## 2545 2546 2547 2548 2549
## 24.53222654 64.31947330 68.51457644 -14.21331836 -72.50105066
## 2551 2552 2554 2557 2558
## -48.30867091 64.40888511 35.97646646 -95.17873414 16.27054297
## 2560 2561 2563 2564 2565
## 116.32027086 -106.30395856 9.68817186 -36.22283677 -57.19807776
## 2566 2567 2568 2570 2571
## -31.55036028 35.75290266 -40.40340699 -15.03410222 -134.25029158
## 2572 2574 2575 2576 2579
## 14.46188660 20.55672741 5.73427794 48.29199671 -85.29716642
## 2580 2581 2582 2583 2584
## -25.49573109 -33.12258042 -71.09347218 40.88868172 -56.48736616
## 2585 2586 2587 2588 2590
## 102.44799802 18.14197061 63.19779658 -129.89771054 -56.42243486
## 2593 2594 2595 2596 2597
## -26.57029242 -38.75514826 -21.78665673 62.48720527 104.80437667
## 2598 2600 2602 2605 2606
## -1.15681087 -158.53150754 -15.96188792 58.44757828 -41.41069332
## 2607 2609 2611 2613 2615
## 109.20685079 47.77883228 70.07790958 127.09069784 -37.47222831
## 2617 2618 2619 2620 2622
## -17.81559827 -42.33282916 -35.08066977 -67.72807230 6.50178371
## 2623 2624 2625 2626 2627
## -145.68668721 86.12474093 -24.11672802 -153.85527292 -104.33693373
## 2628 2629 2632 2633 2639
## -98.05461061 -76.58660358 -60.03248050 -32.07456136 81.77109920
## 2640 2642 2643 2644 2646
## -114.44782923 -187.10805233 39.75938816 21.52984826 -108.45838878
## 2647 2648 2649 2650 2653
## 116.14818888 40.28397486 -70.96316480 5.16838294 -28.63620617
## 2654 2655 2656 2657 2660
## -119.81103119 62.71236074 3.93623817 123.75349953 8.19072495
## 2661 2663 2664 2667 2668
## -46.03207523 -75.36633025 -47.69172774 33.82753900 -0.66222800
## 2669 2672 2674 2675 2676
## -52.03873267 -127.39231415 -63.04404295 85.64820725 -34.57114177
## 2677 2681 2682 2683 2684
## 110.10719208 8.98068865 18.06196940 -71.03945070 -39.31740071
## 2688 2689 2690 2691 2693
## -19.37246989 71.74355710 -28.22747064 129.14094965 -48.81603751
## 2694 2696 2699 2701 2702
## 70.75362778 92.11423602 -11.32139536 -160.92156287 110.95009466
## 2704 2705 2708 2712 2716
## 26.41317969 -101.42253034 37.13592208 -201.11210744 -38.89815191
## 2717 2718 2719 2720 2722
## -85.65286151 -84.68192610 57.14428203 31.69321943 -89.66838764
## 2725 2727 2728 2729 2732
## 22.11410225 45.20909532 22.01626331 97.29451338 82.15040144
## 2733 2735 2737 2738 2739
## 110.77718929 64.32553297 1.93890897 30.18073521 166.12881114
## 2742 2743 2745 2746 2748
## -116.30702409 52.92649725 28.59366452 8.38036010 15.92520645
## 2750 2751 2752 2754 2755
## -0.12561928 76.91997193 -7.34560385 -32.43100570 139.90778719
## 2756 2757 2758 2760 2763
## -16.26765856 122.77577965 -5.00147734 -86.22944559 135.37676241
## 2765 2768 2769 2770 2771
## -64.69577194 29.86374819 -40.04468311 90.99922237 88.17535111
## 2772 2773 2774 2775 2776
## -65.53392523 66.35306757 -46.83091720 24.75913988 88.47608806
## 2777 2778 2780 2781 2782
## 97.51765809 48.76625810 -2.88016616 210.75774707 -64.45594383
## 2785 2786 2787 2790 2791
## 42.37419563 -1.08602988 -98.17688704 -31.28135490 97.94043247
## 2792 2794 2796 2797 2803
## 9.96021899 38.42463463 -32.05985144 145.95418049 -69.78515398
## 2804 2805 2807 2809 2810
## -105.97094501 33.67402034 12.33358299 -47.60188221 -80.46945334
## 2811 2812 2813 2814 2815
## 7.39940569 -6.48201060 56.85483412 19.81192128 -65.05620062
## 2816 2818 2819 2820 2821
## -5.18737349 24.16774798 3.67113475 72.43560213 -19.55131020
## 2824 2825 2826 2827 2830
## 106.32918001 39.60159495 -89.95580344 -3.09240735 16.89111795
## 2831 2832 2833 2834 2837
## -0.75919887 81.84951510 126.39629987 -48.96589663 -107.79879336
## 2838 2839 2840 2841 2843
## -23.68056438 -50.72277882 77.22033143 1.30549470 141.79492610
## 2844 2845 2846 2848 2849
## -26.08615698 53.42462605 21.17982513 -78.39361044 117.58951594
## 2850 2851 2852 2854 2855
## -28.63924679 -45.91234136 -83.15574656 26.11304830 7.14164108
## 2856 2857 2859 2860 2862
## -72.49978752 -54.26423402 15.02601495 -98.03737015 11.02408054
## 2863 2864 2865 2869 2871
## -83.47617098 -87.50983603 7.22838649 68.04703921 141.11797276
## 2872 2874 2877 2878 2879
## -51.72013737 -56.04559588 135.73026534 -110.52602935 -12.24079970
## 2881 2882 2884 2885 2887
## 74.10982779 -51.32540170 -72.09284323 -11.45326885 59.76678667
## 2888 2889 2890 2891 2892
## -83.08797582 3.33608982 -16.46833214 153.71176369 -44.48275992
## 2893 2895 2896 2899 2905
## -94.02717501 29.61582192 70.70872684 74.15869595 -95.51699022
## 2906 2907 2908 2909 2910
## 14.15177486 22.47124574 -106.92871042 -140.49187140 -70.39203019
## 2912 2913 2914 2915 2916
## 14.00801465 22.29333198 21.14187932 101.90174740 -16.94265150
## 2917 2918 2919 2920 2921
## 37.58242919 -41.32472691 3.09868213 0.07276469 61.34614682
## 2922 2925 2926 2927 2929
## 53.19840167 38.24224772 -80.50612093 76.56909323 -3.27240116
## 2930 2931 2933 2934 2935
## 3.54847220 82.98114055 27.97164445 25.88227585 44.60120733
## 2936 2937 2938 2941 2942
## 4.12783878 104.44455079 -23.98506983 8.23818888 -151.21986255
## 2944 2948 2949 2950 2952
## 17.29752511 52.91391045 -0.46369824 49.10687556 -145.18041633
## 2953 2954 2955 2956 2957
## -29.94636029 74.21516126 106.38679437 65.50177477 -13.23375612
## 2958 2959 2961 2962 2963
## 70.50063452 -131.30518703 43.90221078 77.39393005 -79.33137797
## 2964 2965 2967 2968 2970
## 77.48058462 25.42716832 91.17615782 35.27276057 -35.94971234
## 2972 2976 2977 2978 2980
## -61.83384781 -1.12911900 62.26012112 39.68146582 84.16584537
## 2981 2982 2985 2986 2988
## -172.85974294 -50.36400114 -100.39109312 66.86510701 19.42111949
## 2989 2990 2993 2994 2996
## -31.21567694 -57.90877381 -49.45139847 36.66893400 129.42082245
## 2998 2999 3001 3003 3004
## 192.45905636 -44.19050624 -31.16328549 39.59421055 97.16973416
## 3005 3006 3010 3012 3013
## -70.96444385 -8.78560407 136.41783375 117.92252852 18.47893809
## 3014 3017 3018 3019 3022
## -55.47103017 -83.01860310 31.61988796 -34.04627554 -48.66290416
## 3023 3024 3025 3026 3027
## 3.23290353 38.68175156 118.04380922 56.66995544 -57.17765098
## 3028 3030 3032 3034 3036
## -17.50620537 -45.85565145 55.69026261 96.76011856 48.98079020
## 3037 3039 3040 3043 3044
## -36.06826966 -8.88729635 -38.27882966 73.53483379 2.28117175
## 3045 3046 3047 3048 3049
## 9.15939412 -11.04385862 109.32412329 13.12148450 43.92272425
## 3053 3054 3055 3056 3057
## -27.73456521 38.46774198 -89.49974813 -43.36277083 68.67327224
## 3058 3059 3060 3061 3062
## -21.78994563 -169.57077623 31.32202113 -28.48798906 157.24952799
## 3064 3065 3066 3067 3069
## -58.67889746 -3.29194161 44.15314609 51.03292243 80.88739006
## 3070 3072 3074 3075 3076
## 7.30971247 49.86111541 38.82714087 60.09761164 -65.42786737
## 3077 3078 3079 3080 3084
## -103.28040973 -99.43801695 18.59560461 92.43558997 -153.59202683
## 3086 3088 3091 3092 3093
## 161.88829672 -118.22400800 80.89403447 -7.34244274 135.82047524
## 3095 3097 3099 3101 3102
## -31.46529259 193.60582701 -231.57897253 -8.77482642 -53.68689616
## 3103 3107 3109 3110 3111
## -61.56422025 -34.00302211 58.63749278 -63.47156160 13.12023017
## 3112 3113 3114 3115 3116
## 74.89616666 -35.15265309 4.27505636 35.27834853 -35.79735104
## 3120 3121 3123 3124 3125
## 14.13333562 -58.41135685 -27.80520452 -1.55719680 17.68948488
## 3127 3128 3130 3132 3133
## -34.03050874 -104.48699109 -76.67921747 32.36119636 15.68393728
## 3134 3135 3136 3137 3138
## -133.49118065 67.30730890 37.38983014 79.54351353 -47.09781050
## 3139 3140 3143 3144 3146
## 43.21654332 72.61743189 48.44352642 -5.55179046 142.34229612
## 3148 3149 3150 3152 3156
## -0.88337911 -108.55619230 66.25275888 -45.92961068 32.46449917
## 3157 3158 3160 3161 3162
## 84.72601329 -136.17343964 20.89486031 28.44232926 -22.74917889
## 3163 3164 3165 3166 3171
## 141.05557217 24.61193443 57.50818710 -10.53510436 56.16629181
## 3172 3173 3174 3175 3176
## -146.90349530 101.09044706 -21.82726403 -11.21160695 -15.75927719
## 3177 3178 3179 3180 3181
## 2.97977874 -28.07278390 -41.17646863 -126.12766576 -62.47283820
## 3182 3184 3186 3188 3191
## -75.10560005 86.96100747 36.74130095 0.50482107 -1.23330669
## 3192 3193 3194 3198 3199
## -12.52604566 13.88994899 88.56368236 67.90670511 8.57314471
## 3200 3202 3204 3207 3208
## -76.63699242 18.38578397 15.14929631 37.93429326 36.79895251
## 3210 3211 3212 3213 3216
## -49.10491536 66.21025933 72.77609570 -67.22025834 -64.77378841
## 3218 3221 3222 3226 3227
## -67.47354651 105.06865521 65.66870000 47.32293102 -155.31887016
## 3228 3229 3230 3232 3234
## -87.97506714 59.88601744 79.47598819 30.10907306 -15.59507077
## 3235 3236 3237 3238 3239
## 60.20334124 100.45629867 7.79127515 15.90126892 17.74691852
## 3240 3241 3242 3243 3245
## 32.96234980 -111.10016764 -36.77951845 -58.83588394 80.25509814
## 3246 3247 3248 3249 3250
## -45.82301023 -99.42909857 4.25805091 -12.39466050 45.45579935
## 3251 3252 3253 3255 3256
## -98.03921443 45.94713835 -6.02424675 81.21678718 -135.16470020
## 3257 3258 3259 3260 3261
## 17.82906964 11.45636872 68.43043598 100.87557341 31.49854417
## 3265 3266 3267 3269 3271
## 17.84206196 -11.84663659 40.63796051 89.99672281 -215.81717392
## 3272 3275 3276 3278 3280
## 40.14279028 66.92385797 72.93447457 -10.17277279 73.33089001
## 3281 3282 3283 3284 3285
## -169.98627602 -31.57488375 70.21548132 -94.85975885 2.52112718
## 3286 3287 3288 3290 3291
## -62.62577861 -70.62955033 -142.21838927 17.87876308 153.96602197
## 3292 3295 3297 3299 3300
## -132.07931255 -16.17541978 -30.90275882 -26.71589275 65.09276042
## 3301 3303 3304 3305 3306
## 56.36293632 31.20821716 -84.10458511 23.08422806 -46.40531370
## 3307 3308 3310 3312 3313
## 13.65225573 -28.57651373 -14.95281923 -10.19402496 42.43015577
## 3315 3317 3318 3319 3320
## 69.21619228 4.07216650 0.25751692 133.55543848 -60.86602798
## 3321 3323 3324 3325 3327
## 94.43225015 45.93147587 -12.84286495 3.59488451 -21.24199428
## 3331 3332 3333 3335 3337
## -116.29847041 -107.01266493 -62.98304665 -30.41496024 58.65430844
## 3338 3340 3342 3343 3344
## -30.59293553 34.72217256 -166.52131516 -21.22359154 -126.20277738
## 3347 3350 3351 3352 3353
## 120.89819200 -33.26813185 57.31767616 11.66863843 -168.33443950
## 3354 3356 3357 3359 3360
## -49.00193749 -127.25973125 71.03634492 -33.71783082 4.25168991
## 3365 3366 3367 3368 3370
## 34.63429248 -83.45855578 -26.92586643 30.84353091 -17.80461761
## 3371 3375 3376 3377 3380
## -37.85068683 11.52096286 52.14437107 49.44464199 78.95677892
## 3381 3382 3383 3385 3386
## 77.39335216 -63.05724561 -92.01292759 32.60475516 58.67016658
## 3387 3389 3390 3391 3393
## 42.28193203 -47.29933192 83.61310396 97.67839943 -206.90639519
## 3394 3395 3397 3398 3399
## 164.23506068 -57.48958822 -122.83667346 -88.95750924 129.92737691
## 3402 3403 3407 3408 3409
## -47.87743257 -81.18138197 -44.74784560 92.61345023 -22.84111946
## 3410 3411 3412 3414 3415
## 84.80248857 59.06691526 44.38343908 20.87291820 84.27423287
## 3417 3419 3420 3425 3426
## 51.89591797 153.67484435 148.31738115 50.40470162 75.36607909
## 3427 3428 3430 3432 3433
## -33.21673214 21.05135009 -25.22466375 20.56653026 50.22891007
## 3435 3436 3437 3439 3440
## -39.41465767 -231.21266826 45.09945999 -67.68847611 28.59038890
## 3441 3442 3445 3447 3449
## 44.43236043 -77.33296659 8.29457415 -56.96394364 -18.99141864
## 3450 3452 3454 3455 3456
## 167.18220585 -52.11848248 -79.46463393 66.44930735 24.75314409
## 3459 3460 3461 3462 3463
## 203.00350610 40.67209412 -104.84886263 22.43098615 115.96544373
## 3464 3465 3467 3468 3469
## 64.05528019 -67.80333508 19.93790403 158.82026089 -18.50900804
## 3470 3472 3473 3475 3479
## 56.13268933 75.35863902 25.13226382 45.90008992 -132.79441158
## 3482 3483 3485 3486 3488
## -23.40159749 13.22521838 -109.54723474 45.15945609 118.89758285
## 3490 3491 3493 3494 3495
## 9.16064016 -61.17900604 -114.38555407 -79.83886653 -124.63740262
## 3498 3502 3503 3504 3505
## -67.68539590 -202.52748468 33.32338110 -10.09688491 -73.21397899
## 3506 3508 3510 3511 3513
## 122.11188069 109.49097510 77.56993007 20.69876039 -130.84343532
## 3514 3517 3520 3521 3526
## 45.51527849 -68.32113877 101.90745678 12.01509346 93.90963000
## 3527 3528 3531 3532 3533
## 19.72059517 -58.07203761 79.76859341 28.62799924 -7.69132860
## 3536 3537 3538 3539 3540
## -99.90063296 -168.41577723 69.16818888 -88.38450003 72.37103453
## 3541 3542 3543 3544 3546
## -131.69474438 -112.58899443 90.19301757 48.76782212 -189.41261101
## 3548 3549 3550 3551 3553
## 49.55883043 -32.00655643 35.02084229 30.01341404 -81.98550662
## 3559 3560 3561 3562 3563
## -81.34755788 -51.42304119 66.19962942 10.44945109 -5.95144419
## 3565 3567 3568 3569 3571
## -184.37188995 -12.08513829 -45.10160497 -80.35313276 -15.13839988
## 3573 3576 3577 3578 3580
## -112.96489476 -109.07535587 -36.08806557 102.06141895 -8.96246806
## 3582 3583 3584 3585 3586
## -27.64277600 -87.41209917 59.24241445 -40.23420453 57.35833510
## 3588 3591 3593 3594 3596
## 15.02839138 -39.05423604 -24.03239490 -4.10719458 -14.65402609
## 3597 3598 3599 3600 3603
## 35.49430230 -9.07491924 -136.73885862 75.74881959 132.68561508
## 3604 3605 3606 3609 3612
## 61.80560461 80.61911789 -71.96844883 -11.35975159 -83.92607768
## 3613 3614 3615 3616 3618
## 77.74935795 -75.50896176 -82.63048526 -85.16222908 -24.90692088
## 3619 3620 3621 3622 3623
## 13.14622035 -90.27817317 77.41464562 -83.29736207 33.39253620
## 3624 3625 3626 3628 3629
## -55.57408547 -42.44761461 -92.33594669 52.46404267 -94.77250424
## 3631 3632 3633 3636 3638
## -104.80861360 105.30688034 68.34500681 -137.13614744 -42.77113340
## 3640 3641 3643 3645 3646
## 68.47115959 -127.78364818 -32.66147133 33.07570230 -73.40799729
## 3648 3650 3654 3655 3656
## 56.05384070 -26.96547365 -8.34856124 -24.38091974 85.68133728
## 3658 3660 3661 3663
## -2.69405972 -3.08990547 7.97710526 -5.51336867
# I would calculate RMSE by sqrt(sum(res$residuals^2) / res$df): https://stackoverflow.com/questions/43123462/how-to-obtain-rmse-out-of-lm-result
RMSE <- sqrt(sum(lmScore$residuals^2) / lmScore$df.residual)
RMSE
## [1] 73.81024
Using the “predict” function and supplying the “newdata” argument, use the lmScore model to predict the reading scores of students in pisaTest. Call this vector of predictions “predTest”. Do not change the variables in the model (for example, do not remove variables that we found were not significant in the previous part of this problem). Use the summary function to describe the test set predictions.
What is the range between the maximum and minimum predicted reading score on the test set?
library(caret)
## Loading required package: lattice
## Loading required package: ggplot2
predicTed <- predict(lmScore, newdata = pisaTest)
summary(predicTed)
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 353.2 482.0 524.0 516.7 555.7 637.7
max(predicTed) - min(predicTed)
## [1] 284.4683
# SSE on testing dataset
SSE_1 <- sum((predicTed - pisaTest$readingScore)^2)
SSE_1
## [1] 5762082
RMSE_1 <- sqrt(mean((predicTed - pisaTest$readingScore)^2))
RMSE_1
## [1] 76.29079
baseline <- mean(pisaTrain$readingScore)
baseline
## [1] 517.9629
SST <- sum((pisaTest$readingScore - mean(pisaTrain$readingScore))^2)
SST
## [1] 7802354
R_sq <- 1 - SSE_1/SST
R_sq
## [1] 0.2614944
Before applying analytics tools on the training set, we first need to understand the data at hand. Load “FluTrain.csv” into a data frame called FluTrain. Looking at the time period 2004-2011, which week corresponds to the highest percentage of ILI-related physician visits? Select the day of the month corresponding to the start of this week.
FluTrain <- read.csv("FluTrain.csv")
str(FluTrain)
## 'data.frame': 417 obs. of 3 variables:
## $ Week : chr "2004-01-04 - 2004-01-10" "2004-01-11 - 2004-01-17" "2004-01-18 - 2004-01-24" "2004-01-25 - 2004-01-31" ...
## $ ILI : num 2.42 1.81 1.71 1.54 1.44 ...
## $ Queries: num 0.238 0.22 0.226 0.238 0.224 ...
hi_per <- subset(FluTrain, ILI == max(ILI))
hi_per
## Week ILI Queries
## 303 2009-10-18 - 2009-10-24 7.618892 1
Let us now understand the data at an aggregate level. Plot the histogram of the dependent variable, ILI. What best describes the distribution of values of ILI?
hist(FluTrain$ILI)
# plot nat log of ILI vs queries
When handling a skewed dependent variable, it is often useful to predict the logarithm of the dependent variable instead of the dependent variable itself – this prevents the small number of unusually large or small observations from having an undue influence on the sum of squared errors of predictive models. In this problem, we will predict the natural log of the ILI variable, which can be computed in R using the log() function.
Plot the natural logarithm of ILI versus Queries. What does the plot suggest?.
plot(x = FluTrain$Queries, y = log(FluTrain$ILI))
Based on the plot we just made, it seems that a linear regression model could be a good modeling choice. Based on our understanding of the data from the previous subproblem, which model best describes our estimation problem?
FluTrend1 <- lm(log(ILI) ~ Queries, data = FluTrain)
summary(FluTrend1)
##
## Call:
## lm(formula = log(ILI) ~ Queries, data = FluTrain)
##
## Residuals:
## Min 1Q Median 3Q Max
## -0.76003 -0.19696 -0.01657 0.18685 1.06450
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) -0.49934 0.03041 -16.42 <2e-16 ***
## Queries 2.96129 0.09312 31.80 <2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.2995 on 415 degrees of freedom
## Multiple R-squared: 0.709, Adjusted R-squared: 0.7083
## F-statistic: 1011 on 1 and 415 DF, p-value: < 2.2e-16
cor(x = FluTrain$Queries, y = log(FluTrain$ILI))
## [1] 0.8420333
0.8420333^2
## [1] 0.7090201
The csv file FluTest.csv provides the 2012 weekly data of the ILI-related search queries and the observed weekly percentage of ILI-related physician visits. Load this data into a data frame called FluTest.
Normally, we would obtain test-set predictions from the model FluTrend1 using the code
PredTest1 = predict(FluTrend1, newdata=FluTest)
However, the dependent variable in our model is log(ILI), so PredTest1 would contain predictions of the log(ILI) value. We are instead interested in obtaining predictions of the ILI value. We can convert from predictions of log(ILI) to predictions of ILI via exponentiation, or the exp() function. The new code, which predicts the ILI value, is
PredTest1 = exp(predict(FluTrend1, newdata=FluTest))
What is our estimate for the percentage of ILI-related physician visits for the week of March 11, 2012? (HINT: You can either just output FluTest$Week to find which element corresponds to March 11, 2012, or you can use the “which” function in R. To learn more about the which function, type ?which in your R console.)
FluTest <- read.csv("FluTest.csv")
PredTest1 <- exp(predict(FluTrend1, newdata = FluTest))
FluTest$Week
## [1] "2012-01-01 - 2012-01-07" "2012-01-08 - 2012-01-14"
## [3] "2012-01-15 - 2012-01-21" "2012-01-22 - 2012-01-28"
## [5] "2012-01-29 - 2012-02-04" "2012-02-05 - 2012-02-11"
## [7] "2012-02-12 - 2012-02-18" "2012-02-19 - 2012-02-25"
## [9] "2012-02-26 - 2012-03-03" "2012-03-04 - 2012-03-10"
## [11] "2012-03-11 - 2012-03-17" "2012-03-18 - 2012-03-24"
## [13] "2012-03-25 - 2012-03-31" "2012-04-01 - 2012-04-07"
## [15] "2012-04-08 - 2012-04-14" "2012-04-15 - 2012-04-21"
## [17] "2012-04-22 - 2012-04-28" "2012-04-29 - 2012-05-05"
## [19] "2012-05-06 - 2012-05-12" "2012-05-13 - 2012-05-19"
## [21] "2012-05-20 - 2012-05-26" "2012-05-27 - 2012-06-02"
## [23] "2012-06-03 - 2012-06-09" "2012-06-10 - 2012-06-16"
## [25] "2012-06-17 - 2012-06-23" "2012-06-24 - 2012-06-30"
## [27] "2012-07-01 - 2012-07-07" "2012-07-08 - 2012-07-14"
## [29] "2012-07-15 - 2012-07-21" "2012-07-22 - 2012-07-28"
## [31] "2012-07-29 - 2012-08-04" "2012-08-05 - 2012-08-11"
## [33] "2012-08-12 - 2012-08-18" "2012-08-19 - 2012-08-25"
## [35] "2012-08-26 - 2012-09-01" "2012-09-02 - 2012-09-08"
## [37] "2012-09-09 - 2012-09-15" "2012-09-16 - 2012-09-22"
## [39] "2012-09-23 - 2012-09-29" "2012-09-30 - 2012-10-06"
## [41] "2012-10-07 - 2012-10-13" "2012-10-14 - 2012-10-20"
## [43] "2012-10-21 - 2012-10-27" "2012-10-28 - 2012-11-03"
## [45] "2012-11-04 - 2012-11-10" "2012-11-11 - 2012-11-17"
## [47] "2012-11-18 - 2012-11-24" "2012-11-25 - 2012-12-01"
## [49] "2012-12-02 - 2012-12-08" "2012-12-09 - 2012-12-15"
## [51] "2012-12-16 - 2012-12-22" "2012-12-23 - 2012-12-29"
which(FluTest$Week == "2012-03-11 - 2012-03-17")
## [1] 11
PredTest1[which(FluTest$Week == "2012-03-11 - 2012-03-17")]
## 11
## 2.187378
What is the relative error betweeen the estimate (our prediction) and the observed value for the week of March 11, 2012? Note that the relative error is calculated as
(Observed ILI - Estimated ILI)/Observed ILI
Obs_ILI <- FluTest$ILI[which(FluTest$Week == "2012-03-11 - 2012-03-17")]
Est_ILI <- PredTest1[which(FluTest$Week == "2012-03-11 - 2012-03-17")]
Rel_error <- (Obs_ILI - Est_ILI)/ Obs_ILI
Rel_error
## 11
## 0.04623827
What is the Root Mean Square Error (RMSE) between our estimates and the actual observations for the percentage of ILI-related physician visits, on the test set?
rmse_2 <- sqrt(mean((PredTest1 - FluTest$ILI)^2))
rmse_2
## [1] 0.7490645
To do so, we will use the “zoo” package, which provides a number of helpful methods for time series models. While many functions are built into R, you need to add new packages to use some functions. New packages can be installed and loaded easily in R, and we will do this many times in this class. Run the following two commands to install and load the zoo package. In the first command, you will be prompted to select a CRAN mirror to use for your download. Select a mirror near you geographically.
install.packages(“zoo”)
library(zoo)
After installing and loading the zoo package, run the following commands to create the ILILag2 variable in the training set:
ILILag2 = lag(zoo(FluTrain$ILI), -2, na.pad=TRUE)
FluTrain$ILILag2 = coredata(ILILag2)
In these commands, the value of -2 passed to lag means to return 2 observations before the current one; a positive value would have returned future observations. The parameter na.pad=TRUE means to add missing values for the first two weeks of our dataset, where we can’t compute the data from 2 weeks earlier.
How many values are missing in the new ILILag2 variable?
library(zoo)
##
## Attaching package: 'zoo'
## The following objects are masked from 'package:base':
##
## as.Date, as.Date.numeric
ILILag2 <- lag(zoo(FluTrain$ILI), -2, na.pad=TRUE)
FluTrain$ILILag2 <- coredata(ILILag2)
plot(x = log(FluTrain$ILI), y = log(FluTrain$ILILag2))
# Training a time series model
Train a linear regression model on the FluTrain dataset to predict the log of the ILI variable using the Queries variable as well as the log of the ILILag2 variable. Call this model FluTrend2.
Which coefficients are significant at the p=0.05 level in this regression model? (Select all that apply.)
FluTrend2 <- lm(log(ILI) ~ Queries + log(ILILag2), data = FluTrain)
summary(FluTrend2)
##
## Call:
## lm(formula = log(ILI) ~ Queries + log(ILILag2), data = FluTrain)
##
## Residuals:
## Min 1Q Median 3Q Max
## -0.52209 -0.11082 -0.01819 0.08143 0.76785
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) -0.24064 0.01953 -12.32 <2e-16 ***
## Queries 1.25578 0.07910 15.88 <2e-16 ***
## log(ILILag2) 0.65569 0.02251 29.14 <2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.1703 on 412 degrees of freedom
## (2 observations deleted due to missingness)
## Multiple R-squared: 0.9063, Adjusted R-squared: 0.9059
## F-statistic: 1993 on 2 and 412 DF, p-value: < 2.2e-16
summary(FluTrend1)
##
## Call:
## lm(formula = log(ILI) ~ Queries, data = FluTrain)
##
## Residuals:
## Min 1Q Median 3Q Max
## -0.76003 -0.19696 -0.01657 0.18685 1.06450
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) -0.49934 0.03041 -16.42 <2e-16 ***
## Queries 2.96129 0.09312 31.80 <2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.2995 on 415 degrees of freedom
## Multiple R-squared: 0.709, Adjusted R-squared: 0.7083
## F-statistic: 1011 on 1 and 415 DF, p-value: < 2.2e-16
So far, we have only added the ILILag2 variable to the FluTrain data frame. To make predictions with our FluTrend2 model, we will also need to add ILILag2 to the FluTest data frame (note that adding variables before splitting into a training and testing set can prevent this duplication of effort).
Modify the code from the previous subproblem to add an ILILag2 variable to the FluTest data frame. How many missing values are there in this new variable?
ILILag2.1 <- lag(zoo(FluTest$ILI), -2, na.pad=TRUE)
FluTest$ILILag2.1 <- coredata(ILILag2.1)
Fill in the missing values for ILILag2 in FluTest. In terms of syntax, you could set the value of ILILag2 in row “x” of the FluTest data frame to the value of ILI in row “y” of the FluTrain data frame with “FluTest\(ILILag2[x] = FluTrain\)ILI[y]”. Use the answer to the previous questions to determine the appropriate values of “x” and “y”. It may be helpful to check the total number of rows in FluTrain using str(FluTrain) or nrow(FluTrain).
What is the new value of the ILILag2 variable in the first row of FluTest?
str(FluTrain)
## 'data.frame': 417 obs. of 4 variables:
## $ Week : chr "2004-01-04 - 2004-01-10" "2004-01-11 - 2004-01-17" "2004-01-18 - 2004-01-24" "2004-01-25 - 2004-01-31" ...
## $ ILI : num 2.42 1.81 1.71 1.54 1.44 ...
## $ Queries: num 0.238 0.22 0.226 0.238 0.224 ...
## $ ILILag2: num NA NA 2.42 1.81 1.71 ...
nrow(FluTrain)
## [1] 417
#Thus
FluTest$ILILag2.1[1] = FluTrain$ILI[416]
FluTrain$ILI[416]
## [1] 1.852736
FluTest$ILILag2.1[2] = FluTrain$ILI[417]
FluTrain$ILI[417]
## [1] 2.12413
Obtain test set predictions of the ILI variable from the FluTrend2 model, again remembering to call the exp() function on the result of the predict() function to obtain predictions for ILI instead of log(ILI).
What is the test-set RMSE of the FluTrend2 model?
colnames(FluTest)[colnames(FluTest) == 'ILILag2.1'] <- 'ILILag2'
PredTest2 <- exp(predict(FluTrend2, newdata = FluTest))
RMSE_2 <- sqrt(mean((PredTest2 - FluTest$ILI)^2))
RMSE_2
## [1] 0.2942029
In this problem, we used a simple time series model with a single lag term. ARIMA models are a more general form of the model we built, which can include multiple lag terms as well as more complicated combinations of previous values of the dependent variable. If you’re interested in learning more, check out ?arima or the available online tutorials for these sorts of models.
ASSIGNMENT OFFICIALLY ENDS HERE – Optional tasks 2 more! if you wish