HW Week 10. Exercises 2, 5-7 on page 272 and 273.








2. Download and library the nlme package and use data (“Blackmore”) to activate the Blackmore data set. Inspect the data and create a box plot showing the exercise level at different ages. Run a repeated measures ANOVA to compare exercise levels at ages 8, 10, and 12 using aov(). You can use a command like, myData <-Blackmore[Blackmore\(age <=12,], to subset the data. Keeping in mind that the data will need to be balanced before you can conduct this analysis, try running a command like this, table(myData\)subject,myData$age)), as the starting point for cleaning up the data set.





# Install the nlme package
# install.packages("nlme")

# Load the required package
library(nlme)

# Load the Blackmore dataset from the GitHub URL
url <- "https://raw.githubusercontent.com/vincentarelbundock/Rdatasets/master/csv/carData/Blackmore.csv"
Blackmore <- read.csv(url)

# Inspect the dataset
str(Blackmore)
## 'data.frame':    945 obs. of  5 variables:
##  $ rownames: int  1 2 3 4 5 6 7 8 9 10 ...
##  $ subject : chr  "100" "100" "100" "100" ...
##  $ age     : num  8 10 12 14 15.9 ...
##  $ exercise: num  2.71 1.94 2.36 1.54 8.63 0.14 0.14 0 0 5.08 ...
##  $ group   : chr  "patient" "patient" "patient" "patient" ...
summary(Blackmore)
##     rownames     subject               age           exercise     
##  Min.   :  1   Length:945         Min.   : 8.00   Min.   : 0.000  
##  1st Qu.:237   Class :character   1st Qu.:10.00   1st Qu.: 0.400  
##  Median :473   Mode  :character   Median :12.00   Median : 1.330  
##  Mean   :473                      Mean   :11.44   Mean   : 2.531  
##  3rd Qu.:709                      3rd Qu.:14.00   3rd Qu.: 3.040  
##  Max.   :945                      Max.   :17.92   Max.   :29.960  
##     group          
##  Length:945        
##  Class :character  
##  Mode  :character  
##                    
##                    
## 
# Create a box plot showing the exercise level at different ages
library(ggplot2)
ggplot(Blackmore, aes(x = factor(age), y = exercise)) +
  geom_boxplot() +
  labs(title = "Exercise Level at Different Ages", x = "Age", y = "Exercise Level")

# Subset the data for ages 8, 10, and 12
myData <- Blackmore[Blackmore$age <= 12, ]

# Check the balance of the data
table(myData$subject, myData$age)
##       
##        8 10 11.58 11.83 12
##   100  1  1     0     0  1
##   101  1  1     0     0  1
##   102  1  1     0     0  1
##   103  1  1     0     0  1
##   104  1  1     0     0  1
##   105  1  1     0     0  1
##   106  1  1     0     0  1
##   107  1  1     0     0  1
##   108  1  1     0     0  1
##   109  1  1     0     0  0
##   110  1  1     0     0  1
##   111  1  1     0     0  1
##   112  1  1     0     0  1
##   113  1  1     0     0  1
##   114  1  1     0     0  1
##   115  1  1     0     0  1
##   116  1  1     0     0  1
##   117  1  1     0     0  1
##   118  1  1     0     0  1
##   119  1  1     0     0  1
##   120  1  1     0     0  1
##   121  1  1     0     0  1
##   122  1  1     0     0  1
##   123  1  1     0     0  1
##   124  1  1     0     0  1
##   125  1  1     0     0  1
##   126  1  1     0     0  1
##   127  1  1     0     0  1
##   128  1  1     0     0  0
##   129  1  1     0     0  1
##   130  1  1     0     0  1
##   132  1  1     0     0  0
##   133  1  1     0     0  1
##   134  1  1     0     0  1
##   135  1  1     0     0  1
##   136  1  1     0     0  1
##   137  1  1     0     0  1
##   138  1  1     0     0  1
##   139  1  1     0     0  1
##   140  1  1     0     0  1
##   141  1  1     0     0  1
##   142  1  1     0     0  1
##   143  1  1     0     0  1
##   144  1  1     0     0  1
##   145  1  1     0     0  1
##   146  1  1     0     0  1
##   147  1  1     0     0  0
##   148  1  1     0     0  1
##   149  1  1     0     0  0
##   150  1  1     0     0  1
##   151  1  1     0     0  1
##   152  1  1     0     0  1
##   153  1  1     0     0  1
##   154  1  1     0     0  1
##   155  1  1     0     0  0
##   156  1  1     0     0  1
##   157  1  1     0     0  1
##   158  1  1     0     0  0
##   159  1  1     0     0  1
##   160  1  1     0     0  1
##   161  1  1     0     0  1
##   162  1  1     0     0  0
##   163  1  1     0     0  1
##   164  1  1     0     0  1
##   165  1  1     0     0  1
##   166  1  1     0     0  1
##   167  1  1     0     0  1
##   168  1  0     1     0  0
##   169  1  1     0     0  0
##   170  1  1     0     0  1
##   171  1  1     0     0  1
##   172  1  1     0     0  1
##   173  1  1     0     0  1
##   174  1  1     0     0  0
##   175  1  1     0     0  1
##   176  1  1     0     0  1
##   177  1  1     0     0  1
##   178  1  1     0     0  1
##   179  1  1     0     0  1
##   180  1  1     0     0  1
##   181  1  1     0     0  1
##   182  1  1     0     0  1
##   183  1  1     0     0  1
##   184  1  1     0     0  0
##   185  1  1     0     0  1
##   186  1  1     0     0  1
##   187  1  1     0     0  1
##   188  1  1     0     0  1
##   189  1  1     0     0  1
##   190  1  1     0     0  1
##   192  1  1     0     0  1
##   193  1  1     0     0  0
##   194  1  1     0     0  1
##   195  1  1     0     0  1
##   196  1  1     0     0  0
##   198  1  1     0     0  1
##   199  1  1     0     0  1
##   200  1  1     0     0  1
##   201  1  1     0     0  0
##   202  1  1     0     0  1
##   203  1  1     0     0  1
##   204  1  1     0     0  1
##   205  1  1     0     0  1
##   206  1  1     0     0  0
##   207a 1  1     0     0  1
##   207b 1  1     0     0  1
##   208  1  1     0     0  1
##   209  1  1     0     0  1
##   210  1  1     0     0  1
##   211  1  1     0     0  0
##   212  1  1     0     0  1
##   213  1  1     0     0  1
##   214  1  1     0     0  1
##   215  1  1     0     0  1
##   216  1  1     0     0  0
##   217  1  1     0     0  0
##   218  1  1     0     0  1
##   219  1  1     0     0  1
##   220  1  1     0     0  1
##   221  1  1     0     0  0
##   222  1  0     0     1  0
##   223  1  1     0     0  0
##   224  1  1     0     0  1
##   225  1  1     0     0  1
##   226  1  1     0     0  0
##   227  1  1     0     0  0
##   228  1  1     0     0  1
##   229a 1  1     0     0  1
##   229b 1  1     0     0  1
##   230  1  1     0     0  1
##   231  1  1     0     0  1
##   232  1  1     0     0  1
##   233  1  1     0     0  1
##   234  1  1     0     0  0
##   235  1  1     0     0  0
##   236  1  1     0     0  1
##   237  1  1     0     0  1
##   238  1  1     0     0  1
##   239  1  1     0     0  0
##   240  1  1     0     0  0
##   241  1  1     0     0  0
##   242  1  1     0     0  0
##   243  1  1     0     0  1
##   244  1  1     0     0  1
##   245  1  1     0     0  1
##   246  1  1     0     0  1
##   247  1  1     0     0  1
##   248  1  1     0     0  0
##   249  1  1     0     0  1
##   250  1  1     0     0  1
##   251  1  1     0     0  1
##   252  1  1     0     0  0
##   253  1  1     0     0  1
##   254  1  1     0     0  0
##   255  1  1     0     0  1
##   255b 1  1     0     0  1
##   256  1  1     0     0  1
##   257  1  1     0     0  1
##   258  1  1     0     0  0
##   259  1  1     0     0  1
##   260  1  1     0     0  0
##   261  1  1     0     0  0
##   262  1  1     0     0  0
##   263  1  1     0     0  0
##   264  1  1     0     0  0
##   265  1  1     0     0  0
##   266  1  1     0     0  0
##   267  1  1     0     0  1
##   268  1  1     0     0  1
##   269  1  1     0     0  0
##   270  1  1     0     0  1
##   271  1  1     0     0  1
##   272  1  1     0     0  0
##   273a 1  1     0     0  1
##   273b 1  1     0     0  1
##   274  1  1     0     0  0
##   275  1  1     0     0  1
##   276  1  1     0     0  1
##   277  1  1     0     0  1
##   278  1  1     0     0  0
##   279a 1  1     0     0  0
##   279b 1  1     0     0  1
##   280a 1  1     0     0  1
##   280b 1  1     0     0  1
##   281  1  1     0     0  1
##   282  1  1     0     0  1
##   283  1  1     0     0  0
##   284  1  1     0     0  1
##   285  1  1     0     0  0
##   286  1  1     0     0  1
##   300  1  1     0     0  1
##   301  1  1     0     0  1
##   302  1  1     0     0  1
##   303  1  1     0     0  1
##   304  1  1     0     0  1
##   305  1  1     0     0  1
##   306  1  1     0     0  1
##   307  1  1     0     0  1
##   308  1  1     0     0  1
##   309  1  1     0     0  1
##   310  1  1     0     0  1
##   311  1  1     0     0  1
##   312  1  1     0     0  1
##   313  1  1     0     0  1
##   314  1  1     0     0  1
##   315  1  1     0     0  0
##   316  1  1     0     0  0
##   317  1  1     0     0  1
##   318  1  1     0     0  1
##   319  1  1     0     0  1
##   320  1  1     0     0  1
##   321  1  1     0     0  1
##   322  1  1     0     0  0
##   323  1  1     0     0  1
##   324  1  1     0     0  0
##   325  1  1     0     0  1
##   326  1  1     0     0  1
##   327  1  1     0     0  1
##   328  1  1     0     0  1
##   329  1  1     0     0  0
##   330  1  1     0     0  1
##   331  1  1     0     0  1
##   332  1  1     0     0  1
##   333  1  1     0     0  1
##   334  1  1     0     0  1
##   335  1  1     0     0  1
##   336  1  1     0     0  1
##   337  1  1     0     0  0
##   338  1  1     0     0  1
##   340  1  1     0     0  1
##   341  1  1     0     0  0
# Filter the data for ages 8, 10, and 12
filtered_data <- Blackmore[Blackmore$age %in% c(8, 10, 12), ]

# Ensure that only subjects with observations at all three age points are included
# Find subjects with complete data
complete_subjects <- table(filtered_data$subject)
complete_subjects <- names(complete_subjects[complete_subjects == 3])

# Subset the data to include only complete subjects
cleaned_data <- filtered_data[filtered_data$subject %in% complete_subjects, ]

# Check the balance of the cleaned data
table(cleaned_data$subject, cleaned_data$age)
##       
##        8 10 12
##   100  1  1  1
##   101  1  1  1
##   102  1  1  1
##   103  1  1  1
##   104  1  1  1
##   105  1  1  1
##   106  1  1  1
##   107  1  1  1
##   108  1  1  1
##   110  1  1  1
##   111  1  1  1
##   112  1  1  1
##   113  1  1  1
##   114  1  1  1
##   115  1  1  1
##   116  1  1  1
##   117  1  1  1
##   118  1  1  1
##   119  1  1  1
##   120  1  1  1
##   121  1  1  1
##   122  1  1  1
##   123  1  1  1
##   124  1  1  1
##   125  1  1  1
##   126  1  1  1
##   127  1  1  1
##   129  1  1  1
##   130  1  1  1
##   133  1  1  1
##   134  1  1  1
##   135  1  1  1
##   136  1  1  1
##   137  1  1  1
##   138  1  1  1
##   139  1  1  1
##   140  1  1  1
##   141  1  1  1
##   142  1  1  1
##   143  1  1  1
##   144  1  1  1
##   145  1  1  1
##   146  1  1  1
##   148  1  1  1
##   150  1  1  1
##   151  1  1  1
##   152  1  1  1
##   153  1  1  1
##   154  1  1  1
##   156  1  1  1
##   157  1  1  1
##   159  1  1  1
##   160  1  1  1
##   161  1  1  1
##   163  1  1  1
##   164  1  1  1
##   165  1  1  1
##   166  1  1  1
##   167  1  1  1
##   170  1  1  1
##   171  1  1  1
##   172  1  1  1
##   173  1  1  1
##   175  1  1  1
##   176  1  1  1
##   177  1  1  1
##   178  1  1  1
##   179  1  1  1
##   180  1  1  1
##   181  1  1  1
##   182  1  1  1
##   183  1  1  1
##   185  1  1  1
##   186  1  1  1
##   187  1  1  1
##   188  1  1  1
##   189  1  1  1
##   190  1  1  1
##   192  1  1  1
##   194  1  1  1
##   195  1  1  1
##   198  1  1  1
##   199  1  1  1
##   200  1  1  1
##   202  1  1  1
##   203  1  1  1
##   204  1  1  1
##   205  1  1  1
##   207a 1  1  1
##   207b 1  1  1
##   208  1  1  1
##   209  1  1  1
##   210  1  1  1
##   212  1  1  1
##   213  1  1  1
##   214  1  1  1
##   215  1  1  1
##   218  1  1  1
##   219  1  1  1
##   220  1  1  1
##   224  1  1  1
##   225  1  1  1
##   228  1  1  1
##   229a 1  1  1
##   229b 1  1  1
##   230  1  1  1
##   231  1  1  1
##   232  1  1  1
##   233  1  1  1
##   236  1  1  1
##   237  1  1  1
##   238  1  1  1
##   243  1  1  1
##   244  1  1  1
##   245  1  1  1
##   246  1  1  1
##   247  1  1  1
##   249  1  1  1
##   250  1  1  1
##   251  1  1  1
##   253  1  1  1
##   255  1  1  1
##   255b 1  1  1
##   256  1  1  1
##   257  1  1  1
##   259  1  1  1
##   267  1  1  1
##   268  1  1  1
##   270  1  1  1
##   271  1  1  1
##   273a 1  1  1
##   273b 1  1  1
##   275  1  1  1
##   276  1  1  1
##   277  1  1  1
##   279b 1  1  1
##   280a 1  1  1
##   280b 1  1  1
##   281  1  1  1
##   282  1  1  1
##   284  1  1  1
##   286  1  1  1
##   300  1  1  1
##   301  1  1  1
##   302  1  1  1
##   303  1  1  1
##   304  1  1  1
##   305  1  1  1
##   306  1  1  1
##   307  1  1  1
##   308  1  1  1
##   309  1  1  1
##   310  1  1  1
##   311  1  1  1
##   312  1  1  1
##   313  1  1  1
##   314  1  1  1
##   317  1  1  1
##   318  1  1  1
##   319  1  1  1
##   320  1  1  1
##   321  1  1  1
##   323  1  1  1
##   325  1  1  1
##   326  1  1  1
##   327  1  1  1
##   328  1  1  1
##   330  1  1  1
##   331  1  1  1
##   332  1  1  1
##   333  1  1  1
##   334  1  1  1
##   335  1  1  1
##   336  1  1  1
##   338  1  1  1
##   340  1  1  1
# Perform repeated measures ANOVA on the cleaned data
anova_results <- aov(exercise ~ age + Error(subject/age), data = cleaned_data)
summary(anova_results)
## 
## Error: subject
##            Df Sum Sq Mean Sq F value Pr(>F)
## Residuals 175   1941   11.09               
## 
## Error: subject:age
##            Df Sum Sq Mean Sq F value   Pr(>F)    
## age         1   99.4   99.43   39.89 2.15e-09 ***
## Residuals 175  436.2    2.49                     
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Error: Within
##            Df Sum Sq Mean Sq F value Pr(>F)
## Residuals 176  197.5   1.122
# The significant p-value for the effect of age (2.15e-09) suggests that there is a significant change in exercise levels within subjects as they age. This means that exercise levels differ significantly at ages 8, 10, and 12 within individuals.





5. Given that the AirPassengers data set has a substantial growth trend, use diff() to create a differenced data set. Use plot() to examine and interpret the results of differencing. Use cpt.var() to find the change point in the variability of the differenced time series.

Plot the result and describe in your own words what the change point signifies.





# Install and load the changepoint package
#install.packages("changepoint")
library(changepoint)
## Loading required package: zoo
## 
## Attaching package: 'zoo'
## The following objects are masked from 'package:base':
## 
##     as.Date, as.Date.numeric
## Successfully loaded changepoint package version 2.2.4
##  See NEWS for details of changes.
# Load the AirPassengers dataset
data("AirPassengers")

# Create a differenced dataset
diff_airpassengers <- diff(AirPassengers)

# Plot the differenced dataset
plot(diff_airpassengers, main = "Differenced AirPassengers Data", ylab = "Differenced Passengers", xlab = "Time")

# Find the change point in the variability of the differenced time series
cpt_var_results <- cpt.var(diff_airpassengers)

# Plot the results
plot(cpt_var_results, main = "Change Point in Variability of Differenced AirPassengers Data")

# The plot indicates that there is a significant change in the variability of the differenced AirPassengers data around 1955. This change suggests that the pattern of variability in air passenger numbers became more pronounced and fluctuated more after this point in time.







6. Use cpt.mean() on the AirPassengers time series. Plot and interpret the results.Compare the change point of the mean that you uncovered in this case to the change point in the variance that you uncovered in Exercise 5. What do these change points suggest about the history of air travel?





# Find the change point in the mean of the AirPassengers time series
cpt_mean_results <- cpt.mean(AirPassengers)

# Plot the results
plot(cpt_mean_results, main = "Change Point in Mean of AirPassengers Data")

# Before the first change point (around 1951), the mean level of air passengers is relatively low and stable.
# After the first change point, there is a noticeable increase in the mean level of air passengers, indicating a shift to a higher average number of passengers.
#   The second change point (around 1955) indicates another increase in the mean level, showing a further rise in the average number of air passengers.







7. Find historical information about air travel on the Internet and/or in reference materials that sheds light on the results from Exercises 5 and 6. Write a mini-article (less than 250 words) that interprets your statistical findings from Exercises 5 and 6 in the context of the historical information you found.





Mini-Article: Historical Context of Air Travel Growth in the 1950s

The statistical analysis of the AirPassengers dataset revealed significant change points in the mean and variability of air passenger numbers during the early 1950s. Specifically, the mean level of air passengers saw notable increases around 1951 and 1955, while the variability of the differenced data also showed a significant change around 1955.

Several historical factors contribute to these findings. In the early 1950s, the commercial aviation industry experienced rapid growth driven by technological advancements and economic prosperity. The introduction of more efficient aircraft, such as the Douglas DC-6 and Lockheed Constellation, allowed airlines to offer faster and more reliable services. Additionally, the post-World War II economic boom increased disposable income, enabling more people to afford air travel.

Source: History of the Douglas DC-6

In 1951, the Federal Aviation Administration (FAA) was established in the United States, leading to improved safety standards and infrastructure developments. These changes boosted public confidence in air travel. The mid-1950s also saw the rise of airline marketing campaigns that promoted air travel as a glamorous and convenient mode of transportation, further increasing demand.

Source: History of the FAA

The spike in variability around 1955 can be attributed to the industry’s adaptation to rapidly growing demand and the challenges of scaling operations. Airlines expanded their fleets and routes, leading to fluctuations in passenger numbers as they adjusted to the new capacity.

Source: Airline Marketing in the 1950s

In conclusion, the statistical change points in air travel during the early 1950s align with significant historical developments in the aviation industry. Technological advancements, economic growth, and regulatory changes all contributed to the rapid increase in air passenger numbers, reflecting the broader trends of post-war modernization and economic expansion.

Source: Post-War Economic Boom