US Macroeconomic Data (1957–2005, Stock & Watson)
Documentation: https://vincentarelbundock.github.io/Rdatasets/doc/AER/USMacroSW.html
CSV: https://vincentarelbundock.github.io/Rdatasets/csv/AER/USMacroSW.csv
# github csv location
csvfile <- 'https://raw.githubusercontent.com/dab31415/SPS-Bridge-R/main/USMacroSW.csv'
df <- read.csv(csvfile)
Use the summary function to gain an overview of the data set. Then display the mean and median for at least two attributes.
summary(df)
##        X           unemp             cpi             ffrate
##  Min.   :  1   Min.   : 3.400   Min.   : 27.78   Min.   : 0.930
##  1st Qu.: 49   1st Qu.: 5.000   1st Qu.: 35.87   1st Qu.: 3.480
##  Median : 97   Median : 5.700   Median : 87.93   Median : 5.400
##  Mean   : 97   Mean   : 5.891   Mean   : 91.73   Mean   : 5.953
##  3rd Qu.:145   3rd Qu.: 6.833   3rd Qu.:143.07   3rd Qu.: 7.760
##  Max.   :193   Max.   :10.667   Max.   :192.17   Max.   :19.100
##      tbill            tbond           gbpusd          gdpjp
##  Min.   : 0.830   Min.   : 1.01   Min.   :112.5   Min.   : 10149
##  1st Qu.: 3.500   1st Qu.: 3.91   1st Qu.:159.6   1st Qu.: 57632
##  Median : 5.080   Median : 5.62   Median :185.5   Median :254560
##  Mean   : 5.435   Mean   : 6.04   Mean   :204.9   Mean   :259306
##  3rd Qu.: 6.740   3rd Qu.: 7.55   3rd Qu.:246.9   3rd Qu.:482328
##  Max.   :15.490   Max.   :16.52   Max.   :281.5   Max.   :523638
sprintf('3-month treasury bill: mean = %.3f; median = %.3f',mean(df$tbill),median(df$tbill))
## [1] "3-month treasury bill: mean = 5.435; median = 5.080"
sprintf('1-year treasury bond: mean = %.3f; median = %.3f',mean(df$tbond),median(df$tbond))
## [1] "1-year treasury bond: mean = 6.040; median = 5.620"
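The two sprintf calls above repeat the same pattern for each column. As a possible shortcut, sapply can compute both statistics for several columns in one pass. This is only a sketch: the toy data frame and its values are made up for illustration and stand in for df.

```r
# Toy data standing in for df; these values are invented, not from USMacroSW.
toy <- data.frame(tbill = c(1.5, 3.0, 4.5, 9.0),
                  tbond = c(2.0, 3.5, 5.0, 9.5))

# sapply applies the function to each selected column and binds the
# results into a matrix: one row per statistic, one column per attribute.
stats <- sapply(toy[, c("tbill", "tbond")],
                function(x) c(mean = mean(x), median = median(x)))

print(round(stats, 3))
```

Individual values can then be read by name, for example stats["median", "tbill"].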
Create a new data frame with a subset of the columns and rows. Make sure to rename it.
# select the first 50 rows, with columns X, unemp, tbill, and tbond
my_df <- df[1:50,c(1:2,5:6)]
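Selecting columns by position works, but it depends on the column order of the CSV. A sketch of the same kind of subset written with column names is shown below. The toy data frame is made up for illustration and only mimics the first six column names of df.

```r
# Toy data with the same leading column names as df; values are invented.
toy <- data.frame(X      = 1:4,
                  unemp  = c(4.1, 4.2, 4.3, 4.4),
                  cpi    = c(27, 28, 29, 30),
                  ffrate = c(3.0, 3.1, 3.2, 3.3),
                  tbill  = c(3.1, 3.2, 3.3, 3.4),
                  tbond  = c(3.5, 3.6, 3.7, 3.8))

# Name the columns to keep instead of relying on their positions.
keep <- c("X", "unemp", "tbill", "tbond")
by_name <- toy[1:2, keep]

# The positional form used above, for comparison.
by_position <- toy[1:2, c(1:2, 5:6)]

print(by_name)
```

Both forms select the same rows and columns here; the named form keeps working if columns are later added or reordered.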
Create new column names for the new data frame.
names(my_df) <- c('index','unemployment_rate','3-month tbill rate','1-year bond rate')
Use the summary function to create an overview of your new data frame. Then print the mean and median for the same two attributes. Please compare.
summary(my_df)
##      index       unemployment_rate 3-month tbill rate 1-year bond rate
##  Min.   : 1.00   Min.   :3.400     Min.   :0.830      Min.   :1.230
##  1st Qu.:13.25   1st Qu.:3.875     1st Qu.:2.772      1st Qu.:3.098
##  Median :25.50   Median :5.117     Median :3.500      Median :3.875
##  Mean   :25.50   Mean   :5.008     Mean   :3.603      Mean   :4.054
##  3rd Qu.:37.75   3rd Qu.:5.625     3rd Qu.:4.410      3rd Qu.:4.970
##  Max.   :50.00   Max.   :7.367     Max.   :6.440      Max.   :7.040
sprintf('3-month treasury bill: mean = %.3f; median = %.3f',mean(my_df$`3-month tbill rate`),median(my_df$`3-month tbill rate`))
## [1] "3-month treasury bill: mean = 3.603; median = 3.500"
sprintf('1-year treasury bond: mean = %.3f; median = %.3f',mean(my_df$`1-year bond rate`),median(my_df$`1-year bond rate`))
## [1] "1-year treasury bond: mean = 4.054; median = 3.875"
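Interest rates on both treasury bills and treasury bonds were lower in the first 50 quarters than over the full dataset. The mean 3-month bill rate was 3.603 versus 5.435, and the mean 1-year bond rate was 4.054 versus 6.040. The medians show the same pattern (3.500 versus 5.080 for bills and 3.875 versus 5.620 for bonds).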
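The comparison can also be laid out as a small table instead of being read off separate printouts. The sketch below stacks the column means of a full data frame and of its first rows. The toy data and the object names are hypothetical and only illustrate the layout.

```r
# Toy data standing in for the full dataset; values are invented.
toy <- data.frame(tbill = c(2, 3, 4, 7, 9),
                  tbond = c(3, 4, 5, 8, 10))

# The first three rows play the role of the 50-quarter subset.
first_rows <- toy[1:3, ]

# One row per data frame, one column per attribute.
comparison <- rbind(full       = colMeans(toy),
                    first_rows = colMeans(first_rows))

print(comparison)
```

Replacing colMeans with sapply(..., median) would give the matching table of medians.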
For at least 3 values in a column please rename so that every value in that column is renamed. For example, suppose I have 20 values of the letter “e” in one column. Rename those values so that all 20 would show as “excellent”.
my_df$year <- 1957 + floor((my_df$index-1)/4)
my_df$qtr[(my_df$index-1) %% 4 == 0] <- 'First'
my_df$qtr[(my_df$index-1) %% 4 == 1] <- 'Second'
my_df$qtr[(my_df$index-1) %% 4 == 2] <- 'Third'
my_df$qtr[(my_df$index-1) %% 4 == 3] <- 'Fourth'
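The four assignments above can be collapsed into a single lookup, sketched below on a standalone index vector rather than on my_df. The last lines show the same idea with a named vector, which is closer to the exercise as written. The letter codes and their labels are hypothetical.

```r
# Standalone index vector standing in for my_df$index.
index <- 1:10

# Position in the label vector is the quarter number (1 to 4).
quarter_labels <- c("First", "Second", "Third", "Fourth")
qtr  <- quarter_labels[(index - 1) %% 4 + 1]
year <- 1957 + (index - 1) %/% 4

print(data.frame(index, year, qtr))

# Recoding letter codes with a named lookup vector (hypothetical codes).
codes  <- c("e", "g", "e")
recode <- c(e = "excellent", g = "good")
grades <- unname(recode[codes])
print(grades)
```

Because the lookup is vectorized, every value in the column is assigned in one step and no row is left as NA by a missed condition.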
The dataset I selected has no categorical column, so it wasn’t conducive to completing this exercise as written. Instead, I used similar selection techniques to populate new columns for year and quarter. I also tried to build a datetime field, but without much luck.
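One way a date field might be built in base R is to turn each year and quarter into the first day of that quarter. This is a sketch on standalone vectors, assuming the quarter is available as a number from 1 to 4. The variable names are hypothetical and not part of my_df.

```r
# Standalone year and quarter-number vectors (integers).
year    <- c(1957L, 1957L, 1957L, 1957L, 1958L)
qtr_num <- c(1L, 2L, 3L, 4L, 1L)

# First month of each quarter: 1, 4, 7, 10.
first_month <- (qtr_num - 1L) * 3L + 1L

# Build ISO date strings and convert them to Date objects.
quarter_start <- as.Date(sprintf("%d-%02d-01", year, first_month))

print(quarter_start)
print(quarters(quarter_start))  # base R helper that labels each Date by quarter
```

A Date column built this way sorts chronologically and can be used directly on a plot axis.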
Display enough rows to see examples of all of steps 1-5 above.
head(my_df,15)