This practical accompanies the “Introduction to R” document for MCom (Economics) students. Nothing in this task is for submission; it is simply an exercise to assist your learning.
Open R Studio and create a new project in which you will complete the assignment.
Ensure that you are working from the new project and download the necessary assignment materials using this link. Extract the data folder from the compressed folder you have downloaded and copy it to your root directory.
Hint: Run getwd() in your console to check the file path of your working directory.
Create an R script called [yourstudentnumber].R in your directory. Write and execute the code necessary to complete the rest of the assignment in this R script.
Hint: In .R files, comments are created using the pound sign, i.e., #.
Load the pacman package using install.packages("") (if necessary) and library().
Install/load the following packages using the pacman package: tidyverse, huxtable, fixest, readxl, tsibble, mFilter.
Create a folder named "data" in your current directory using R.
Directory already exists: ./data
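The check-then-create pattern behind a message like the one above can be sketched as follows. This is only an illustration: it works inside a temporary directory (an assumption made so that the sketch does not touch your project), and the names base_dir and data_dir are hypothetical.

```r
# Work inside a throwaway directory so the sketch leaves your project alone.
base_dir <- tempfile("practical_")
dir.create(base_dir)

# Path of the folder we want to end up with.
data_dir <- file.path(base_dir, "data")

# Create the folder only if it is missing; otherwise just report it.
if (dir.exists(data_dir)) {
  message("Directory already exists: ", data_dir)
} else {
  dir.create(data_dir)
  message("Directory created: ", data_dir)
}
```

In your own script you would point the path at "data" relative to your project directory instead of the temporary location used here.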
Create a variable called url that takes the value below: https://www.columbia.edu/~mu2166/book/empirics/usg_data_quarterly.xls
Use the variable called url to save the Excel spreadsheet at that location in the new directory.
Complete the following path object. You can use the file name usg_data_quarterly.xls:
file_path <- file.path([directory where data is to be stored], [file name])
download.file([file's address on the internet] , [the file's path once downloaded], mode = "wb")
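As a rough illustration of how the two templates fit together, the sketch below builds a path with file.path() and shows where download.file() would go. It assumes a temporary folder rather than your project's data folder, and it takes the file name from the URL with basename(), which is one possible choice rather than a requirement. The download itself is left commented out because it needs internet access.

```r
url <- "https://www.columbia.edu/~mu2166/book/empirics/usg_data_quarterly.xls"

# Hypothetical storage location: a "data" folder inside the session's temp dir.
data_dir <- file.path(tempdir(), "data")
dir.create(data_dir, showWarnings = FALSE)

# file.path() joins the pieces with the correct separator for your system.
file_path <- file.path(data_dir, basename(url))

# mode = "wb" writes the file in binary mode, which matters for .xls files
# on Windows. Uncomment to run the actual download:
# download.file(url, file_path, mode = "wb")
```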
Create an array called countries that keeps only the Country Name column from the data you imported.
Replace the countries array with an array that contains only unique() values.
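Because each country appears once per indicator in the raw data, the Country Name column contains many repeats. A small sketch with made-up country names shows what unique() does to such a vector:

```r
# Hypothetical vector with repeated entries, standing in for Country Name.
toy_names <- c("Chile", "Chile", "Kenya", "Chile", "Kenya", "Peru")

# unique() keeps the first occurrence of each value, in order of appearance.
toy_unique <- unique(toy_names)
print(toy_unique)
```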
Load the function below into R and then include your student number in the [student number] space below.
# Function to assign you a random country based on your student number
my_countries <- function(studentnumber, array) {
  set.seed(studentnumber)
  # Fixed first two numbers
  country_index <- c(204, 174)
  # Generate the third number, ensuring it is not 204 or 174
  repeat {
    third_number <- sample(1:214, 1) # Generate a random number
    if (third_number != 204 && third_number != 174) break # Exit loop if valid
  }
  # Add the third number to the list of indices
  country_index <- c(country_index, third_number)
  # Get the corresponding countries
  selected_countries <- array[country_index]
  # Create a dataframe with the selected countries and their indices
  return_countries <- data.frame(Index = country_index, Country = selected_countries)
  return(return_countries)
}
keep_countries <- my_countries(123456789, countries)
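The function relies on set.seed() so that the same student number always yields the same third country. The sketch below isolates that step in a hypothetical helper, draw_third_index(), which is not part of the assignment code and only mirrors its sampling logic.

```r
# Hypothetical helper: draw one index in 1..n that is not among the fixed
# indices, reproducibly for a given seed.
draw_third_index <- function(seed, n = 214, fixed = c(204, 174)) {
  set.seed(seed)
  repeat {
    candidate <- sample(1:n, 1)
    if (!(candidate %in% fixed)) break
  }
  candidate
}

# The same seed gives the same draw every time.
first_draw <- draw_third_index(123456789)
second_draw <- draw_third_index(123456789)
identical(first_draw, second_draw)
```

Calling the helper with a different seed will generally, though not necessarily, select a different index.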
Merge keep_countries to usg_data using the data generated above. Call this new data frame raw_countries.
You will see that your dataset is unfortunately quite unusable. Let’s fix that by reshaping the data into the standard panel format.
long_data <- raw_countries %>%
  pivot_longer(
    cols = `1960`:`2011`,
    names_to = "Year",
    values_to = "Value"
  )
long_data <- long_data %>%
  select(-`Indicator Name`) %>%
  distinct() %>%
  pivot_wider(
    names_from = `Indicator Code`,
    values_from = Value
  )
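To see what the two reshaping steps do, it can help to run them on a tiny made-up table. The sketch below assumes the dplyr and tidyr packages are installed; the data, the indicator codes GDP and CONS, and the object names are hypothetical.

```r
library(dplyr)
library(tidyr)

# Hypothetical wide table: one row per country-indicator, one column per year.
toy <- tibble(
  `Country Name` = c("A", "A"),
  `Indicator Name` = c("GDP per capita", "Consumption"),
  `Indicator Code` = c("GDP", "CONS"),
  `2010` = c(1, 2),
  `2011` = c(3, 4)
)

# Step 1: stack the year columns, giving one row per country-indicator-year.
toy_long <- toy %>%
  pivot_longer(cols = `2010`:`2011`, names_to = "Year", values_to = "Value")

# Step 2: spread the indicators into columns, giving one row per country-year.
toy_panel <- toy_long %>%
  select(-`Indicator Name`) %>%
  pivot_wider(names_from = `Indicator Code`, values_from = Value)

print(toy_panel)
```

The result has one row per country and year with one column per indicator, which is the panel layout the rest of the practical works with. Note that Year is still a character column at this point.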
I include code with which you can get labels for the variables in the dataset as well. You don’t have to comment on this:
# Attach custom attributes to store Indicator Name as metadata
for (code in unique(raw_countries$`Indicator Code`)) {
  # Find the corresponding indicator name
  label <- unique(raw_countries$`Indicator Name`[raw_countries$`Indicator Code` == code])
  # Assign the label as an attribute to the corresponding column
  attr(long_data[[code]], "label") <- label
}
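A label stored this way travels with the column itself, so changing only the column's name leaves it intact. The base R sketch below illustrates this on a made-up data frame; the label text and values are hypothetical.

```r
# Hypothetical one-column data frame using an indicator code as column name.
toy <- data.frame(NY.GDP.PCAP.KN = c(100, 110, 121))

# Attach a label to the column, as the loop above does for each indicator.
attr(toy[["NY.GDP.PCAP.KN"]], "label") <- "GDP per capita (constant LCU)"

# Renaming changes only the name; the column's attributes are untouched.
names(toy)[names(toy) == "NY.GDP.PCAP.KN"] <- "gdp_pcap"

label_after_rename <- attr(toy$gdp_pcap, "label")
print(label_after_rename)
```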
Convert the Year variable of long_data to a number.
long_data <- long_data %>%
  as_tsibble(index = Year, key = `Country Name`)
Rename NY.GDP.PCAP.KN to gdp_pcap. The variable must retain its label without any additional code. You may not create a new data frame, or change the data frame’s name for this task.
Create a variable called cons_pcap that contains the Consumption per Capita in constant local currency units.
Create a variable called gdi_pcap that contains the Investment per Capita in constant local currency units.
Combine tasks 1-6, along with your comments, into a single tidyverse pipe.
Create a data frame called data_to_use that includes the country name, country code, year, as well as the three per Capita measures you created above. Ensure that data_to_use has no rows with empty values.
Hint: Use a tidyverse pipe that looks something like this:
summary_stats <- data_to_use %>%
  as_tibble() %>% # Temporarily remove the tsibble structure
  group_by(...) %>%
  summarise(...
  )
# You can confirm you are on the right track with this
summary_stats
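For readers unfamiliar with the group_by() and summarise() pattern, here is a minimal sketch on made-up data. It assumes dplyr is installed; the grouping variable, the statistics chosen, and all names are hypothetical and are not meant to fill in the elided arguments above.

```r
library(dplyr)

# Hypothetical panel with two countries and three observations each.
toy <- tibble(
  country = c("A", "A", "A", "B", "B", "B"),
  gdp_pcap = c(1, 2, 3, 10, 20, 30)
)

# One output row per group, one column per summary statistic.
toy_stats <- toy %>%
  group_by(country) %>%
  summarise(
    mean_gdp = mean(gdp_pcap),
    sd_gdp = sd(gdp_pcap),
    n_obs = n()
  )

print(toy_stats)
```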
Estimate the following models for the United States only. In all questions use the fixest package.
2.1. \(lgdp\_pcap_t = \rho_0 + \tau Year_t\)
2.2. \(lgdp\_pcap_t = \rho_0 + \tau Year_t + \rho_1 lgdp\_pcap_{t-1}\)
2.3. \(\Delta lgdp\_pcap_t = \rho_0 + \tau Year_t + \rho_1 lgdp\_pcap_{t-1}\)
2.4. Combine all tables into a huxreg. It should look something like the below:
| | (1) | (2) | (3) |
|---|---|---|---|
| (Intercept) | -28.112 *** | -3.960 | -3.960 |
| | (0.694) | (3.078) | (3.078) |
| Year | 0.019 *** | 0.003 | 0.003 |
| | (0.000) | (0.002) | (0.002) |
| lgdp_pcap_lag | | 0.835 *** | -0.165 |
| | | (0.106) | (0.106) |
| N | 47 | 46 | 46 |
| R2 | 0.986 | 0.994 | 0.106 |
| logLik | 95.443 | 114.296 | 114.296 |
| AIC | -186.886 | -222.593 | -222.593 |

*** p < 0.001; ** p < 0.01; * p < 0.05.
2.5. Given your results above, would you say the data is stationary? Support your answer using graphs.
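When reading the table, it may help to note that models 2.2 and 2.3 are the same regression written two ways: subtracting \(lgdp\_pcap_{t-1}\) from both sides of 2.2 leaves the intercept and trend coefficient unchanged and lowers the lag coefficient by exactly one, which matches columns (2) and (3) (0.835 versus -0.165). The sketch below checks this on simulated data. It uses base R's lm() rather than fixest so that it runs without extra packages, and the data-generating process and all names are assumptions made for illustration.

```r
set.seed(1)

# Simulate a hypothetical trending AR(1) series.
n <- 50
y <- numeric(n)
y[1] <- 1
for (i in 2:n) {
  y[i] <- 0.2 + 0.01 * i + 0.8 * y[i - 1] + rnorm(1, sd = 0.05)
}

# Build the regression data: level, lag, and first difference.
toy <- data.frame(
  year = 1960 + 2:n,
  y = y[-1],
  y_lag = y[-n]
)
toy$dy <- toy$y - toy$y_lag

# Same regressors, level versus difference on the left-hand side.
m_level <- lm(y ~ year + y_lag, data = toy)
m_diff <- lm(dy ~ year + y_lag, data = toy)

# The lag coefficients differ by one; the trend coefficients coincide.
lag_gap <- unname(coef(m_level)["y_lag"] - coef(m_diff)["y_lag"])
trend_gap <- unname(coef(m_level)["year"] - coef(m_diff)["year"])
print(c(lag_gap = lag_gap, trend_gap = trend_gap))
```

A lag coefficient close to one in the level form, or equivalently close to zero in the difference form, is what makes the stationarity question worth examining graphically.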
The code below creates the linearly detrended lgdp_pcap as lgdp_pcap_ldt. Create the quadratically detrended series and comment on your code. (Hint: you can specify poly(Year, n) to create a polynomial of order n in the linear model environment, e.g. lm(y ~ poly(x, n)).)
data_to_use <- data_to_use %>%
  group_by_key() %>%
  mutate(
    # Linear detrending
    lgdp_pcap_ldt = residuals(lm(lgdp_pcap ~ Year, data = cur_data())),
  ) %>%
  ungroup()
data_to_use <- data_to_use %>%
  group_by_key() %>%
  mutate(
    # Quadratic detrending
    lgdp_pcap_qdt = residuals(lm(lgdp_pcap ~ poly(Year, 2), data = cur_data())),
  ) %>%
  ungroup()
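Detrending by regression means keeping the residuals from a fit of the series on a time trend. The base R sketch below shows the idea on a single made-up series that is an exact quadratic in time, so the outcome is known in advance: a linear trend leaves curvature in the residuals, while a quadratic trend removes everything. The series and names are hypothetical.

```r
year <- 1960:2011

# Hypothetical series: an exact quadratic in time, with no noise.
y <- 1 + 0.5 * (year - 1960) + 0.01 * (year - 1960)^2

# Linear detrending: residuals from a regression on a straight line.
linear_resid <- residuals(lm(y ~ year))

# Quadratic detrending: residuals from a regression on a second-order polynomial.
quad_resid <- residuals(lm(y ~ poly(year, 2)))

# Linear residuals sum to zero but still contain the curvature;
# quadratic residuals are zero up to floating-point error.
print(c(
  sum_linear = sum(linear_resid),
  max_abs_linear = max(abs(linear_resid)),
  max_abs_quadratic = max(abs(quad_resid))
))
```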
Create the HP-filtered lgdp_pcap using \(\lambda = 100\) for all series, and do the same for lcons_pcap. Then create the same for all the series using \(\lambda = 6.25\).
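Assuming the smoothing parameter here refers to the Hodrick-Prescott filter (the mFilter package loaded earlier provides hpfilter() for this), the sketch below shows the underlying computation in base R so that the role of the parameter is visible. The function hp_filter() and the simulated series are hypothetical and are not a substitute for the package function.

```r
# Hypothetical base R version of the HP filter: the trend minimises
# sum((y - trend)^2) + lambda * sum(second differences of trend squared),
# which has the closed form trend = (I + lambda * D'D)^(-1) y.
hp_filter <- function(y, lambda) {
  n <- length(y)
  D <- diff(diag(n), differences = 2) # (n - 2) x n second-difference matrix
  trend <- solve(diag(n) + lambda * crossprod(D), y)
  list(trend = as.numeric(trend), cycle = as.numeric(y - trend))
}

# Simulated series standing in for a log per-capita variable.
set.seed(42)
y <- cumsum(rnorm(40, mean = 0.02, sd = 0.05))

hp_100 <- hp_filter(y, lambda = 100)
hp_625 <- hp_filter(y, lambda = 6.25)

# A smaller lambda lets the trend follow the data more closely,
# so the cyclical component is no larger in sum of squares.
print(c(ss_cycle_100 = sum(hp_100$cycle^2), ss_cycle_625 = sum(hp_625$cycle^2)))

# A perfectly linear series has no cycle at any lambda.
linear_cycle <- hp_filter(seq(1, 20, by = 1), lambda = 100)$cycle
```

In the practical itself you would apply the filter within each country, in the same grouped fashion as the detrending code.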