Consider the GDP information in global_economy. Plot the GDP per capita for each country over time. Which country has the highest GDP per capita? How has this changed over time?
| Country | Code | Year | GDP | Growth | CPI | Imports | Exports | Population |
|---|---|---|---|---|---|---|---|---|
| Afghanistan | AFG | 1960 | 537777811 | NA | NA | 7.024793 | 4.132233 | 8996351 |
| Afghanistan | AFG | 1961 | 548888896 | NA | NA | 8.097166 | 4.453443 | 9166764 |
| Afghanistan | AFG | 1962 | 546666678 | NA | NA | 9.349593 | 4.878051 | 9345868 |
| Afghanistan | AFG | 1963 | 751111191 | NA | NA | 16.863910 | 9.171601 | 9533954 |
| Afghanistan | AFG | 1964 | 800000044 | NA | NA | 18.055555 | 8.888893 | 9731361 |
| Afghanistan | AFG | 1965 | 1006666638 | NA | NA | 21.412803 | 11.258279 | 9938414 |
| Country | Code | Year | GDP | Growth | CPI | Imports | Exports | Population |
|---|---|---|---|---|---|---|---|---|
| Zimbabwe | ZWE | 2012 | 17114849900 | 16.6654288 | 107.3283 | 48.99928 | 25.16325 | 14710826 |
| Zimbabwe | ZWE | 2013 | 19091020000 | 1.9894928 | 109.0795 | 36.66874 | 21.98776 | 15054506 |
| Zimbabwe | ZWE | 2014 | 19495519600 | 2.3769293 | 108.8472 | 33.74147 | 20.93015 | 15411675 |
| Zimbabwe | ZWE | 2015 | 19963120600 | 1.7798727 | 106.2245 | 37.58864 | 19.16018 | 15777451 |
| Zimbabwe | ZWE | 2016 | 20548678100 | 0.7558693 | 104.5606 | 31.27549 | 19.94353 | 16150362 |
| Zimbabwe | ZWE | 2017 | 22040902300 | 4.7040354 | 105.5118 | 30.37027 | 19.65802 | 16529904 |
global_economy %>%
#R studio takes the GDP and pop from the global_economy dataset and divides them
autoplot(GDP/Population,show.legend = FALSE) +
labs(x = "Year", y = "GDP Per Capita")Monaco in 2014 has the highest GDP per capita, followed by Liechtenstein in 2014.
global_economy_sorted <- global_economy %>%
arrange(desc(GDP_per_Capita))
kable(head(global_economy_sorted))| Country | Code | Year | GDP | Growth | CPI | Imports | Exports | Population | GDP_per_Capita |
|---|---|---|---|---|---|---|---|---|---|
| Monaco | MCO | 2014 | 7060236168 | 7.1796368 | NA | NA | NA | 38132 | 185152.5 |
| Monaco | MCO | 2008 | 6476490406 | 0.7318007 | NA | NA | NA | 35853 | 180640.1 |
| Liechtenstein | LIE | 2014 | 6657170923 | NA | NA | NA | NA | 37127 | 179308.1 |
| Liechtenstein | LIE | 2013 | 6391735894 | NA | NA | NA | NA | 36834 | 173528.2 |
| Monaco | MCO | 2013 | 6553372278 | 9.5707988 | NA | NA | NA | 37971 | 172588.9 |
| Monaco | MCO | 2016 | 6468252212 | 3.2138488 | NA | NA | NA | 38499 | 168010.9 |
The data suggests that Monaco, Liechtenstein, and Luxembourg were primarily the top GDP per capita in 2012-2016.
highest_gdp_2012 <- global_economy_sorted %>%
filter(Year == 2012) %>%
arrange(desc(GDP_per_Capita)) %>%
slice(1:5)
highest_gdp_2013 <- global_economy_sorted %>%
filter(Year == 2013) %>%
arrange(desc(GDP_per_Capita)) %>%
slice(1:5)
highest_gdp_2014 <- global_economy_sorted %>%
filter(Year == 2014) %>%
arrange(desc(GDP_per_Capita)) %>%
slice(1:5)
highest_gdp_2015 <- global_economy_sorted %>%
filter(Year == 2015) %>%
arrange(desc(GDP_per_Capita)) %>%
slice(1:5)
highest_gdp_2016 <- global_economy_sorted %>%
filter(Year == 2016) %>%
arrange(desc(GDP_per_Capita)) %>%
slice(1:5) | Country | Code | Year | GDP | Growth | CPI | Imports | Exports | Population | GDP_per_Capita |
|---|---|---|---|---|---|---|---|---|---|
| Monaco | MCO | 2012 | 5743029680 | 0.9849603 | NA | NA | NA | 37783 | 152000.36 |
| Liechtenstein | LIE | 2012 | 5456009385 | NA | NA | NA | NA | 36545 | 149295.65 |
| Luxembourg | LUX | 2012 | 56677961787 | -0.3525194 | 106.1643 | 155.41771 | 186.44431 | 530946 | 106749.01 |
| Norway | NOR | 2012 | 510229136227 | 2.7216268 | 101.9908 | 27.54178 | 40.57434 | 5018573 | 101668.17 |
| Qatar | QAT | 2012 | 186833516484 | 4.6872592 | 103.4801 | 29.27387 | 76.47241 | 2109568 | 88564.82 |
| Country | Code | Year | GDP | Growth | CPI | Imports | Exports | Population | GDP_per_Capita |
|---|---|---|---|---|---|---|---|---|---|
| Liechtenstein | LIE | 2013 | 6391735894 | NA | NA | NA | NA | 36834 | 173528.15 |
| Monaco | MCO | 2013 | 6553372278 | 9.570799 | NA | NA | NA | 37971 | 172588.88 |
| Luxembourg | LUX | 2013 | 61739352212 | 3.654370 | 108.0053 | 158.61336 | 190.62858 | 543360 | 113625.13 |
| Norway | NOR | 2013 | 523502127660 | 1.044389 | 104.1535 | 28.35444 | 39.13876 | 5079623 | 103059.25 |
| Macao SAR, China | MAC | 2013 | 51552075902 | 11.200190 | 118.4497 | 30.40292 | 90.63795 | 575841 | 89524.84 |
| Country | Code | Year | GDP | Growth | CPI | Imports | Exports | Population | GDP_per_Capita |
|---|---|---|---|---|---|---|---|---|---|
| Monaco | MCO | 2014 | 7060236168 | 7.179637 | NA | NA | NA | 38132 | 185152.53 |
| Liechtenstein | LIE | 2014 | 6657170923 | NA | NA | NA | NA | 37127 | 179308.08 |
| Luxembourg | LUX | 2014 | 66327344189 | 5.771916 | 108.6841 | 174.06108 | 208.23038 | 556319 | 119225.38 |
| Norway | NOR | 2014 | 499338534779 | 1.975117 | 106.2800 | 29.78328 | 38.78287 | 5137232 | 97199.92 |
| Macao SAR, China | MAC | 2014 | 55347998648 | -1.201113 | 125.6103 | 31.62667 | 84.93632 | 588781 | 94004.39 |
| Country | Code | Year | GDP | Growth | CPI | Imports | Exports | Population | GDP_per_Capita |
|---|---|---|---|---|---|---|---|---|---|
| Liechtenstein | LIE | 2015 | 6268391521 | NA | NA | NA | NA | 37403 | 167590.61 |
| Monaco | MCO | 2015 | 6258178995 | 4.942330 | NA | NA | NA | 38307 | 163369.07 |
| Luxembourg | LUX | 2015 | 57784495265 | 2.861675 | 109.20011 | 187.46843 | 222.70321 | 569604 | 101446.79 |
| Switzerland | CHE | 2015 | 679289166858 | 1.226384 | 98.17176 | 50.62988 | 62.13828 | 8282396 | 82016.02 |
| Isle of Man | IMN | 2015 | 6792417112 | -0.900000 | NA | NA | NA | 83167 | 81672.02 |
| Country | Code | Year | GDP | Growth | CPI | Imports | Exports | Population | GDP_per_Capita |
|---|---|---|---|---|---|---|---|---|---|
| Monaco | MCO | 2016 | 6468252212 | 3.213849 | NA | NA | NA | 38499 | 168010.91 |
| Liechtenstein | LIE | 2016 | 6214633651 | NA | NA | NA | NA | 37666 | 164993.19 |
| Luxembourg | LUX | 2016 | 58631324559 | 3.082643 | 109.5177 | 186.1633 | 221.26778 | 582014 | 100738.68 |
| Switzerland | CHE | 2016 | 668745279605 | 1.375884 | 97.7451 | 54.5889 | 65.81131 | 8373338 | 79866.03 |
| Isle of Man | IMN | 2016 | 6592627599 | 7.400000 | NA | NA | NA | 83737 | 78730.16 |
For each of the following series, make a graph of the data. If transforming seems appropriate, do so and describe the effect.
United_States_GDP <- global_economy %>%
filter(Country == 'United States') %>%
arrange(desc(GDP_per_Capita))
plot <- autoplot(United_States_GDP) +
ggtitle("United States GDP") +
xlab("Year") +
ylab("GDP") +
theme_minimal() +
theme(
plot.title = element_text(face = "bold", size = 16, hjust = 0.5),
axis.title.x = element_text(face = "bold", size = 14),
axis.title.y = element_text(face = "bold", size = 14)
)
print(plot)bulls_bullocks_steers <- aus_livestock %>%
filter(Animal == 'Bulls, bullocks and steers', State == "Victoria")
plot <- autoplot(bulls_bullocks_steers) +
ggtitle("Bulls, Bullocks,and Steers in Victoria") +
xlab("Year") +
ylab("Count") +
theme_minimal() +
theme(
plot.title = element_text(face = "bold", size = 16, hjust = 0.5),
axis.title.x = element_text(face = "bold", size = 14),
axis.title.y = element_text(face = "bold", size = 14),
legend.title = element_text(size = 10), # Size of legend title
legend.text = element_text(size = 8), # Size of legend text
legend.key.size = unit(0.5, "cm") # Size of legend keys
)
print(plot)vic_elec_monthly <- vic_elec %>%
mutate(month = floor_date(Time, "month")) %>%
group_by(month) %>%
summarize(Demand = sum(Demand, na.rm = TRUE))
view(vic_elec_monthly)
plot <- autoplot(vic_elec_monthly) +
ggtitle("Victorian Electricity Demand") +
xlab("Year") +
ylab("Count") +
theme_minimal() +
theme(
plot.title = element_text(face = "bold", size = 16, hjust = 0.5),
axis.title.x = element_text(face = "bold", size = 14),
axis.title.y = element_text(face = "bold", size = 14),
legend.position = "none" # Remove the legend
)
print(plot)# Extract gas production data
gas_production <- aus_production %>%
select(Quarter, Gas)
plot <- autoplot(gas_production) +
ggtitle("Gas Prodiction") +
xlab("Quarter") +
ylab("Count") +
theme_minimal() +
theme(
plot.title = element_text(face = "bold", size = 16, hjust = 0.5),
axis.title.x = element_text(face = "bold", size = 14),
axis.title.y = element_text(face = "bold", size = 14),
legend.position = "none" # Remove the legend
)
print(plot)Why is a Box-Cox transformation unhelpful for the canadian_gas data?
## # A tsibble: 6 x 2 [1M]
## Month Volume
## <mth> <dbl>
## 1 1960 Jan 1.43
## 2 1960 Feb 1.31
## 3 1960 Mar 1.40
## 4 1960 Apr 1.17
## 5 1960 May 1.12
## 6 1960 Jun 1.01
plot <- autoplot(canadian_gas) +
ggtitle("Gas Prodiction") +
xlab("Month") +
ylab("Volume") +
theme_minimal() +
theme(
plot.title = element_text(face = "bold", size = 16, hjust = 0.5),
axis.title.x = element_text(face = "bold", size = 14),
axis.title.y = element_text(face = "bold", size = 14),
legend.position = "none" # Remove the legend
)
print(plot)lambda_gas <- canadian_gas %>%
features(Volume,features = guerrero) %>%
pull(lambda_guerrero)
lambda_gas## [1] 0.5767648
#base_plot <- autoplot(box_cox(canadian_gas$Volume, lambda_gas))
#plot <- base_plot +
# ggtitle("Gas Production") +
## xlab("Month") +
# ylab("Volume") +
# theme_minimal() +
# theme(
# plot.title = element_text(face = "bold", size = 16, hjust = 0.5),
# axis.title.x = element_text(face = "bold", size = 14),
# axis.title.y = element_text(face = "bold", size = 14),
# legend.position = "none" # Remove the legend
# )
#print(plot)The analysis can be unhelpful because the data from this period almost look identical, particularly between 1970 and 1980. During this time, there is a noticeable positive trend accompanied by a significant and swollen event, which can make it difficult to distinguish between different underlying patterns or trends.
What Box-Cox transformation would you select for your retail data (from Exercise 7 in Section 2.10)?
To effectively use the Box-Cox transformation, start by thoroughly understanding the nature of your data, including its trends, seasonality, and variance. Estimate the optimal lambda (\(\lambda\)) value using methods such as Guerrero’s, and then apply the Box-Cox transformation with this lambda to stabilize variance and improve normality. Finally, visualize the transformed data to assess the effectiveness of the transformation. It is essential to validate the lambda value and ensure that the transformation meets the needs of your data in practice.
For the following series, find an appropriate Box-Cox transformation in order to stabilise the variance. - - Tobacco from aus_production, - Economy class passengers between Melbourne and Sydney from ansett - Pedestrian counts at Southern Cross Station from pedestrian.
tobacco_ts <- ts(aus_production$Tobacco, frequency = 12)
lambda <- BoxCox.lambda(tobacco_ts)
transformed_tobacco <- BoxCox(tobacco_ts, lambda)plot <- autoplot(transformed_tobacco) +
ggtitle("Quarterly Tobacco Prodiction") +
xlab("Year and Quarters") +
ylab("Tobacco") +
theme_minimal() +
theme(
plot.title = element_text(face = "bold", size = 16, hjust = 0.5),
axis.title.x = element_text(face = "bold", size = 14),
axis.title.y = element_text(face = "bold", size = 14),
legend.position = "none" # Remove the legend
)
print(plot)ansett %>%
filter(Airports == "MEL-SYD" & Class == "Economy") %>%
autoplot(Passengers) + labs(title = "Passengers on Ansett Flights")lambda1 <- ansett %>%
filter(Airports == "MEL-SYD" & Class == "Economy") %>%
features(Passengers,features = guerrero) %>%
pull(lambda_guerrero)
ansett %>%
filter(Airports == "MEL-SYD" & Class == "Economy") %>%
autoplot(box_cox(Passengers,lambda1)) + labs(title = "BoxCox of Passengers")Consider the last five years of the Gas data from aus_production.
| Gas | Quarter |
|---|---|
| 221 | 2005 Q3 |
| 180 | 2005 Q4 |
| 171 | 2006 Q1 |
| 224 | 2006 Q2 |
| 233 | 2006 Q3 |
| 192 | 2006 Q4 |
| 187 | 2007 Q1 |
| 234 | 2007 Q2 |
| 245 | 2007 Q3 |
| 205 | 2007 Q4 |
| 194 | 2008 Q1 |
| 229 | 2008 Q2 |
| 249 | 2008 Q3 |
| 203 | 2008 Q4 |
| 196 | 2009 Q1 |
| 238 | 2009 Q2 |
| 252 | 2009 Q3 |
| 210 | 2009 Q4 |
| 205 | 2010 Q1 |
| 236 | 2010 Q2 |
plot <- autoplot(gas) +
ggtitle("5 years of Gas production") +
xlab("Quarters") +
ylab("Gas Production") +
theme_minimal() +
theme(
plot.title = element_text(face = "bold", size = 16, hjust = 0.5),
axis.title.x = element_text(face = "bold", size = 14),
axis.title.y = element_text(face = "bold", size = 14),
legend.position = "none" # Remove the legend
)
print(plot)Yes definitly we can spot the highs for the middle of the year and the lows for at the begging. Positive trend starting the year the negative trend toward the end of the year.
decompostion_gas <- gas %>%
model(
classical_decomposition(Gas,type = "multiplicative")) %>%
components()
decompostion_gas %>%
autoplot() + labs(title = "Classical Decomposition of Gas")Yes the results seem to follow the original data.
decompostion_gas %>%
ggplot(aes(x = Quarter)) +
geom_line(aes(y = Gas,color = "Data")) +
geom_line(aes(y = season_adjust,color = "Seasonally Adjusted")) +
geom_line(aes(y = trend,color = "Trend")) +
labs(title = "Quarterly production of Gas") +
scale_colour_manual(
values = c("#999999", "green", "red"),
breaks = c("Data", "Seasonally Adjusted", "Trend")
)| Gas | Quarter |
|---|---|
| 521 | 2005 Q3 |
| 180 | 2005 Q4 |
| 171 | 2006 Q1 |
| 224 | 2006 Q2 |
| 233 | 2006 Q3 |
| 192 | 2006 Q4 |
| 187 | 2007 Q1 |
| 234 | 2007 Q2 |
| 245 | 2007 Q3 |
| 205 | 2007 Q4 |
| 194 | 2008 Q1 |
| 229 | 2008 Q2 |
| 249 | 2008 Q3 |
| 203 | 2008 Q4 |
| 196 | 2009 Q1 |
| 238 | 2009 Q2 |
| 252 | 2009 Q3 |
| 210 | 2009 Q4 |
| 205 | 2010 Q1 |
| 236 | 2010 Q2 |
Adding 300 to a single observation in the data significantly skews its normal distribution. Most of the added values do not reach the peak of 300; they only go up to around 250. Therefore, this addition makes a substantial difference in the data.
In both cases, it would be incorrect, but it would cause a significant increase in the positive trend shown on the chart. Initially, it would create a positive trend at the beginning, but by the end, it would result in a negative trend.
Recall your retail time series data (from Exercise 7 in Section 2.10). Decompose the series using X-11. Does it reveal any outliers, or unusual features that you had not noticed previously?
## # A tsibble: 218 x 7 [1Q]
## Quarter Beer Tobacco Bricks Cement Electricity Gas
## <qtr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 1956 Q1 284 5225 189 465 3923 5
## 2 1956 Q2 213 5178 204 532 4436 6
## 3 1956 Q3 227 5297 208 561 4806 7
## 4 1956 Q4 308 5681 197 570 4418 6
## 5 1957 Q1 262 5577 187 529 4339 5
## 6 1957 Q2 228 5651 214 604 4811 7
## 7 1957 Q3 236 5317 227 603 5259 7
## 8 1957 Q4 320 6152 222 582 4735 6
## 9 1958 Q1 272 5758 199 554 4608 5
## 10 1958 Q2 233 5641 229 620 5196 7
## # ℹ 208 more rows
Figures 3.19 and 3.20 show the result of decomposing the number of persons in the civilian labour force in Australia each month from February 1978 to August 1995.
The line graph of the values shows an increase from the 1980s to the mid-1990s. There appears to be a small disturbance around 1992, where the value line dips slightly. After this point, the value clearly drops below a count of 400. The seasonal pattern remains consistent with no notable deviations. The trend line indicates a positive overall trend.
Yes, the recession is clearly visible in the graph. The small disturbance is evident during the 1991/1992 period.