1. Create a treemap of the gni2014 data. This data represents the Gross National Income (per capita) in dollars and population totals per country in 2014. In R, the readRDS() function can be used to read in the data. Justify an appropriate hierarchical structure for the data and plot. Write a paragraph about your resulting plot’s features.

A: I chose the structure of my treemap by ranking the variables by importance. First, I used iso3 as the id: every country has a unique three-letter code, which makes it easy to link each rectangle to its row in the data set. Next, I used population as the area, so the size of each rectangle shows how much of the total population each country and continent accounts for. I grouped the map by continent because grouping by a variable with too many levels made the visualization look too busy, and I wanted a plot that any viewer could read easily. Lastly, I mapped color to Gross National Income: with the countries cleanly separated by continent, the differences in income are easy to see.

# install.packages("tidyverse")
library("tidyverse")
# install.packages("portfolio")
library("portfolio")
Gross <- readRDS("gni2014.Rda")

head(Gross)
##   iso3          country     continent population    GNI
## 3  BMU          Bermuda North America      67837 106140
## 4  NOR           Norway        Europe    4676305 103630
## 5  QAT            Qatar          Asia     833285  92200
## 6  CHE      Switzerland        Europe    7604467  88120
## 7  MAC Macao SAR, China          Asia     559846  76270
## 8  LUX       Luxembourg        Europe     491775  75990
map.market(id=Gross$iso3, area=Gross$population, group=Gross$continent,
           color=Gross$GNI, main="FlowingData Map")
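
In a treemap, each group's block has an area equal to the sum of its members' areas, so it can help to look at the population totals per continent directly. The sketch below does this for the six rows printed by head(Gross) only; the data frame name top6 and the share column are made up for illustration, and the full data set would give different totals.

```r
# The six rows shown by head(Gross) above, typed in by hand (illustration only)
top6 <- data.frame(
  iso3 = c("BMU", "NOR", "QAT", "CHE", "MAC", "LUX"),
  continent = c("North America", "Europe", "Asia", "Europe", "Asia", "Europe"),
  population = c(67837, 4676305, 833285, 7604467, 559846, 491775),
  GNI = c(106140, 103630, 92200, 88120, 76270, 75990),
  stringsAsFactors = FALSE
)

# Total population per continent: the quantity that sets each group's area
pop_by_continent <- aggregate(population ~ continent, data = top6, FUN = sum)

# Share of the total area that each continent's block would occupy
pop_by_continent$share <- pop_by_continent$population / sum(top6$population)
print(pop_by_continent)
```

Running the same two aggregate lines on Gross instead of top6 would give the group areas for the full treemap.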

2a.) Create time series bar plots of the past winners of the US Open golf tournament.

2b.) Create one plot that highlights all of the American winners.

2c.) Then create another plot that highlights all the years that it was won by someone who has won multiple times.

Golf <- read.csv("C:/Users/qm6639/Downloads/Golf.csv")
View(Golf)

A.)

# Convert via character so a factor column is not turned into its level codes
Golf$Total.Score <- as.numeric(as.character(Golf$Total.Score))
barplot(Golf$Total.Score, names.arg=Golf$Year, col="black", border=NA, xlab="Year", ylab="Total Score")

B.)

# Highlight American winners in green; all other winners stay grey
fill_colors <- ifelse(Golf$Country == "United States", "green", "grey")
barplot(Golf$Total.Score, names.arg=Golf$Year, col=fill_colors, border=NA, xlab="Year", ylab="Total Score")

C.)

# Highlight years won by a repeat champion in tan; single-time winners stay black.
# Assumes Value holds the number of US Open titles won by that year's champion.
fill_colors <- ifelse(!is.na(Golf$Value) & Golf$Value > 1, "tan", "black")

barplot(Golf$Value, names.arg=Golf$Year, col=fill_colors, border=NA, xlab="Year", ylab="Times Each Champion Won")

3. Time series. Consider the Air Passengers data; in R type data("AirPassengers"). Find an appropriate decomposition for the data. Create ACF and PACF plots and assess whether it is white noise. Transform and difference the data as necessary to try to get a result that is close to white noise (this might not be perfect). Give a visualization that justifies this result, and comment.

A: In my opinion, more than one plot comes close to white noise. The two closest were pacf(AF) and pacf(trans). Some plots, such as acf(AP), were well past the significance bounds, while others were heading toward white noise but were not the best fit. Multiplicative seasonality means that the size of the seasonal swings grows with the level of the series, which is why I chose a multiplicative decomposition for the Air Passengers data: the more passengers there are, the larger the seasonal variation.

data("AirPassengers")
AP <- AirPassengers
acf(AP)

pacf(AP)

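diff <- diff(AP)  # first difference removes the trend
# transform() leaves the values unchanged; a log transform is assumed to be what was intended
trans <- log(AP)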

acf(diff)

acf(trans)

pacf(diff)

pacf(trans)
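
One common way to push this series toward white noise is to take logs, then a first difference, then a seasonal (lag 12) difference. The sketch below is one possible version of that idea, not necessarily the transformation behind the plots above. The names log_ap, d1, d12 and lb are made up here, and Box.test is added as a numerical check to read alongside the plots.

```r
data("AirPassengers")

# Log transform: turns the growing (multiplicative) seasonal swings into roughly constant ones
log_ap <- log(AirPassengers)

# First difference removes the trend; lag-12 difference removes the yearly pattern
d1 <- diff(log_ap)
d12 <- diff(d1, lag = 12)

# For white noise, almost all spikes should stay inside the dashed bounds
acf(d12)
pacf(d12)

# Ljung-Box test: a small p-value suggests autocorrelation is still present
lb <- Box.test(d12, lag = 24, type = "Ljung-Box")
print(lb)
```

If spikes remain at the early lags or at lag 12, the series is closer to white noise than the raw data but not all the way there, which matches the question's note that the result might not be perfect.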

decompose_air <- decompose(AP, type = "multiplicative")
plot(as.ts(decompose_air$seasonal))

plot(as.ts(decompose_air$trend))

plot(as.ts(decompose_air$random))

plot(decompose_air)
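
As a check on the multiplicative decomposition, the three components should multiply back to the original series wherever the trend is defined. This is a minimal sketch, assuming the default moving-average filter of decompose, which leaves NA at both ends of the trend; the names dec, rebuilt and ok are made up for illustration.

```r
data("AirPassengers")
dec <- decompose(AirPassengers, type = "multiplicative")

# Multiplicative model: observed = trend * seasonal * random
rebuilt <- dec$trend * dec$seasonal * dec$random

# The moving-average trend is NA at the start and end, so compare only defined values
ok <- !is.na(rebuilt)
same <- all.equal(as.numeric(rebuilt[ok]), as.numeric(AirPassengers[ok]))
print(same)
```

With an additive decomposition the same check would use a sum of the components instead of a product.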