R Bridge Course Final Project

This is a final project to show off what you have learned. Select your data set from the list below: http://vincentarelbundock.github.io/Rdatasets/ (click on the csv index for a list). Another good source is found here: https://archive.ics.uci.edu/ml/datasets.html The presentation approach is up to you but it should contain the following:

  1. Data Exploration: This should include summary statistics, means, medians, quartiles, or any other relevant information about the data set. Please include some conclusions in the R Markdown text.
##Loading the UN Stats Data set
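##stringsAsFactors = TRUE keeps text columns as factors (the default before R 4.0), which the str() and summary() output below relies on
UN.Stats <- read.csv("http://vincentarelbundock.github.io/Rdatasets/csv/carData/UN98.csv", stringsAsFactors = TRUE)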
head(UN.Stats)
##Checking the type of data contained in the UN.Stats data frame.
str(UN.Stats)
'data.frame':   207 obs. of  14 variables:
 $ X                     : Factor w/ 207 levels "Afghanistan",..: 1 2 3 4 5 6 7 8 9 10 ...
 $ region                : Factor w/ 5 levels "Africa","America",..: 3 4 1 3 4 1 2 2 4 5 ...
 $ tfr                   : num  6.9 2.6 3.81 NA NA 6.69 NA 2.62 1.7 1.89 ...
 $ contraception         : int  NA NA 52 NA NA NA 53 NA 22 76 ...
 $ educationMale         : num  NA NA 11.1 NA NA NA NA NA NA 16.3 ...
 $ educationFemale       : num  NA NA 9.9 NA NA NA NA NA NA 16.1 ...
 $ lifeMale              : num  45 68 67.5 68 NA 44.9 NA 69.6 67.2 75.4 ...
 $ lifeFemale            : num  46 74 70.3 73 NA 48.1 NA 76.8 74 81.2 ...
 $ infantMortality       : int  154 32 44 11 NA 124 24 22 25 6 ...
 $ GDPperCapita          : int  2848 863 1531 NA NA 355 6966 8055 354 20046 ...
 $ economicActivityMale  : num  87.5 NA 76.4 58.8 NA NA 74.4 76.2 65 74 ...
 $ economicActivityFemale: num  7.2 NA 7.8 42.4 NA NA 56.2 41.3 52 53.8 ...
 $ illiteracyMale        : num  52.8 NA 26.1 0.264 NA NA NA 3.8 0.3 NA ...
 $ illiteracyFemale      : num  85 NA 51 0.36 NA NA NA 3.8 0.5 NA ...
summary(UN.Stats$region)
 Africa America    Asia  Europe Oceania 
     55      41      50      44      17 
##Calculating the average GDPperCapita of every region
library(plyr)
mu <- ddply(UN.Stats, "region", summarise, grp.mean=mean(GDPperCapita,na.rm = TRUE))
head(mu)
print("The data set contains some missing elements, denoted by the label NA. Some functions do not behave well when asked to operate on these elements, but they have options to handle missing values (for example, na.rm = TRUE in mean()). Basic statistics such as the regional mean of GDP per capita show how much development differs between regions. The world has developed unevenly.")
[1] "The data set contains some missing elements, denoted by the label NA. Some functions do not behave well when asked to operate on these elements, but they have options to handle missing values (for example, na.rm = TRUE in mean()). Basic statistics such as the regional mean of GDP per capita show how much development differs between regions. The world has developed unevenly."
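The point about missing values can be made concrete. The sketch below is not part of the original analysis: it copies the first ten values of three columns from the str() output above into a small data frame (the name `un_sample` is made up for illustration) and shows how a single NA propagates through mean() unless na.rm = TRUE is set.

```r
# First ten values of three columns, copied from the str() output above.
# `un_sample` is an illustrative name, not an object from the analysis.
un_sample <- data.frame(
  tfr             = c(6.9, 2.6, 3.81, NA, NA, 6.69, NA, 2.62, 1.7, 1.89),
  infantMortality = c(154, 32, 44, 11, NA, 124, 24, 22, 25, 6),
  GDPperCapita    = c(2848, 863, 1531, NA, NA, 355, 6966, 8055, 354, 20046)
)

# Count the missing values in each column.
na_counts <- colSums(is.na(un_sample))
print(na_counts)

# Without na.rm, a single NA makes the whole result NA.
mean_with_na <- mean(un_sample$GDPperCapita)

# With na.rm = TRUE, the NA entries are dropped before averaging.
mean_without_na <- mean(un_sample$GDPperCapita, na.rm = TRUE)

print(mean_with_na)
print(mean_without_na)
```

Running colSums(is.na(UN.Stats)) on the full data frame would give the same kind of per-column count for all 14 variables.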
  2. Data wrangling: Please perform some basic transformations. They will need to make sense but could include column renaming, creating a subset of the data, replacing values, or creating new columns with derived data (for example – if it makes sense you could sum two columns together)
##Subsetting data by geographical region
UN.Stats.Africa <- subset(UN.Stats, region == "Africa")
UN.Stats.America <- subset(UN.Stats, region == "America")
UN.Stats.Asia <- subset(UN.Stats, region == "Asia")
##The factor level is "Europe"; filtering on "Europa" returns an empty data frame
UN.Stats.Europa <- subset(UN.Stats, region == "Europe")
UN.Stats.Oceania <- subset(UN.Stats, region == "Oceania")
##Calculating mean of regional GDPperCapita data and comparing Africa Vs. America
Africa.GDP <- mean(UN.Stats.Africa$GDPperCapita, na.rm = TRUE)
summary(UN.Stats.Africa$GDPperCapita)
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.    NA's 
   36.0   209.0   389.5  1196.0  1004.5 11854.0       1 
America.GDP <- mean(UN.Stats.America$GDPperCapita, na.rm = TRUE)
summary(UN.Stats.America$GDPperCapita)
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.    NA's 
    386    1749    2766    5398    7018   26037       1 
cat("The mean GDP per capita in Africa is $", Africa.GDP, "compared to $", America.GDP, "in America")
The mean GDP per capita in Africa is $ 1196 compared to $ 5398 in America
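The assignment also mentions creating new columns with derived data. As a minimal sketch, and not part of the original analysis, the block below copies the first ten values of lifeMale and lifeFemale from the str() output in the data exploration step and derives the difference between them. The names `life_sample`, `lifeGap` and `mean_gap` are hypothetical.

```r
# First ten values of lifeMale and lifeFemale, copied from the str() output.
# `life_sample` and the `lifeGap` column are illustrative names.
life_sample <- data.frame(
  lifeMale   = c(45, 68, 67.5, 68, NA, 44.9, NA, 69.6, 67.2, 75.4),
  lifeFemale = c(46, 74, 70.3, 73, NA, 48.1, NA, 76.8, 74, 81.2)
)

# Derived column: female minus male life expectancy, in years.
life_sample$lifeGap <- life_sample$lifeFemale - life_sample$lifeMale

# Rows with a missing input stay NA in the derived column.
print(life_sample)

# Average gap over the rows where both values are present.
mean_gap <- mean(life_sample$lifeGap, na.rm = TRUE)
print(mean_gap)
```

The same assignment works on the full data frame, for example UN.Stats$lifeGap <- UN.Stats$lifeFemale - UN.Stats$lifeMale.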
  3. Graphics: Please make sure to display at least one scatter plot, box plot and histogram. Don’t be limited to this. Please explore the many other options in R packages such as ggplot2.
library(ggplot2)
##Point plot of GDP per capita by region
plot1 <- ggplot(UN.Stats, aes(x = region, y = GDPperCapita, color = region)) + geom_point()
##Box plot: region on the x axis, GDP per capita on the y axis
plot2 <- ggplot(UN.Stats, aes(x = region, y = GDPperCapita, color = region)) + geom_boxplot()
##Histogram of GDP per capita, one panel per region
plot3 <- ggplot(UN.Stats, aes(x = GDPperCapita, color = region)) +
  geom_histogram(fill = "white", bins = 5) +
  facet_grid(. ~ region, scales = "free") +
  theme(axis.text.x = element_text(angle = 90, hjust = 1))
plot1

plot2

plot3

  4. Meaningful question for analysis: Please state at the beginning a meaningful question for analysis. Use the first three steps and anything else that would be helpful to answer the question you are posing from the data set you chose. Please write a brief conclusion paragraph in R markdown at the end.
cat("Is there a correlation between low GDP per Capita and high Infant Mortality in the UN Data set?")
Is there a correlation between low GDP per Capita and high Infant Mortality in the UN Data set?
##Subsetting UN Stats Data
UN.Stats.Reduced <- UN.Stats[,c("X", "region","infantMortality","GDPperCapita")]
UN.Stats.Reduced
library(plyr)
mu <- ddply(UN.Stats.Reduced, "region", summarise, grp.mean=mean(GDPperCapita,na.rm = TRUE))
head(mu)
##2D density of countries over GDP per capita and infant mortality (uses ggplot2 only, so ggExtra is not loaded)
plot1 <- ggplot(UN.Stats, aes(x = GDPperCapita, y = infantMortality)) +
  stat_density_2d(aes(fill = after_stat(density)), geom = "raster", contour = FALSE) +
  scale_fill_distiller(palette = "Spectral", direction = 1) +
  scale_x_continuous(expand = c(0, 0)) +
  scale_y_continuous(expand = c(0, 0)) +
  theme(legend.position = "none")
plot1

print("The 2D density plot shows that infant mortality tends to be higher where GDP per capita is low, as shown in the bottom left of the plot.")
[1] "The 2D density plot shows that infant mortality tends to be higher where GDP per capita is low, as shown in the bottom left of the plot."
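Since the question asks about a correlation, a correlation coefficient is a natural companion to the plot. The sketch below is not part of the original analysis. It uses only the first ten values of GDPperCapita and infantMortality from the str() output near the top of the notebook, so its result describes that small sample and not the full data set. The choice of Spearman's rank correlation is an assumption, made because the GDP summaries above are strongly skewed (the means sit well above the medians). The names `gdp_sample`, `im_sample`, `n_complete` and `rho` are hypothetical.

```r
# First ten values, copied from the str() output near the top of the notebook.
gdp_sample <- c(2848, 863, 1531, NA, NA, 355, 6966, 8055, 354, 20046)
im_sample  <- c(154, 32, 44, 11, NA, 124, 24, 22, 25, 6)

# Number of rows where both values are present.
n_complete <- sum(complete.cases(gdp_sample, im_sample))

# Spearman rank correlation, computed on the complete pairs only.
rho <- cor(gdp_sample, im_sample, use = "complete.obs", method = "spearman")

print(n_complete)
print(rho)
```

To apply this to the full data, pass UN.Stats$GDPperCapita and UN.Stats$infantMortality to cor() with the same `use` and `method` arguments. A negative coefficient would support the reading that infant mortality falls as GDP per capita rises.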
  5. BONUS – place the original .csv in a github file and have R read from the link. This will be a very useful skill as you progress in your data science education and career.
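##The github.com "blob" address returns an HTML page, which read.csv cannot parse ("EOF within quoted string"), so the raw file address is used instead
UN.Stats <- read.csv("https://raw.githubusercontent.com/JMawyin/MSDS2019/master/UN98.csv")
head(UN.Stats)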