---
title: "Winter Olympics Medals over Time"
author: "JunyuMeng-jm4655"
date: "February 16, 2018"
output:
  html_document:
    keep_md: true
---

1. Medal Counts over Time

I use a point plot to show the number of medals each country has won, with countries ordered by their total medal count. Shades of blue indicate the year, so the reader can see both the differences between countries and how each country's count changes over time. As the plot shows, most countries have improved over time, although a few outliers won fewer medals in later Games than before. Russia, Germany and the USA have the largest totals. In addition, I use a grid of panels to split the medals by type, which makes it possible to compare countries on gold, silver and bronze separately. For example, although Russia has the most medals overall, it holds the record for the most gold medals in a single year, but not for silver or bronze. Finland has the most bronze medals in a single year.
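
The aggregation and plot described above can be sketched roughly as follows. This is a minimal illustration on invented toy data, not the code behind the report: the data frame `medals`, its column layout, and the specific colours are assumptions, while the `sum_medal` name follows the printed output.

```r
library(dplyr)
library(ggplot2)

# Hypothetical toy data: one row per medal awarded (values invented).
medals <- data.frame(
  Year    = c(1924, 1924, 1924, 2014, 2014, 2014, 2014),
  Country = c("NOR", "NOR", "USA", "NOR", "USA", "USA", "USA"),
  Medal   = c("Gold", "Silver", "Bronze", "Gold", "Gold", "Gold", "Bronze"),
  stringsAsFactors = FALSE
)

# Count medals per year, country and medal type.
by_year <- medals %>%
  group_by(Year, Country, Medal) %>%
  summarise(sum_medal = n()) %>%
  ungroup()

# All-time total per country, used only to order the country axis.
totals <- medals %>%
  group_by(Country) %>%
  summarise(total = n()) %>%
  ungroup()

by_year$Country <- factor(by_year$Country,
                          levels = totals$Country[order(totals$total)])

# One point per country and year, shaded by year, one panel per medal type.
p <- ggplot(by_year, aes(x = Country, y = sum_medal, colour = Year)) +
  geom_point() +
  scale_colour_gradient(low = "lightblue", high = "darkblue") +
  facet_grid(. ~ Medal) +
  coord_flip()
```

Ordering the factor levels by total is what sorts the countries on the axis; the colour gradient is one possible way to encode the year in shades of blue.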

## # A tibble: 355 x 3
## # Groups:   Year [?]
##     Year Country sum_medal
##    <int> <fct>       <int>
##  1  1924 AUT             4
##  2  1924 BEL             5
##  3  1924 CAN             9
##  4  1924 FIN            15
##  5  1924 FRA            12
##  6  1924 GBR            25
##  7  1924 NOR            17
##  8  1924 SUI             9
##  9  1924 SWE             9
## 10  1924 USA            13
## # ... with 345 more rows
## # A tibble: 99 x 3
## # Groups:   Medal [?]
##    Medal  Country sum_type
##    <fct>  <fct>      <int>
##  1 Bronze AUS            7
##  2 Bronze AUT          103
##  3 Bronze BEL            7
##  4 Bronze BLR            5
##  5 Bronze BUL            3
##  6 Bronze CAN          107
##  7 Bronze CHN           36
##  8 Bronze CRO            1
##  9 Bronze CZE           35
## 10 Bronze ESP            1
## # ... with 89 more rows

2. Medal Counts adjusted by Population, GDP

I use the number of gold medals as the measure of a country's sporting success. After calculating each country's total medals, GDP per capita and population, I use melt to combine them into a single long-format variable. Based on this new dataset, I draw a geom_line plot and use a different color for each type of variable, which makes it easy for the reader to compare countries. For example, China has a very large population but few medals compared to other countries. Slovakia, however, scores highly at the Winter Olympics, which is not consistent with its GDP per capita and small population.
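
The reshaping step can be sketched as below. The three rows are copied from the printed table; the wide layout, the use of `reshape2::melt`, and the reading of `Population` as millions of inhabitants are assumptions made for illustration.

```r
library(reshape2)
library(ggplot2)

# Three countries copied from the printed output; the wide layout is assumed.
wide <- data.frame(
  Country        = c("AUT", "CHN", "NOR"),
  Medal          = c(79, 16, 159),
  GDP.per.Capita = c(218.874926, 40.138419, 372.001849),
  Population     = c(8.611088, 1371.22, 5.195921)
)

# melt() stacks the three measures into one variable/value pair per row.
long <- melt(wide, id.vars = "Country")

# One line per measure, coloured by the type of variable.
p <- ggplot(long, aes(x = Country, y = value,
                      colour = variable, group = variable)) +
  geom_line() +
  geom_point()

# Medals relative to population (assuming Population is in millions).
medals_per_million <- wide$Medal / wide$Population
names(medals_per_million) <- wide$Country
```

A ratio such as `medals_per_million` is one simple way to make the adjustment explicit, since the three measures sit on very different scales when drawn on a shared axis.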

##    Country       variable       value
## 1      AUS          Medal    5.000000
## 2      AUT          Medal   79.000000
## 3      BEL          Medal    2.000000
## 4      BLR          Medal    6.000000
## 5      BUL          Medal    1.000000
## 6      CAN          Medal  315.000000
## 7      CHN          Medal   16.000000
## 8      CRO          Medal    4.000000
## 9      CZE          Medal   28.000000
## 10     ESP          Medal    1.000000
## 11     EST          Medal    4.000000
## 12     FIN          Medal   66.000000
## 13     FRA          Medal   36.000000
## 14     GBR          Medal   34.000000
## 15     GER          Medal  218.000000
## 16     ITA          Medal   58.000000
## 17     JPN          Medal   17.000000
## 18     KAZ          Medal    1.000000
## 19     KOR          Medal   51.000000
## 20     LIE          Medal    2.000000
## 21     NED          Medal   42.000000
## 22     NOR          Medal  159.000000
## 23     POL          Medal    6.000000
## 24     RUS          Medal  344.000000
## 25     SLO          Medal    2.000000
## 26     SUI          Medal   76.000000
## 27     SVK          Medal    2.000000
## 28     SWE          Medal  127.000000
## 29     UKR          Medal    5.000000
## 30     USA          Medal  167.000000
## 31     UZB          Medal    1.000000
## 32     AUS GDP.per.Capita  281.554815
## 33     AUT GDP.per.Capita  218.874926
## 34     BEL GDP.per.Capita  201.620139
## 35     BLR GDP.per.Capita   28.702282
## 36     BUL GDP.per.Capita   34.967387
## 37     CAN GDP.per.Capita  216.242650
## 38     CHN GDP.per.Capita   40.138419
## 39     CRO GDP.per.Capita   57.679147
## 40     CZE GDP.per.Capita   87.741691
## 41     ESP GDP.per.Capita  129.157912
## 42     EST GDP.per.Capita   85.592521
## 43     FIN GDP.per.Capita  211.555181
## 44     FRA GDP.per.Capita  181.027841
## 45     GBR GDP.per.Capita  219.379848
## 46     GER GDP.per.Capita  206.566570
## 47     ITA GDP.per.Capita  149.789022
## 48     JPN GDP.per.Capita  162.386076
## 49     KAZ GDP.per.Capita   52.549905
## 50     KOR GDP.per.Capita  136.107620
## 51     LIE GDP.per.Capita          NA
## 52     NED GDP.per.Capita  221.498840
## 53     NOR GDP.per.Capita  372.001849
## 54     POL GDP.per.Capita   62.772738
## 55     RUS GDP.per.Capita   45.462903
## 56     SLO GDP.per.Capita  103.632699
## 57     SUI GDP.per.Capita  404.725396
## 58     SVK GDP.per.Capita   80.441388
## 59     SWE GDP.per.Capita  252.898368
## 60     UKR GDP.per.Capita   10.574774
## 61     USA GDP.per.Capita  280.578592
## 62     UZB GDP.per.Capita   10.660352
## 63     AUS     Population   23.781169
## 64     AUT     Population    8.611088
## 65     BEL     Population   11.285721
## 66     BLR     Population    9.513000
## 67     BUL     Population    7.177991
## 68     CAN     Population   35.851774
## 69     CHN     Population 1371.220000
## 70     CRO     Population    4.224404
## 71     CZE     Population   10.551219
## 72     ESP     Population   46.418269
## 73     EST     Population    1.311998
## 74     FIN     Population    5.482013
## 75     FRA     Population   66.808385
## 76     GBR     Population   65.138232
## 77     GER     Population   81.413145
## 78     ITA     Population   60.802085
## 79     JPN     Population  126.958472
## 80     KAZ     Population   17.544126
## 81     KOR     Population   50.617045
## 82     LIE     Population    0.037531
## 83     NED     Population   16.936520
## 84     NOR     Population    5.195921
## 85     POL     Population   37.999494
## 86     RUS     Population  144.096812
## 87     SLO     Population    2.063768
## 88     SUI     Population    8.286976
## 89     SVK     Population    5.424050
## 90     SWE     Population    9.798871
## 91     UKR     Population   45.198200
## 92     USA     Population  321.418820
## 93     UZB     Population   31.299500

3. Host Country Advantage

I make this plot after cleaning the data thoroughly. First, I merge the original data with the new data by “Year” to bring in the host country information. Then I generate a new dummy variable (“a” or “b”) by checking whether the host country matches the country the athletes come from. After this, I merge by Year and Country again and use the resulting dataset to make the plot. To emphasize the difference between host and non-host countries, I use different colors and shapes. I think the reader can see clearly that host countries usually win more medals than average, because the purple dots are concentrated at the top of the plot.
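
The merge and dummy-variable steps can be sketched like this. The medal counts are invented, the data frame and column names (`medal_totals`, `hosts`, `Host`, `is_host`) are hypothetical, and the mapping of "a" to host and "b" to non-host is an assumption, since the text does not say which label is which.

```r
library(dplyr)
library(ggplot2)

# Hypothetical per-year medal totals (counts invented for illustration).
medal_totals <- data.frame(
  Year      = c(1994, 1994, 2002, 2002),
  Country   = c("NOR", "USA", "NOR", "USA"),
  sum_medal = c(10, 4, 6, 12),
  stringsAsFactors = FALSE
)

# Host country per Games: Lillehammer 1994 (NOR), Salt Lake City 2002 (USA).
hosts <- data.frame(
  Year = c(1994, 2002),
  Host = c("NOR", "USA"),
  stringsAsFactors = FALSE
)

# Attach the host to every row of that year, then flag the host's own rows.
# Assumption: "a" marks the host country and "b" every other country.
flagged <- merge(medal_totals, hosts, by = "Year") %>%
  mutate(is_host = ifelse(Country == Host, "a", "b"))

# Colour and shape both encode the host flag.
p <- ggplot(flagged, aes(x = Year, y = sum_medal,
                         colour = is_host, shape = is_host)) +
  geom_point(size = 3)
```

Mapping the flag to both colour and shape keeps the hosts distinguishable even when the plot is printed in greyscale.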


4. Country success by sport / discipline / event

I choose skiing as the key sport to analyze. I calculate the total number of skiing medals each country won in each year, which shows how the counts are distributed across countries. We can also tell which countries' results vary widely between Games and which perform steadily in skiing at the Winter Olympics. For example, readers can see that Norway's results change much more from one Games to the next than Russia's, but Norway still performs at a higher level in skiing than Russia.
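
A rough sketch of the per-country, per-year skiing counts follows, again on invented toy data. The `Sport` column, the `sum_ski` name, and the choice of a boxplot per country are assumptions for illustration.

```r
library(dplyr)
library(ggplot2)

# Hypothetical toy data: one row per medal, with the sport it was won in.
medals <- data.frame(
  Year    = c(2006, 2006, 2006, 2010, 2010, 2010, 2010, 2010),
  Country = c("NOR", "NOR", "RUS", "NOR", "NOR", "NOR", "RUS", "RUS"),
  Sport   = c("Skiing", "Skiing", "Skiing", "Skiing",
              "Skiing", "Skiing", "Skiing", "Biathlon"),
  stringsAsFactors = FALSE
)

# Keep skiing only, then count medals per country and year.
skiing <- medals %>%
  dplyr::filter(Sport == "Skiing") %>%
  group_by(Country, Year) %>%
  summarise(sum_ski = n()) %>%
  ungroup()

# One box per country: the spread of its yearly skiing medal counts.
p <- ggplot(skiing, aes(x = Country, y = sum_ski)) +
  geom_boxplot()
```

The height of each box reflects how much a country's yearly count varies, and its position reflects the overall level, which is the comparison the paragraph draws between Norway and Russia.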

5. Most successful athletes

I choose the top 30 athletes who won the most gold medals. I use shape to distinguish sports and color to indicate countries, and I facet the plot by gender. The reader can easily tell that female athletes have won fewer gold medals than male athletes, and that the largest count for a male athlete is greater than the largest for a female athlete. Male athletes appear stronger in Biathlon than female athletes, while female athletes appear stronger in Ice Hockey. Canada has excellent female athletes, while the USA has excellent male athletes.

6. Make two plots interactive

I make the first interactive plot from the first point plot, because the points, which show changes across time and countries, are very dense and the reader can only get a general impression. To get specific information, the reader needs to zoom in, and plotly meets this need. In addition, I choose the boxplot of skiing medals by country and year for the second interactive plot, because I would like the reader to be able to read off specific statistics such as the median, the mean, and so on. For example, a reader who wants to know Russia's largest number of gold medals at a specific Winter Games can use plotly to get the number easily.
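
The top-athlete selection and the conversion to an interactive plot can be sketched together as below. The athlete names, counts, and column names are all invented, and `top_n_athletes` is a hypothetical parameter set to 2 here where the text uses 30.

```r
library(dplyr)
library(ggplot2)
library(plotly)

# Hypothetical toy data: athletes, countries and medals are invented.
golds <- data.frame(
  Athlete = c("A", "A", "A", "B", "B", "C", "D", "D"),
  Gender  = c("Men", "Men", "Men", "Women", "Women", "Men", "Women", "Women"),
  Country = c("NOR", "NOR", "NOR", "CAN", "CAN", "USA", "CAN", "CAN"),
  Medal   = c("Gold", "Gold", "Gold", "Gold", "Gold", "Gold", "Gold", "Silver"),
  stringsAsFactors = FALSE
)

top_n_athletes <- 2  # the report uses the top 30

# Count gold medals per athlete and keep the athletes with the most.
top <- golds %>%
  dplyr::filter(Medal == "Gold") %>%
  group_by(Athlete, Gender, Country) %>%
  summarise(gold = n()) %>%
  ungroup() %>%
  arrange(desc(gold)) %>%
  head(top_n_athletes)

# Static plot: colour by country, one panel per gender.
p <- ggplot(top, aes(x = Athlete, y = gold, colour = Country)) +
  geom_point(size = 3) +
  facet_wrap(~ Gender)

# ggplotly() wraps the ggplot in a widget with hover tooltips and zoom.
interactive <- ggplotly(p)
```

Building the static ggplot first and converting it afterwards means the same plot object can serve both the static and the interactive version of the report.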


7. Data Table

I would like to build a data table that includes basic information about each country's economy and size, as well as the total number of medals it has won in the history of the Winter Olympic Games. I intend to show some of the relationship between economic development and sporting success. The data table provides a search function and column filters, through which the reader can look up a country's total medal count, GDP per capita and population.
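
A searchable table with per-column filters could be built along these lines. This sketch assumes the DT package is used for the widget, which the text does not confirm; the three rows are copied from the printed output in section 2, and the name `country_info` is hypothetical.

```r
library(DT)

# Three countries copied from the printed output; the layout is assumed.
country_info <- data.frame(
  Country        = c("AUT", "CHN", "NOR"),
  Medal          = c(79, 16, 159),
  GDP.per.Capita = c(218.874926, 40.138419, 372.001849),
  Population     = c(8.611088, 1371.22, 5.195921)
)

# filter = "top" adds a filter box above each column;
# the global search box is shown by default.
tbl <- datatable(country_info,
                 filter   = "top",
                 rownames = FALSE,
                 options  = list(pageLength = 10))
```

In a knitted HTML document, printing `tbl` renders the interactive widget in place.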
