Original Source: https://www.kaggle.com/kaggle/the-history-of-baseball/downloads/baseball_2016-03-08-22-23-12.zip
My Source for the player.csv: https://github.com/kylegilde/cuny-r-programming/blob/master/player.csv
## player_id birth_year birth_month birth_day birth_country birth_state
## 3 aaronto01 1939 8 5 USA AL
## 7 abadijo01 1854 11 4 USA PA
## 8 abbated01 1877 4 15 USA PA
## 9 abbeybe01 1869 11 11 USA VT
## 10 abbeych01 1866 10 14 USA NE
## 11 abbotda01 1862 3 16 USA OH
## birth_city death_year death_month death_day death_country death_state
## 3 Mobile 1984 8 16 USA GA
## 7 Philadelphia 1905 5 17 USA NJ
## 8 Latrobe 1957 1 6 USA FL
## 9 Essex 1962 6 11 USA VT
## 10 Falls City 1926 4 27 USA CA
## 11 Portage 1930 2 13 USA MI
## death_city name_first name_last name_given weight height
## 3 Atlanta Tommie Aaron Tommie Lee 190 75
## 7 Pemberton John Abadie John W. 192 72
## 8 Fort Lauderdale Ed Abbaticchio Edward James 170 71
## 9 Colchester Bert Abbey Bert Wood 175 71
## 10 San Francisco Charlie Abbey Charles S. 169 68
## 11 Ottawa Lake Dan Abbott Leander Franklin 190 71
## bats throws debut final_game retro_id bbref_id
## 3 R R 1962-04-10 1971-09-26 aarot101 aaronto01
## 7 R R 1875-04-26 1875-06-10 abadj101 abadijo01
## 8 R R 1897-09-04 1910-09-15 abbae101 abbated01
## 9 R R 1892-06-14 1896-09-23 abbeb101 abbeybe01
## 10 L L 1893-08-16 1897-08-19 abbec101 abbeych01
## 11 R R 1890-04-19 1890-05-23 abbod101 abbotda01
## 'data.frame': 8392 obs. of 24 variables:
## $ player_id : Factor w/ 18846 levels "aardsda01","aaronha01",..: 3 7 8 9 10 11 12 18 20 23 ...
## $ birth_year : num 1939 1854 1877 1869 1866 ...
## $ birth_month : num 8 11 4 11 10 3 10 9 7 1 ...
## $ birth_day : num 5 4 15 11 14 16 22 5 31 30 ...
## $ birth_country: Factor w/ 53 levels "","Afghanistan",..: 50 50 50 50 50 50 50 50 50 50 ...
## $ birth_state : Factor w/ 246 levels "","AB","Aberdeen",..: 6 173 173 229 148 165 165 173 165 30 ...
## $ birth_city : Factor w/ 4714 levels "","Aberdeen",..: 2718 3279 2291 1337 1382 3380 4347 2896 838 4223 ...
## $ death_year : num 1984 1905 1957 1962 1926 ...
## $ death_month : num 8 5 1 6 4 2 6 4 5 2 ...
## $ death_day : num 16 17 6 11 27 13 11 13 20 19 ...
## $ death_country: Factor w/ 24 levels "","American Samoa",..: 22 22 22 22 22 22 22 22 22 22 ...
## $ death_state : Factor w/ 93 levels "","AB","AK","AL",..: 26 57 25 88 12 44 12 21 63 12 ...
## $ death_city : Factor w/ 2554 levels "","Aberdeen",..: 91 1735 758 462 1984 1672 1259 2380 813 2548 ...
## $ name_first : Factor w/ 2313 levels "","A. J.","Aaron",..: 2087 1148 649 158 335 482 806 1575 26 168 ...
## $ name_last : Factor w/ 9713 levels "Aardsma","Aaron",..: 2 5 6 7 7 8 8 8 9 11 ...
## $ name_given : Factor w/ 12437 levels "","A. Harry",..: 11411 6605 3201 954 1714 7564 4885 8907 177 12058 ...
## $ weight : num 190 192 170 175 169 190 180 180 195 190 ...
## $ height : num 75 72 71 71 68 71 70 74 74 70 ...
## $ bats : Factor w/ 4 levels "","B","L","R": 4 4 4 4 3 4 4 4 3 4 ...
## $ throws : Factor w/ 3 levels "","L","R": 3 3 3 3 2 3 3 3 2 3 ...
## $ debut : Factor w/ 10037 levels "","1871-05-04",..: 5145 106 1132 875 934 722 1433 1947 4531 4647 ...
## $ final_game : Factor w/ 9029 levels "","1871-05-05",..: 5560 103 1851 1078 1110 684 1523 1868 4762 4497 ...
## $ retro_id : Factor w/ 18793 levels "","aardd001",..: 4 8 9 10 11 12 13 19 21 23 ...
## $ bbref_id : Factor w/ 18846 levels "","aardsda01",..: 4 8 9 10 11 12 13 19 21 24 ...
## - attr(*, "na.action")=Class 'omit' Named int [1:10454] 1 2 4 5 6 13 14 15 16 17 ...
## .. ..- attr(*, "names")= chr [1:10454] "1" "2" "4" "5" ...
Height and weight appear to be mildly positively correlated across players. The relationship holds whether a player bats right-handed (R), left-handed (L), or both (B).
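One way to check the height and weight observation numerically is to compute a Pearson correlation within each batting group. The sketch below is only an illustration: `cor_by_bats` is a hypothetical helper, not code from the original analysis, and it assumes a data frame with numeric `height` and `weight` columns and a `bats` column coded as in the `str()` output above. The toy rows are invented solely to demonstrate the call and are not drawn from player.csv.

```r
# Hypothetical helper (not from the original analysis):
# Pearson correlation of height and weight within each batting group.
# Assumes columns `height`, `weight` (numeric) and `bats` ("B", "L", "R", or "").
cor_by_bats <- function(df) {
  # Drop players whose batting side is missing or blank
  keep <- !is.na(df$bats) & df$bats != ""
  df <- df[keep, , drop = FALSE]

  # Split into one data frame per batting side, discarding unused levels
  groups <- split(df, droplevels(factor(df$bats)))

  # One correlation coefficient per group, returned as a named numeric vector
  sapply(groups, function(g) cor(g$height, g$weight))
}

# Toy rows invented purely to show the call; they are NOT taken from player.csv.
toy <- data.frame(
  height = c(70, 72, 74, 68, 71, 75, 69, 73, 76),
  weight = c(170, 185, 200, 160, 180, 210, 165, 190, 215),
  bats   = c("R", "R", "R", "L", "L", "L", "B", "B", "B")
)
toy_result <- cor_by_bats(toy)
print(round(toy_result, 2))
```

On the real data, the same function could be called on the cleaned player data frame. Comparing the three coefficients would show whether the relationship is of similar strength in each batting group, which a scatter plot alone only suggests.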