This project takes three different datasets and transforms them into a more workable structure, which is then written to CSV files. I’ve also completed a lightweight analysis of each dataset to answer an initial driving question or hypothesis. Included datasets are:
## ── Attaching packages ─────────────────────────────────────── tidyverse 1.3.0 ──
## ✓ ggplot2 3.3.3 ✓ purrr 0.3.4
## ✓ tibble 3.0.4 ✓ dplyr 1.0.2
## ✓ tidyr 1.1.2 ✓ stringr 1.4.0
## ✓ readr 1.4.0 ✓ forcats 0.5.1
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## x dplyr::filter() masks stats::filter()
## x dplyr::lag() masks stats::lag()
## Loading required package: xml2
##
## Attaching package: 'rvest'
## The following object is masked from 'package:purrr':
##
## pluck
## The following object is masked from 'package:readr':
##
## guess_encoding
##
## Attaching package: 'jsonlite'
## The following object is masked from 'package:purrr':
##
## flatten
##
## Attaching package: 'RCurl'
## The following object is masked from 'package:tidyr':
##
## complete
IMDb (Internet Movie Database) publishes rankings of movies, presumably based on user scores. The code below reads in data from one such ranking: the all-time top 250 movies.
Reviewing this table initially, I wanted to understand whether and how the age of a film affects its rank. My hypothesis was that older movies fare better, since they have had more time to accrue strong ratings.
Citation: “IMDb Top Rated Movies.” IMDb, IMDb.com, www.imdb.com/chart/top/?ref_=nv_mv_250.
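One way to probe the age hypothesis is a rank correlation between chart position and film age. The sketch below is illustrative only: it uses just the first ten films from the scraped list in this section, and the `reference_year` of 2021 is an assumption rather than something stated in the source. Because Spearman's correlation depends only on ordering, the choice of reference year does not change the result.

```r
# Rank and release year for the first ten films in the scraped list.
top10 <- data.frame(
  rank = 1:10,
  year = c(1994, 1972, 1974, 2008, 1957, 1993, 2003, 1994, 1966, 2001)
)

# Assumption: age is measured against 2021 (hypothetical reference year).
reference_year <- 2021
top10$age <- reference_year - top10$year

# Spearman is used because rank is ordinal; ties in age get average ranks.
rho <- cor(top10$rank, top10$age, method = "spearman")
print(rho)
```

A negative `rho` would mean older films tend to sit nearer the top of the chart (smaller rank numbers), which is the direction the hypothesis predicts. Ten films are far too few to draw a conclusion; the same call applies unchanged to all 250 rows.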
## {html_document}
## <html xmlns:og="http://ogp.me/ns#" xmlns:fb="http://www.facebook.com/2008/fbml">
## [1] <head>\n<meta http-equiv="Content-Type" content="text/html; charset=UTF-8 ...
## [2] <body id="styleguide-v2" class="fixed">\n <img height="1" widt ...
## [1] "\n 1.\n The Shawshank Redemption\n (1994)\n "
## [2] "\n 2.\n The Godfather\n (1972)\n "
## [3] "\n 3.\n The Godfather: Part II\n (1974)\n "
## [4] "\n 4.\n The Dark Knight\n (2008)\n "
## [5] "\n 5.\n 12 Angry Men\n (1957)\n "
## [6] "\n 6.\n Schindler's List\n (1993)\n "
## [7] "\n 7.\n The Lord of the Rings: The Return of the King\n (2003)\n "
## [8] "\n 8.\n Pulp Fiction\n (1994)\n "
## [9] "\n 9.\n The Good, the Bad and the Ugly\n (1966)\n "
## [10] "\n 10.\n The Lord of the Rings: The Fellowship of the Ring\n (2001)\n "
## [11] "\n 11.\n Fight Club\n (1999)\n "
## [12] "\n 12.\n Forrest Gump\n (1994)\n "
## [13] "\n 13.\n Inception\n (2010)\n "
## [14] "\n 14.\n The Lord of the Rings: The Two Towers\n (2002)\n "
## [15] "\n 15.\n Star Wars: Episode V - The Empire Strikes Back\n (1980)\n "
## [16] "\n 16.\n The Matrix\n (1999)\n "
## [17] "\n 17.\n Goodfellas\n (1990)\n "
## [18] "\n 18.\n One Flew Over the Cuckoo's Nest\n (1975)\n "
## [19] "\n 19.\n Seven Samurai\n (1954)\n "
## [20] "\n 20.\n Se7en\n (1995)\n "
## [21] "\n 21.\n Life Is Beautiful\n (1997)\n "
## [22] "\n 22.\n City of God\n (2002)\n "
## [23] "\n 23.\n The Silence of the Lambs\n (1991)\n "
## [24] "\n 24.\n It's a Wonderful Life\n (1946)\n "
## [25] "\n 25.\n Star Wars: Episode IV - A New Hope\n (1977)\n "
## [26] "\n 26.\n Saving Private Ryan\n (1998)\n "
## [27] "\n 27.\n The Green Mile\n (1999)\n "
## [28] "\n 28.\n Spirited Away\n (2001)\n "
## [29] "\n 29.\n Interstellar\n (2014)\n "
## [30] "\n 30.\n Parasite\n (2019)\n "
## [31] "\n 31.\n Léon: The Professional\n (1994)\n "
## [32] "\n 32.\n Hara-Kiri\n (1962)\n "
## [33] "\n 33.\n The Usual Suspects\n (1995)\n "
## [34] "\n 34.\n The Lion King\n (1994)\n "
## [35] "\n 35.\n The Pianist\n (2002)\n "
## [36] "\n 36.\n Back to the Future\n (1985)\n "
## [37] "\n 37.\n Terminator 2: Judgment Day\n (1991)\n "
## [38] "\n 38.\n American History X\n (1998)\n "
## [39] "\n 39.\n Modern Times\n (1936)\n "
## [40] "\n 40.\n Gladiator\n (2000)\n "
## [41] "\n 41.\n Psycho\n (1960)\n "
## [42] "\n 42.\n The Departed\n (2006)\n "
## [43] "\n 43.\n City Lights\n (1931)\n "
## [44] "\n 44.\n The Intouchables\n (2011)\n "
## [45] "\n 45.\n Whiplash\n (2014)\n "
## [46] "\n 46.\n Grave of the Fireflies\n (1988)\n "
## [47] "\n 47.\n The Prestige\n (2006)\n "
## [48] "\n 48.\n Once Upon a Time in the West\n (1968)\n "
## [49] "\n 49.\n Casablanca\n (1942)\n "
## [50] "\n 50.\n Cinema Paradiso\n (1988)\n "
## [51] "\n 51.\n Rear Window\n (1954)\n "
## [52] "\n 52.\n Alien\n (1979)\n "
## [53] "\n 53.\n Apocalypse Now\n (1979)\n "
## [54] "\n 54.\n Memento\n (2000)\n "
## [55] "\n 55.\n The Great Dictator\n (1940)\n "
## [56] "\n 56.\n Raiders of the Lost Ark\n (1981)\n "
## [57] "\n 57.\n Django Unchained\n (2012)\n "
## [58] "\n 58.\n The Lives of Others\n (2006)\n "
## [59] "\n 59.\n Hamilton\n (2020)\n "
## [60] "\n 60.\n Paths of Glory\n (1957)\n "
## [61] "\n 61.\n Joker\n (2019)\n "
## [62] "\n 62.\n WALL·E\n (2008)\n "
## [63] "\n 63.\n The Shining\n (1980)\n "
## [64] "\n 64.\n Avengers: Infinity War\n (2018)\n "
## [65] "\n 65.\n Sunset Blvd.\n (1950)\n "
## [66] "\n 66.\n Witness for the Prosecution\n (1957)\n "
## [67] "\n 67.\n Oldboy\n (2003)\n "
## [68] "\n 68.\n Spider-Man: Into the Spider-Verse\n (2018)\n "
## [69] "\n 69.\n Princess Mononoke\n (1997)\n "
## [70] "\n 70.\n Dr. Strangelove or: How I Learned to Stop Worrying and Love the Bomb\n (1964)\n "
## [71] "\n 71.\n The Dark Knight Rises\n (2012)\n "
## [72] "\n 72.\n Once Upon a Time in America\n (1984)\n "
## [73] "\n 73.\n Your Name.\n (2016)\n "
## [74] "\n 74.\n Aliens\n (1986)\n "
## [75] "\n 75.\n Coco\n (2017)\n "
## [76] "\n 76.\n Avengers: Endgame\n (2019)\n "
## [77] "\n 77.\n Capharnaüm\n (2018)\n "
## [78] "\n 78.\n American Beauty\n (1999)\n "
## [79] "\n 79.\n Braveheart\n (1995)\n "
## [80] "\n 80.\n Das Boot\n (1981)\n "
## [81] "\n 81.\n High and Low\n (1963)\n "
## [82] "\n 82.\n Toy Story\n (1995)\n "
## [83] "\n 83.\n 3 Idiots\n (2009)\n "
## [84] "\n 84.\n Amadeus\n (1984)\n "
## [85] "\n 85.\n Inglourious Basterds\n (2009)\n "
## [86] "\n 86.\n Good Will Hunting\n (1997)\n "
## [87] "\n 87.\n Star Wars: Episode VI - Return of the Jedi\n (1983)\n "
## [88] "\n 88.\n Taare Zameen Par\n (2007)\n "
## [89] "\n 89.\n Reservoir Dogs\n (1992)\n "
## [90] "\n 90.\n 2001: A Space Odyssey\n (1968)\n "
## [91] "\n 91.\n Requiem for a Dream\n (2000)\n "
## [92] "\n 92.\n The Hunt\n (2012)\n "
## [93] "\n 93.\n Vertigo\n (1958)\n "
## [94] "\n 94.\n M\n (1931)\n "
## [95] "\n 95.\n Eternal Sunshine of the Spotless Mind\n (2004)\n "
## [96] "\n 96.\n Citizen Kane\n (1941)\n "
## [97] "\n 97.\n Dangal\n (2016)\n "
## [98] "\n 98.\n Singin' in the Rain\n (1952)\n "
## [99] "\n 99.\n Bicycle Thieves\n (1948)\n "
## [100] "\n 100.\n The Kid\n (1921)\n "
## [101] "\n 101.\n Full Metal Jacket\n (1987)\n "
## [102] "\n 102.\n Come and See\n (1985)\n "
## [103] "\n 103.\n Snatch\n (2000)\n "
## [104] "\n 104.\n North by Northwest\n (1959)\n "
## [105] "\n 105.\n Ikiru\n (1952)\n "
## [106] "\n 106.\n A Clockwork Orange\n (1971)\n "
## [107] "\n 107.\n Scarface\n (1983)\n "
## [108] "\n 108.\n 1917\n (2019)\n "
## [109] "\n 109.\n Taxi Driver\n (1976)\n "
## [110] "\n 110.\n Incendies\n (2010)\n "
## [111] "\n 111.\n A Separation\n (2011)\n "
## [112] "\n 112.\n Toy Story 3\n (2010)\n "
## [113] "\n 113.\n The Sting\n (1973)\n "
## [114] "\n 114.\n Lawrence of Arabia\n (1962)\n "
## [115] "\n 115.\n Amélie\n (2001)\n "
## [116] "\n 116.\n Metropolis\n (1927)\n "
## [117] "\n 117.\n The Apartment\n (1960)\n "
## [118] "\n 118.\n For a Few Dollars More\n (1965)\n "
## [119] "\n 119.\n Double Indemnity\n (1944)\n "
## [120] "\n 120.\n To Kill a Mockingbird\n (1962)\n "
## [121] "\n 121.\n Up\n (2009)\n "
## [122] "\n 122.\n Indiana Jones and the Last Crusade\n (1989)\n "
## [123] "\n 123.\n Heat\n (1995)\n "
## [124] "\n 124.\n L.A. Confidential\n (1997)\n "
## [125] "\n 125.\n Green Book\n (2018)\n "
## [126] "\n 126.\n Die Hard\n (1988)\n "
## [127] "\n 127.\n Monty Python and the Holy Grail\n (1975)\n "
## [128] "\n 128.\n Batman Begins\n (2005)\n "
## [129] "\n 129.\n Yojimbo\n (1961)\n "
## [130] "\n 130.\n Rashômon\n (1950)\n "
## [131] "\n 131.\n Downfall\n (2004)\n "
## [132] "\n 132.\n Children of Heaven\n (1997)\n "
## [133] "\n 133.\n Unforgiven\n (1992)\n "
## [134] "\n 134.\n Ran\n (1985)\n "
## [135] "\n 135.\n Some Like It Hot\n (1959)\n "
## [136] "\n 136.\n Howl's Moving Castle\n (2004)\n "
## [137] "\n 137.\n All About Eve\n (1950)\n "
## [138] "\n 138.\n Casino\n (1995)\n "
## [139] "\n 139.\n A Beautiful Mind\n (2001)\n "
## [140] "\n 140.\n The Wolf of Wall Street\n (2013)\n "
## [141] "\n 141.\n The Great Escape\n (1963)\n "
## [142] "\n 142.\n Pan's Labyrinth\n (2006)\n "
## [143] "\n 143.\n There Will Be Blood\n (2007)\n "
## [144] "\n 144.\n The Secret in Their Eyes\n (2009)\n "
## [145] "\n 145.\n Lock, Stock and Two Smoking Barrels\n (1998)\n "
## [146] "\n 146.\n Judgment at Nuremberg\n (1961)\n "
## [147] "\n 147.\n My Neighbor Totoro\n (1988)\n "
## [148] "\n 148.\n Raging Bull\n (1980)\n "
## [149] "\n 149.\n The Treasure of the Sierra Madre\n (1948)\n "
## [150] "\n 150.\n Dial M for Murder\n (1954)\n "
## [151] "\n 151.\n Three Billboards Outside Ebbing, Missouri\n (2017)\n "
## [152] "\n 152.\n Shutter Island\n (2010)\n "
## [153] "\n 153.\n Chinatown\n (1974)\n "
## [154] "\n 154.\n The Gold Rush\n (1925)\n "
## [155] "\n 155.\n Babam ve Oglum\n (2005)\n "
## [156] "\n 156.\n No Country for Old Men\n (2007)\n "
## [157] "\n 157.\n V for Vendetta\n (2005)\n "
## [158] "\n 158.\n Inside Out\n (2015)\n "
## [159] "\n 159.\n The Thing\n (1982)\n "
## [160] "\n 160.\n The Elephant Man\n (1980)\n "
## [161] "\n 161.\n The Seventh Seal\n (1957)\n "
## [162] "\n 162.\n Warrior\n (2011)\n "
## [163] "\n 163.\n The Sixth Sense\n (1999)\n "
## [164] "\n 164.\n Jurassic Park\n (1993)\n "
## [165] "\n 165.\n Klaus\n (2019)\n "
## [166] "\n 166.\n Trainspotting\n (1996)\n "
## [167] "\n 167.\n The Truman Show\n (1998)\n "
## [168] "\n 168.\n Gone with the Wind\n (1939)\n "
## [169] "\n 169.\n Finding Nemo\n (2003)\n "
## [170] "\n 170.\n Stalker\n (1979)\n "
## [171] "\n 171.\n Wild Strawberries\n (1957)\n "
## [172] "\n 172.\n Kill Bill: Vol. 1\n (2003)\n "
## [173] "\n 173.\n Memories of Murder\n (2003)\n "
## [174] "\n 174.\n Blade Runner\n (1982)\n "
## [175] "\n 175.\n The Bridge on the River Kwai\n (1957)\n "
## [176] "\n 176.\n Fargo\n (1996)\n "
## [177] "\n 177.\n Room\n (2015)\n "
## [178] "\n 178.\n Wild Tales\n (2014)\n "
## [179] "\n 179.\n Gran Torino\n (2008)\n "
## [180] "\n 180.\n Tokyo Story\n (1953)\n "
## [181] "\n 181.\n The Third Man\n (1949)\n "
## [182] "\n 182.\n On the Waterfront\n (1954)\n "
## [183] "\n 183.\n The Deer Hunter\n (1978)\n "
## [184] "\n 184.\n In the Name of the Father\n (1993)\n "
## [185] "\n 185.\n Mary and Max\n (2009)\n "
## [186] "\n 186.\n The Grand Budapest Hotel\n (2014)\n "
## [187] "\n 187.\n Before Sunrise\n (1995)\n "
## [188] "\n 188.\n Gone Girl\n (2014)\n "
## [189] "\n 189.\n Catch Me If You Can\n (2002)\n "
## [190] "\n 190.\n Hacksaw Ridge\n (2016)\n "
## [191] "\n 191.\n Prisoners\n (2013)\n "
## [192] "\n 192.\n Persona\n (1966)\n "
## [193] "\n 193.\n Andhadhun\n (2018)\n "
## [194] "\n 194.\n Sherlock Jr.\n (1924)\n "
## [195] "\n 195.\n The Big Lebowski\n (1998)\n "
## [196] "\n 196.\n To Be or Not to Be\n (1942)\n "
## [197] "\n 197.\n Barry Lyndon\n (1975)\n "
## [198] "\n 198.\n The General\n (1926)\n "
## [199] "\n 199.\n How to Train Your Dragon\n (2010)\n "
## [200] "\n 200.\n Ford v Ferrari\n (2019)\n "
## [201] "\n 201.\n Autumn Sonata\n (1978)\n "
## [202] "\n 202.\n 12 Years a Slave\n (2013)\n "
## [203] "\n 203.\n The Bandit\n (1996)\n "
## [204] "\n 204.\n Anand\n (1971)\n "
## [205] "\n 205.\n Mr. Smith Goes to Washington\n (1939)\n "
## [206] "\n 206.\n Mad Max: Fury Road\n (2015)\n "
## [207] "\n 207.\n Raatchasan\n (2018)\n "
## [208] "\n 208.\n Dead Poets Society\n (1989)\n "
## [209] "\n 209.\n Million Dollar Baby\n (2004)\n "
## [210] "\n 210.\n Stand by Me\n (1986)\n "
## [211] "\n 211.\n Harry Potter and the Deathly Hallows: Part 2\n (2011)\n "
## [212] "\n 212.\n Network\n (1976)\n "
## [213] "\n 213.\n Ben-Hur\n (1959)\n "
## [214] "\n 214.\n Hachi: A Dog's Tale\n (2009)\n "
## [215] "\n 215.\n Cool Hand Luke\n (1967)\n "
## [216] "\n 216.\n The Handmaiden\n (2016)\n "
## [217] "\n 217.\n Logan\n (2017)\n "
## [218] "\n 218.\n Platoon\n (1986)\n "
## [219] "\n 219.\n Into the Wild\n (2007)\n "
## [220] "\n 220.\n Rush\n (2013)\n "
## [221] "\n 221.\n The Wages of Fear\n (1953)\n "
## [222] "\n 222.\n Soul\n (2020)\n "
## [223] "\n 223.\n Monty Python's Life of Brian\n (1979)\n "
## [224] "\n 224.\n La Haine\n (1995)\n "
## [225] "\n 225.\n The 400 Blows\n (1959)\n "
## [226] "\n 226.\n The Passion of Joan of Arc\n (1928)\n "
## [227] "\n 227.\n Spotlight\n (2015)\n "
## [228] "\n 228.\n Hotel Rwanda\n (2004)\n "
## [229] "\n 229.\n Amores Perros\n (2000)\n "
## [230] "\n 230.\n Gangs of Wasseypur\n (2012)\n "
## [231] "\n 231.\n Andrei Rublev\n (1966)\n "
## [232] "\n 232.\n Monsters, Inc.\n (2001)\n "
## [233] "\n 233.\n Rocky\n (1976)\n "
## [234] "\n 234.\n Nausicaä of the Valley of the Wind\n (1984)\n "
## [235] "\n 235.\n Rebecca\n (1940)\n "
## [236] "\n 236.\n Time of the Gypsies\n (1988)\n "
## [237] "\n 237.\n Before Sunset\n (2004)\n "
## [238] "\n 238.\n In the Mood for Love\n (2000)\n "
## [239] "\n 239.\n Rififi\n (1955)\n "
## [240] "\n 240.\n Rang De Basanti\n (2006)\n "
## [241] "\n 241.\n Paris, Texas\n (1984)\n "
## [242] "\n 242.\n Drishyam\n (2013)\n "
## [243] "\n 243.\n Portrait of a Lady on Fire\n (2019)\n "
## [244] "\n 244.\n It Happened One Night\n (1934)\n "
## [245] "\n 245.\n The Invisible Guest\n (2016)\n "
## [246] "\n 246.\n A Silent Voice: The Movie\n (2016)\n "
## [247] "\n 247.\n Three Colors: Red\n (1994)\n "
## [248] "\n 248.\n The Battle of Algiers\n (1966)\n "
## [249] "\n 249.\n Neon Genesis Evangelion: The End of Evangelion\n (1997)\n "
## [250] "\n 250.\n The Help\n (2011)\n "
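Each raw string above packs the rank, title, and release year together with stray whitespace. A minimal sketch of splitting them into columns with base R follows; the names `raw_titles`, `pattern`, and `movies` are hypothetical, and this is not necessarily how the original code performs the step.

```r
# Three raw strings copied from the output above.
raw_titles <- c(
  "\n 1.\n The Shawshank Redemption\n (1994)\n ",
  "\n 2.\n The Godfather\n (1972)\n ",
  "\n 70.\n Dr. Strangelove or: How I Learned to Stop Worrying and Love the Bomb\n (1964)\n "
)

# Rank digits, a dot, the title (lazy match), then a four-digit year in
# parentheses. \\s* absorbs whatever whitespace surrounds each piece.
pattern <- "^\\s*(\\d+)\\.\\s*(.*?)\\s*\\((\\d{4})\\)\\s*$"

# regexec() + regmatches() return the full match followed by each capture group.
parts <- regmatches(raw_titles, regexec(pattern, raw_titles, perl = TRUE))

movies <- data.frame(
  rank  = as.integer(vapply(parts, `[`, character(1), 2)),
  title = vapply(parts, `[`, character(1), 3),
  year  = as.integer(vapply(parts, `[`, character(1), 4)),
  stringsAsFactors = FALSE
)
print(movies)
```

Anchoring the rank at the start and the year at the end keeps titles that contain dots or digits intact, such as "Dr. Strangelove" or "1917".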
## [1] "\n 9.2\n "
## [2] "\n \n \n \n 12345678910\n \n \n \n NOT YET RELEASED\n \n \n Seen\n \n \n "
## [3] "\n 9.1\n "
## [4] "\n \n \n \n 12345678910\n \n \n \n NOT YET RELEASED\n \n \n Seen\n \n \n "
## [5] "\n 9.0\n "
## [6] "\n \n \n \n 12345678910\n \n \n \n NOT YET RELEASED\n \n \n Seen\n \n \n "
## [7] "\n 9.0\n "
## [8] "\n \n \n \n 12345678910\n \n \n \n NOT YET RELEASED\n \n \n Seen\n \n \n "
## [9] "\n 8.9\n "
## [10] "\n \n \n \n 12345678910\n \n \n \n NOT YET RELEASED\n \n \n Seen\n \n \n "
## [11] "\n 8.9\n "
## [12] "\n \n \n \n 12345678910\n \n \n \n NOT YET RELEASED\n \n \n Seen\n \n \n "
## [13] "\n 8.9\n "
## [14] "\n \n \n \n 12345678910\n \n \n \n NOT YET RELEASED\n \n \n Seen\n \n \n "
## [15] "\n 8.8\n "
## [16] "\n \n \n \n 12345678910\n \n \n \n NOT YET RELEASED\n \n \n Seen\n \n \n "
## [17] "\n 8.8\n "
## [18] "\n \n \n \n 12345678910\n \n \n \n NOT YET RELEASED\n \n \n Seen\n \n \n "
## [19] "\n 8.8\n "
## [20] "\n \n \n \n 12345678910\n \n \n \n NOT YET RELEASED\n \n \n Seen\n \n \n "
## [21] "\n 8.8\n "
## [22] "\n \n \n \n 12345678910\n \n \n \n NOT YET RELEASED\n \n \n Seen\n \n \n "
## [23] "\n 8.8\n "
## [24] "\n \n \n \n 12345678910\n \n \n \n NOT YET RELEASED\n \n \n Seen\n \n \n "
## [25] "\n 8.7\n "
## [26] "\n \n \n \n 12345678910\n \n \n \n NOT YET RELEASED\n \n \n Seen\n \n \n "
## [27] "\n 8.7\n "
## [28] "\n \n \n \n 12345678910\n \n \n \n NOT YET RELEASED\n \n \n Seen\n \n \n "
## [29] "\n 8.7\n "
## [30] "\n \n \n \n 12345678910\n \n \n \n NOT YET RELEASED\n \n \n Seen\n \n \n "
## [31] "\n 8.6\n "
## [32] "\n \n \n \n 12345678910\n \n \n \n NOT YET RELEASED\n \n \n Seen\n \n \n "
## [33] "\n 8.6\n "
## [34] "\n \n \n \n 12345678910\n \n \n \n NOT YET RELEASED\n \n \n Seen\n \n \n "
## [35] "\n 8.6\n "
## [36] "\n \n \n \n 12345678910\n \n \n \n NOT YET RELEASED\n \n \n Seen\n \n \n "
## [37] "\n 8.6\n "
## [38] "\n \n \n \n 12345678910\n \n \n \n NOT YET RELEASED\n \n \n Seen\n \n \n "
## [39] "\n 8.6\n "
## [40] "\n \n \n \n 12345678910\n \n \n \n NOT YET RELEASED\n \n \n Seen\n \n \n "
## [41] "\n 8.6\n "
## [42] "\n \n \n \n 12345678910\n \n \n \n NOT YET RELEASED\n \n \n Seen\n \n \n "
## [43] "\n 8.6\n "
## [44] "\n \n \n \n 12345678910\n \n \n \n NOT YET RELEASED\n \n \n Seen\n \n \n "
## [45] "\n 8.6\n "
## [46] "\n \n \n \n 12345678910\n \n \n \n NOT YET RELEASED\n \n \n Seen\n \n \n "
## [47] "\n 8.6\n "
## [48] "\n \n \n \n 12345678910\n \n \n \n NOT YET RELEASED\n \n \n Seen\n \n \n "
## [49] "\n 8.6\n "
## [50] "\n \n \n \n 12345678910\n \n \n \n NOT YET RELEASED\n \n \n Seen\n \n \n "
## [51] "\n 8.6\n "
## [52] "\n \n \n \n 12345678910\n \n \n \n NOT YET RELEASED\n \n \n Seen\n \n \n "
## [53] "\n 8.5\n "
## [54] "\n \n \n \n 12345678910\n \n \n \n NOT YET RELEASED\n \n \n Seen\n \n \n "
## [55] "\n 8.5\n "
## [56] "\n \n \n \n 12345678910\n \n \n \n NOT YET RELEASED\n \n \n Seen\n \n \n "
## [57] "\n 8.5\n "
## [58] "\n \n \n \n 12345678910\n \n \n \n NOT YET RELEASED\n \n \n Seen\n \n \n "
## [59] "\n 8.5\n "
## [60] "\n \n \n \n 12345678910\n \n \n \n NOT YET RELEASED\n \n \n Seen\n \n \n "
## [61] "\n 8.5\n "
## [62] "\n \n \n \n 12345678910\n \n \n \n NOT YET RELEASED\n \n \n Seen\n \n \n "
## [63] "\n 8.5\n "
## [64] "\n \n \n \n 12345678910\n \n \n \n NOT YET RELEASED\n \n \n Seen\n \n \n "
## [65] "\n 8.5\n "
## [66] "\n \n \n \n 12345678910\n \n \n \n NOT YET RELEASED\n \n \n Seen\n \n \n "
## [67] "\n 8.5\n "
## [68] "\n \n \n \n 12345678910\n \n \n \n NOT YET RELEASED\n \n \n Seen\n \n \n "
## [69] "\n 8.5\n "
## [70] "\n \n \n \n 12345678910\n \n \n \n NOT YET RELEASED\n \n \n Seen\n \n \n "
## [71] "\n 8.5\n "
## [72] "\n \n \n \n 12345678910\n \n \n \n NOT YET RELEASED\n \n \n Seen\n \n \n "
## [73] "\n 8.5\n "
## [74] "\n \n \n \n 12345678910\n \n \n \n NOT YET RELEASED\n \n \n Seen\n \n \n "
## [75] "\n 8.5\n "
## [76] "\n \n \n \n 12345678910\n \n \n \n NOT YET RELEASED\n \n \n Seen\n \n \n "
## [77] "\n 8.5\n "
## [78] "\n \n \n \n 12345678910\n \n \n \n NOT YET RELEASED\n \n \n Seen\n \n \n "
## [79] "\n 8.5\n "
## [80] "\n \n \n \n 12345678910\n \n \n \n NOT YET RELEASED\n \n \n Seen\n \n \n "
## [81] "\n 8.5\n "
## [82] "\n \n \n \n 12345678910\n \n \n \n NOT YET RELEASED\n \n \n Seen\n \n \n "
## [83] "\n 8.5\n "
## [84] "\n \n \n \n 12345678910\n \n \n \n NOT YET RELEASED\n \n \n Seen\n \n \n "
## [85] "\n 8.5\n "
## [86] "\n \n \n \n 12345678910\n \n \n \n NOT YET RELEASED\n \n \n Seen\n \n \n "
## [87] "\n 8.5\n "
## [88] "\n \n \n \n 12345678910\n \n \n \n NOT YET RELEASED\n \n \n Seen\n \n \n "
## [89] "\n 8.5\n "
## [90] "\n \n \n \n 12345678910\n \n \n \n NOT YET RELEASED\n \n \n Seen\n \n \n "
## [91] "\n 8.5\n "
## [92] "\n \n \n \n 12345678910\n \n \n \n NOT YET RELEASED\n \n \n Seen\n \n \n "
## [93] "\n 8.5\n "
## [94] "\n \n \n \n 12345678910\n \n \n \n NOT YET RELEASED\n \n \n Seen\n \n \n "
## [95] "\n 8.4\n "
## [96] "\n \n \n \n 12345678910\n \n \n \n NOT YET RELEASED\n \n \n Seen\n \n \n "
## [97] "\n 8.4\n "
## [98] "\n \n \n \n 12345678910\n \n \n \n NOT YET RELEASED\n \n \n Seen\n \n \n "
## [99] "\n 8.4\n "
## [100] "\n \n \n \n 12345678910\n \n \n \n NOT YET RELEASED\n \n \n Seen\n \n \n "
## [101] "\n 8.4\n "
## [102] "\n \n \n \n 12345678910\n \n \n \n NOT YET RELEASED\n \n \n Seen\n \n \n "
## [103] "\n 8.4\n "
## [104] "\n \n \n \n 12345678910\n \n \n \n NOT YET RELEASED\n \n \n Seen\n \n \n "
## [105] "\n 8.4\n "
## [106] "\n \n \n \n 12345678910\n \n \n \n NOT YET RELEASED\n \n \n Seen\n \n \n "
## [107] "\n 8.4\n "
## [108] "\n \n \n \n 12345678910\n \n \n \n NOT YET RELEASED\n \n \n Seen\n \n \n "
## [109] "\n 8.4\n "
## [110] "\n \n \n \n 12345678910\n \n \n \n NOT YET RELEASED\n \n \n Seen\n \n \n "
## [111] "\n 8.4\n "
## [112] "\n \n \n \n 12345678910\n \n \n \n NOT YET RELEASED\n \n \n Seen\n \n \n "
## [113] "\n 8.4\n "
## [114] "\n \n \n \n 12345678910\n \n \n \n NOT YET RELEASED\n \n \n Seen\n \n \n "
## [115] "\n 8.4\n "
## [116] "\n \n \n \n 12345678910\n \n \n \n NOT YET RELEASED\n \n \n Seen\n \n \n "
## [117] "\n 8.4\n "
## [118] "\n \n \n \n 12345678910\n \n \n \n NOT YET RELEASED\n \n \n Seen\n \n \n "
## [119] "\n 8.4\n "
## [120] "\n \n \n \n 12345678910\n \n \n \n NOT YET RELEASED\n \n \n Seen\n \n \n "
## [121] "\n 8.4\n "
## [122] "\n \n \n \n 12345678910\n \n \n \n NOT YET RELEASED\n \n \n Seen\n \n \n "
## [123] "\n 8.4\n "
## [124] "\n \n \n \n 12345678910\n \n \n \n NOT YET RELEASED\n \n \n Seen\n \n \n "
## [125] "\n 8.4\n "
## [126] "\n \n \n \n 12345678910\n \n \n \n NOT YET RELEASED\n \n \n Seen\n \n \n "
## [127] "\n 8.4\n "
## [128] "\n \n \n \n 12345678910\n \n \n \n NOT YET RELEASED\n \n \n Seen\n \n \n "
## [129] "\n 8.4\n "
## [130] "\n \n \n \n 12345678910\n \n \n \n NOT YET RELEASED\n \n \n Seen\n \n \n "
## [131] "\n 8.4\n "
## [132] "\n \n \n \n 12345678910\n \n \n \n NOT YET RELEASED\n \n \n Seen\n \n \n "
## [133] "\n 8.3\n "
## [134] "\n \n \n \n 12345678910\n \n \n \n NOT YET RELEASED\n \n \n Seen\n \n \n "
## [135] "\n 8.3\n "
## [136] "\n \n \n \n 12345678910\n \n \n \n NOT YET RELEASED\n \n \n Seen\n \n \n "
## [137] "\n 8.3\n "
## [138] "\n \n \n \n 12345678910\n \n \n \n NOT YET RELEASED\n \n \n Seen\n \n \n "
## [139] "\n 8.3\n "
## [140] "\n \n \n \n 12345678910\n \n \n \n NOT YET RELEASED\n \n \n Seen\n \n \n "
## [141] "\n 8.3\n "
## [142] "\n \n \n \n 12345678910\n \n \n \n NOT YET RELEASED\n \n \n Seen\n \n \n "
## [143] "\n 8.3\n "
## [144] "\n \n \n \n 12345678910\n \n \n \n NOT YET RELEASED\n \n \n Seen\n \n \n "
## [145] "\n 8.3\n "
## [146] "\n \n \n \n 12345678910\n \n \n \n NOT YET RELEASED\n \n \n Seen\n \n \n "
## [147] "\n 8.3\n "
## [148] "\n \n \n \n 12345678910\n \n \n \n NOT YET RELEASED\n \n \n Seen\n \n \n "
## [149] "\n 8.3\n "
## [150] "\n \n \n \n 12345678910\n \n \n \n NOT YET RELEASED\n \n \n Seen\n \n \n "
## [151] "\n 8.3\n "
## [152] "\n \n \n \n 12345678910\n \n \n \n NOT YET RELEASED\n \n \n Seen\n \n \n "
## [153] "\n 8.3\n "
## [154] "\n \n \n \n 12345678910\n \n \n \n NOT YET RELEASED\n \n \n Seen\n \n \n "
## [155] "\n 8.3\n "
## [156] "\n \n \n \n 12345678910\n \n \n \n NOT YET RELEASED\n \n \n Seen\n \n \n "
## [157] "\n 8.3\n "
## [158] "\n \n \n \n 12345678910\n \n \n \n NOT YET RELEASED\n \n \n Seen\n \n \n "
## [159] "\n 8.3\n "
## [160] "\n \n \n \n 12345678910\n \n \n \n NOT YET RELEASED\n \n \n Seen\n \n \n "
## [161] "\n 8.3\n "
## [162] "\n \n \n \n 12345678910\n \n \n \n NOT YET RELEASED\n \n \n Seen\n \n \n "
## [163] "\n 8.3\n "
## [164] "\n \n \n \n 12345678910\n \n \n \n NOT YET RELEASED\n \n \n Seen\n \n \n "
## [165] "\n 8.3\n "
## [166] "\n \n \n \n 12345678910\n \n \n \n NOT YET RELEASED\n \n \n Seen\n \n \n "
## [167] "\n 8.3\n "
## [168] "\n \n \n \n 12345678910\n \n \n \n NOT YET RELEASED\n \n \n Seen\n \n \n "
## [169] "\n 8.3\n "
## [170] "\n \n \n \n 12345678910\n \n \n \n NOT YET RELEASED\n \n \n Seen\n \n \n "
## [171] "\n 8.3\n "
## [172] "\n \n \n \n 12345678910\n \n \n \n NOT YET RELEASED\n \n \n Seen\n \n \n "
## [173] "\n 8.3\n "
## [174] "\n \n \n \n 12345678910\n \n \n \n NOT YET RELEASED\n \n \n Seen\n \n \n "
## [175] "\n 8.3\n "
## [176] "\n \n \n \n 12345678910\n \n \n \n NOT YET RELEASED\n \n \n Seen\n \n \n "
## [177] "\n 8.3\n "
## [178] "\n \n \n \n 12345678910\n \n \n \n NOT YET RELEASED\n \n \n Seen\n \n \n "
## [179] "\n 8.3\n "
## [180] "\n \n \n \n 12345678910\n \n \n \n NOT YET RELEASED\n \n \n Seen\n \n \n "
## [181] "\n 8.3\n "
## [182] "\n \n \n \n 12345678910\n \n \n \n NOT YET RELEASED\n \n \n Seen\n \n \n "
## [183] "\n 8.3\n "
## [184] "\n \n \n \n 12345678910\n \n \n \n NOT YET RELEASED\n \n \n Seen\n \n \n "
## [185] "\n 8.3\n "
## [186] "\n \n \n \n 12345678910\n \n \n \n NOT YET RELEASED\n \n \n Seen\n \n \n "
## [187] "\n 8.3\n "
## [188] "\n \n \n \n 12345678910\n \n \n \n NOT YET RELEASED\n \n \n Seen\n \n \n "
## [189] "\n 8.3\n "
## [190] "\n \n \n \n 12345678910\n \n \n \n NOT YET RELEASED\n \n \n Seen\n \n \n "
## [191] "\n 8.3\n "
## [192] "\n \n \n \n 12345678910\n \n \n \n NOT YET RELEASED\n \n \n Seen\n \n \n "
## [193] "\n 8.3\n "
## [194] "\n \n \n \n 12345678910\n \n \n \n NOT YET RELEASED\n \n \n Seen\n \n \n "
## [195] "\n 8.2\n "
## [196] "\n \n \n \n 12345678910\n \n \n \n NOT YET RELEASED\n \n \n Seen\n \n \n "
## [197] "\n 8.2\n "
## [198] "\n \n \n \n 12345678910\n \n \n \n NOT YET RELEASED\n \n \n Seen\n \n \n "
## [199] "\n 8.2\n "
## [200] "\n \n \n \n 12345678910\n \n \n \n NOT YET RELEASED\n \n \n Seen\n \n \n "
## [201] "\n 8.2\n "
## [202] "\n \n \n \n 12345678910\n \n \n \n NOT YET RELEASED\n \n \n Seen\n \n \n "
## [203] "\n 8.2\n "
## [204] "\n \n \n \n 12345678910\n \n \n \n NOT YET RELEASED\n \n \n Seen\n \n \n "
## [205] "\n 8.2\n "
## [206] "\n \n \n \n 12345678910\n \n \n \n NOT YET RELEASED\n \n \n Seen\n \n \n "
## [207] "\n 8.2\n "
## [208] "\n \n \n \n 12345678910\n \n \n \n NOT YET RELEASED\n \n \n Seen\n \n \n "
## [209] "\n 8.2\n "
## [210] "\n \n \n \n 12345678910\n \n \n \n NOT YET RELEASED\n \n \n Seen\n \n \n "
## [211] "\n 8.2\n "
## [212] "\n \n \n \n 12345678910\n \n \n \n NOT YET RELEASED\n \n \n Seen\n \n \n "
## [213] "\n 8.2\n "
## [214] "\n \n \n \n 12345678910\n \n \n \n NOT YET RELEASED\n \n \n Seen\n \n \n "
## [215] "\n 8.2\n "
## [216] "\n \n \n \n 12345678910\n \n \n \n NOT YET RELEASED\n \n \n Seen\n \n \n "
## [217] "\n 8.2\n "
## [218] "\n \n \n \n 12345678910\n \n \n \n NOT YET RELEASED\n \n \n Seen\n \n \n "
## [219] "\n 8.2\n "
## [220] "\n \n \n \n 12345678910\n \n \n \n NOT YET RELEASED\n \n \n Seen\n \n \n "
## [221] "\n 8.2\n "
## [222] "\n \n \n \n 12345678910\n \n \n \n NOT YET RELEASED\n \n \n Seen\n \n \n "
## [223] "\n 8.2\n "
## [224] "\n \n \n \n 12345678910\n \n \n \n NOT YET RELEASED\n \n \n Seen\n \n \n "
## [225] "\n 8.2\n "
## [226] "\n \n \n \n 12345678910\n \n \n \n NOT YET RELEASED\n \n \n Seen\n \n \n "
## [227] "\n 8.2\n "
## [228] "\n \n \n \n 12345678910\n \n \n \n NOT YET RELEASED\n \n \n Seen\n \n \n "
## [229] "\n 8.2\n "
## [230] "\n \n \n \n 12345678910\n \n \n \n NOT YET RELEASED\n \n \n Seen\n \n \n "
## [231] "\n 8.2\n "
## [232] "\n \n \n \n 12345678910\n \n \n \n NOT YET RELEASED\n \n \n Seen\n \n \n "
## [233] "\n 8.2\n "
## [234] "\n \n \n \n 12345678910\n \n \n \n NOT YET RELEASED\n \n \n Seen\n \n \n "
## [235] "\n 8.2\n "
## [236] "\n \n \n \n 12345678910\n \n \n \n NOT YET RELEASED\n \n \n Seen\n \n \n "
## [237] "\n 8.2\n "
## [238] "\n \n \n \n 12345678910\n \n \n \n NOT YET RELEASED\n \n \n Seen\n \n \n "
## [239] "\n 8.2\n "
## [240] "\n \n \n \n 12345678910\n \n \n \n NOT YET RELEASED\n \n \n Seen\n \n \n "
## [241] "\n 8.2\n "
## [242] "\n \n \n \n 12345678910\n \n \n \n NOT YET RELEASED\n \n \n Seen\n \n \n "
## [243] "\n 8.2\n "
## [244] "\n \n \n \n 12345678910\n \n \n \n NOT YET RELEASED\n \n \n Seen\n \n \n "
## [245] "\n 8.2\n "
## [246] "\n \n \n \n 12345678910\n \n \n \n NOT YET RELEASED\n \n \n Seen\n \n \n "
## [247] "\n 8.2\n "
## [248] "\n \n \n \n 12345678910\n \n \n \n NOT YET RELEASED\n \n \n Seen\n \n \n "
## [249] "\n 8.2\n "
## [250] "\n \n \n \n 12345678910\n \n \n \n NOT YET RELEASED\n \n \n Seen\n \n \n "
## [251] "\n 8.2\n "
## [252] "\n \n \n \n 12345678910\n \n \n \n NOT YET RELEASED\n \n \n Seen\n \n \n "
## [253] "\n 8.2\n "
## [254] "\n \n \n \n 12345678910\n \n \n \n NOT YET RELEASED\n \n \n Seen\n \n \n "
## [255] "\n 8.2\n "
## [256] "\n \n \n \n 12345678910\n \n \n \n NOT YET RELEASED\n \n \n Seen\n \n \n "
## [257] "\n 8.2\n "
## [258] "\n \n \n \n 12345678910\n \n \n \n NOT YET RELEASED\n \n \n Seen\n \n \n "
## [259] "\n 8.2\n "
## [260] "\n \n \n \n 12345678910\n \n \n \n NOT YET RELEASED\n \n \n Seen\n \n \n "
## [261] "\n 8.2\n "
## [262] "\n \n \n \n 12345678910\n \n \n \n NOT YET RELEASED\n \n \n Seen\n \n \n "
## [263] "\n 8.2\n "
## [264] "\n \n \n \n 12345678910\n \n \n \n NOT YET RELEASED\n \n \n Seen\n \n \n "
## [265] "\n 8.2\n "
## [266] "\n \n \n \n 12345678910\n \n \n \n NOT YET RELEASED\n \n \n Seen\n \n \n "
## [267] "\n 8.2\n "
## [268] "\n \n \n \n 12345678910\n \n \n \n NOT YET RELEASED\n \n \n Seen\n \n \n "
## [269] "\n 8.2\n "
## [270] "\n \n \n \n 12345678910\n \n \n \n NOT YET RELEASED\n \n \n Seen\n \n \n "
## [271] "\n 8.2\n "
## [272] "\n \n \n \n 12345678910\n \n \n \n NOT YET RELEASED\n \n \n Seen\n \n \n "
## [273] "\n 8.2\n "
## [274] "\n \n \n \n 12345678910\n \n \n \n NOT YET RELEASED\n \n \n Seen\n \n \n "
## [275] "\n 8.2\n "
## [276] "\n \n \n \n 12345678910\n \n \n \n NOT YET RELEASED\n \n \n Seen\n \n \n "
## [277] "\n 8.2\n "
## [278] "\n \n \n \n 12345678910\n \n \n \n NOT YET RELEASED\n \n \n Seen\n \n \n "
## [279] "\n 8.2\n "
## [280] "\n \n \n \n 12345678910\n \n \n \n NOT YET RELEASED\n \n \n Seen\n \n \n "
## [281] "\n 8.2\n "
## [282] "\n \n \n \n 12345678910\n \n \n \n NOT YET RELEASED\n \n \n Seen\n \n \n "
## [283] "\n 8.2\n "
## [284] "\n \n \n \n 12345678910\n \n \n \n NOT YET RELEASED\n \n \n Seen\n \n \n "
## [285] "\n 8.1\n "
## [286] "\n \n \n \n 12345678910\n \n \n \n NOT YET RELEASED\n \n \n Seen\n \n \n "
## [287] "\n 8.1\n "
## [288] "\n \n \n \n 12345678910\n \n \n \n NOT YET RELEASED\n \n \n Seen\n \n \n "
## [289] "\n 8.1\n "
## [290] "\n \n \n \n 12345678910\n \n \n \n NOT YET RELEASED\n \n \n Seen\n \n \n "
## [291] "\n 8.1\n "
## [292] "\n \n \n \n 12345678910\n \n \n \n NOT YET RELEASED\n \n \n Seen\n \n \n "
## [293] "\n 8.1\n "
## [294] "\n \n \n \n 12345678910\n \n \n \n NOT YET RELEASED\n \n \n Seen\n \n \n "
## [295] "\n 8.1\n "
## [296] "\n \n \n \n 12345678910\n \n \n \n NOT YET RELEASED\n \n \n Seen\n \n \n "
## [297] "\n 8.1\n "
## [298] "\n \n \n \n 12345678910\n \n \n \n NOT YET RELEASED\n \n \n Seen\n \n \n "
## [299] "\n 8.1\n "
## [300] "\n \n \n \n 12345678910\n \n \n \n NOT YET RELEASED\n \n \n Seen\n \n \n "
## [301] "\n 8.1\n "
## [302] "\n \n \n \n 12345678910\n \n \n \n NOT YET RELEASED\n \n \n Seen\n \n \n "
## [303] "\n 8.1\n "
## [304] "\n \n \n \n 12345678910\n \n \n \n NOT YET RELEASED\n \n \n Seen\n \n \n "
## [305] "\n 8.1\n "
## [306] "\n \n \n \n 12345678910\n \n \n \n NOT YET RELEASED\n \n \n Seen\n \n \n "
## [307] "\n 8.1\n "
## [308] "\n \n \n \n 12345678910\n \n \n \n NOT YET RELEASED\n \n \n Seen\n \n \n "
## [309] "\n 8.1\n "
## [310] "\n \n \n \n 12345678910\n \n \n \n NOT YET RELEASED\n \n \n Seen\n \n \n "
## [311] "\n 8.1\n "
## [312] "\n \n \n \n 12345678910\n \n \n \n NOT YET RELEASED\n \n \n Seen\n \n \n "
## [313] "\n 8.1\n "
## [314] "\n \n \n \n 12345678910\n \n \n \n NOT YET RELEASED\n \n \n Seen\n \n \n "
## [315] "\n 8.1\n "
## [316] "\n \n \n \n 12345678910\n \n \n \n NOT YET RELEASED\n \n \n Seen\n \n \n "
## [317] "\n 8.1\n "
## [318] "\n \n \n \n 12345678910\n \n \n \n NOT YET RELEASED\n \n \n Seen\n \n \n "
## [319] "\n 8.1\n "
## [320] "\n \n \n \n 12345678910\n \n \n \n NOT YET RELEASED\n \n \n Seen\n \n \n "
## [321] "\n 8.1\n "
## [322] "\n \n \n \n 12345678910\n \n \n \n NOT YET RELEASED\n \n \n Seen\n \n \n "
## [323] "\n 8.1\n "
## [324] "\n \n \n \n 12345678910\n \n \n \n NOT YET RELEASED\n \n \n Seen\n \n \n "
## [325] "\n 8.1\n "
## [326] "\n \n \n \n 12345678910\n \n \n \n NOT YET RELEASED\n \n \n Seen\n \n \n "
## [327] "\n 8.1\n "
## [328] "\n \n \n \n 12345678910\n \n \n \n NOT YET RELEASED\n \n \n Seen\n \n \n "
## [329] "\n 8.1\n "
## [330] "\n \n \n \n 12345678910\n \n \n \n NOT YET RELEASED\n \n \n Seen\n \n \n "
## [331] "\n 8.1\n "
## [332] "\n \n \n \n 12345678910\n \n \n \n NOT YET RELEASED\n \n \n Seen\n \n \n "
## [333] "\n 8.1\n "
## [334] "\n \n \n \n 12345678910\n \n \n \n NOT YET RELEASED\n \n \n Seen\n \n \n "
## [335] "\n 8.1\n "
## [336] "\n \n \n \n 12345678910\n \n \n \n NOT YET RELEASED\n \n \n Seen\n \n \n "
## [337] "\n 8.1\n "
## [338] "\n \n \n \n 12345678910\n \n \n \n NOT YET RELEASED\n \n \n Seen\n \n \n "
## [339] "\n 8.1\n "
## [340] "\n \n \n \n 12345678910\n \n \n \n NOT YET RELEASED\n \n \n Seen\n \n \n "
## [341] "\n 8.1\n "
## [342] "\n \n \n \n 12345678910\n \n \n \n NOT YET RELEASED\n \n \n Seen\n \n \n "
## [343] "\n 8.1\n "
## [344] "\n \n \n \n 12345678910\n \n \n \n NOT YET RELEASED\n \n \n Seen\n \n \n "
## [345] "\n 8.1\n "
## [346] "\n \n \n \n 12345678910\n \n \n \n NOT YET RELEASED\n \n \n Seen\n \n \n "
## [347] "\n 8.1\n "
## [348] "\n \n \n \n 12345678910\n \n \n \n NOT YET RELEASED\n \n \n Seen\n \n \n "
## [349] "\n 8.1\n "
## [350] "\n \n \n \n 12345678910\n \n \n \n NOT YET RELEASED\n \n \n Seen\n \n \n "
## [351] "\n 8.1\n "
## [352] "\n \n \n \n 12345678910\n \n \n \n NOT YET RELEASED\n \n \n Seen\n \n \n "
## [353] "\n 8.1\n "
## [354] "\n \n \n \n 12345678910\n \n \n \n NOT YET RELEASED\n \n \n Seen\n \n \n "
## [355] "\n 8.1\n "
## [356] "\n \n \n \n 12345678910\n \n \n \n NOT YET RELEASED\n \n \n Seen\n \n \n "
## [357] "\n 8.1\n "
## [358] "\n \n \n \n 12345678910\n \n \n \n NOT YET RELEASED\n \n \n Seen\n \n \n "
## [359] "\n 8.1\n "
## [360] "\n \n \n \n 12345678910\n \n \n \n NOT YET RELEASED\n \n \n Seen\n \n \n "
## [361] "\n 8.1\n "
## [362] "\n \n \n \n 12345678910\n \n \n \n NOT YET RELEASED\n \n \n Seen\n \n \n "
## [363] "\n 8.1\n "
## [364] "\n \n \n \n 12345678910\n \n \n \n NOT YET RELEASED\n \n \n Seen\n \n \n "
## [365] "\n 8.1\n "
## [366] "\n \n \n \n 12345678910\n \n \n \n NOT YET RELEASED\n \n \n Seen\n \n \n "
## [367] "\n 8.1\n "
## [368] "\n \n \n \n 12345678910\n \n \n \n NOT YET RELEASED\n \n \n Seen\n \n \n "
## [369] "\n 8.1\n "
## [370] "\n \n \n \n 12345678910\n \n \n \n NOT YET RELEASED\n \n \n Seen\n \n \n "
## [371] "\n 8.1\n "
## [372] "\n \n \n \n 12345678910\n \n \n \n NOT YET RELEASED\n \n \n Seen\n \n \n "
## [373] "\n 8.1\n "
## [374] "\n \n \n \n 12345678910\n \n \n \n NOT YET RELEASED\n \n \n Seen\n \n \n "
## [375] "\n 8.1\n "
## [376] "\n \n \n \n 12345678910\n \n \n \n NOT YET RELEASED\n \n \n Seen\n \n \n "
## [377] "\n 8.1\n "
## [378] "\n \n \n \n 12345678910\n \n \n \n NOT YET RELEASED\n \n \n Seen\n \n \n "
## [379] "\n 8.1\n "
## [380] "\n \n \n \n 12345678910\n \n \n \n NOT YET RELEASED\n \n \n Seen\n \n \n "
## [381] "\n 8.1\n "
## [382] "\n \n \n \n 12345678910\n \n \n \n NOT YET RELEASED\n \n \n Seen\n \n \n "
## [383] "\n 8.1\n "
## [384] "\n \n \n \n 12345678910\n \n \n \n NOT YET RELEASED\n \n \n Seen\n \n \n "
## [385] "\n 8.1\n "
## [386] "\n \n \n \n 12345678910\n \n \n \n NOT YET RELEASED\n \n \n Seen\n \n \n "
## [387] "\n 8.1\n "
## [388] "\n \n \n \n 12345678910\n \n \n \n NOT YET RELEASED\n \n \n Seen\n \n \n "
## [389] "\n 8.1\n "
## [390] "\n \n \n \n 12345678910\n \n \n \n NOT YET RELEASED\n \n \n Seen\n \n \n "
## [391] "\n 8.1\n "
## [392] "\n \n \n \n 12345678910\n \n \n \n NOT YET RELEASED\n \n \n Seen\n \n \n "
## [393] "\n 8.1\n "
## [394] "\n \n \n \n 12345678910\n \n \n \n NOT YET RELEASED\n \n \n Seen\n \n \n "
## [395] "\n 8.1\n "
## [396] "\n \n \n \n 12345678910\n \n \n \n NOT YET RELEASED\n \n \n Seen\n \n \n "
## [397] "\n 8.1\n "
## [398] "\n \n \n \n 12345678910\n \n \n \n NOT YET RELEASED\n \n \n Seen\n \n \n "
## [399] "\n 8.1\n "
## [400] "\n \n \n \n 12345678910\n \n \n \n NOT YET RELEASED\n \n \n Seen\n \n \n "
## [401] "\n 8.1\n "
## [402] "\n \n \n \n 12345678910\n \n \n \n NOT YET RELEASED\n \n \n Seen\n \n \n "
## [403] "\n 8.1\n "
## [404] "\n \n \n \n 12345678910\n \n \n \n NOT YET RELEASED\n \n \n Seen\n \n \n "
## [405] "\n 8.1\n "
## [406] "\n \n \n \n 12345678910\n \n \n \n NOT YET RELEASED\n \n \n Seen\n \n \n "
## [407] "\n 8.1\n "
## [408] "\n \n \n \n 12345678910\n \n \n \n NOT YET RELEASED\n \n \n Seen\n \n \n "
## [409] "\n 8.1\n "
## [410] "\n \n \n \n 12345678910\n \n \n \n NOT YET RELEASED\n \n \n Seen\n \n \n "
## [411] "\n 8.1\n "
## [412] "\n \n \n \n 12345678910\n \n \n \n NOT YET RELEASED\n \n \n Seen\n \n \n "
## [413] "\n 8.1\n "
## [414] "\n \n \n \n 12345678910\n \n \n \n NOT YET RELEASED\n \n \n Seen\n \n \n "
## [415] "\n 8.1\n "
## [416] "\n \n \n \n 12345678910\n \n \n \n NOT YET RELEASED\n \n \n Seen\n \n \n "
## [417] "\n 8.1\n "
## [418] "\n \n \n \n 12345678910\n \n \n \n NOT YET RELEASED\n \n \n Seen\n \n \n "
## [419] "\n 8.1\n "
## [420] "\n \n \n \n 12345678910\n \n \n \n NOT YET RELEASED\n \n \n Seen\n \n \n "
## [421] "\n 8.1\n "
## [422] "\n \n \n \n 12345678910\n \n \n \n NOT YET RELEASED\n \n \n Seen\n \n \n "
## [423] "\n 8.1\n "
## [424] "\n \n \n \n 12345678910\n \n \n \n NOT YET RELEASED\n \n \n Seen\n \n \n "
## [425] "\n 8.1\n "
## [426] "\n \n \n \n 12345678910\n \n \n \n NOT YET RELEASED\n \n \n Seen\n \n \n "
## [427] "\n 8.1\n "
## [428] "\n \n \n \n 12345678910\n \n \n \n NOT YET RELEASED\n \n \n Seen\n \n \n "
## [429] "\n 8.1\n "
## [430] "\n \n \n \n 12345678910\n \n \n \n NOT YET RELEASED\n \n \n Seen\n \n \n "
## [431] "\n 8.1\n "
## [432] "\n \n \n \n 12345678910\n \n \n \n NOT YET RELEASED\n \n \n Seen\n \n \n "
## [433] "\n 8.1\n "
## [434] "\n \n \n \n 12345678910\n \n \n \n NOT YET RELEASED\n \n \n Seen\n \n \n "
## [435] "\n 8.0\n "
## [436] "\n \n \n \n 12345678910\n \n \n \n NOT YET RELEASED\n \n \n Seen\n \n \n "
## [437] "\n 8.0\n "
## [438] "\n \n \n \n 12345678910\n \n \n \n NOT YET RELEASED\n \n \n Seen\n \n \n "
## [439] "\n 8.0\n "
## [440] "\n \n \n \n 12345678910\n \n \n \n NOT YET RELEASED\n \n \n Seen\n \n \n "
## [441] "\n 8.0\n "
## [442] "\n \n \n \n 12345678910\n \n \n \n NOT YET RELEASED\n \n \n Seen\n \n \n "
## [443] "\n 8.0\n "
## [444] "\n \n \n \n 12345678910\n \n \n \n NOT YET RELEASED\n \n \n Seen\n \n \n "
## [445] "\n 8.0\n "
## [446] "\n \n \n \n 12345678910\n \n \n \n NOT YET RELEASED\n \n \n Seen\n \n \n "
## [447] "\n 8.0\n "
## [448] "\n \n \n \n 12345678910\n \n \n \n NOT YET RELEASED\n \n \n Seen\n \n \n "
## [449] "\n 8.0\n "
## [450] "\n \n \n \n 12345678910\n \n \n \n NOT YET RELEASED\n \n \n Seen\n \n \n "
## [451] "\n 8.0\n "
## [452] "\n \n \n \n 12345678910\n \n \n \n NOT YET RELEASED\n \n \n Seen\n \n \n "
## [453] "\n 8.0\n "
## [454] "\n \n \n \n 12345678910\n \n \n \n NOT YET RELEASED\n \n \n Seen\n \n \n "
## [455] "\n 8.0\n "
## [456] "\n \n \n \n 12345678910\n \n \n \n NOT YET RELEASED\n \n \n Seen\n \n \n "
## [457] "\n 8.0\n "
## [458] "\n \n \n \n 12345678910\n \n \n \n NOT YET RELEASED\n \n \n Seen\n \n \n "
## [459] "\n 8.0\n "
## [460] "\n \n \n \n 12345678910\n \n \n \n NOT YET RELEASED\n \n \n Seen\n \n \n "
## [461] "\n 8.0\n "
## [462] "\n \n \n \n 12345678910\n \n \n \n NOT YET RELEASED\n \n \n Seen\n \n \n "
## [463] "\n 8.0\n "
## [464] "\n \n \n \n 12345678910\n \n \n \n NOT YET RELEASED\n \n \n Seen\n \n \n "
## [465] "\n 8.0\n "
## [466] "\n \n \n \n 12345678910\n \n \n \n NOT YET RELEASED\n \n \n Seen\n \n \n "
## [467] "\n 8.0\n "
## [468] "\n \n \n \n 12345678910\n \n \n \n NOT YET RELEASED\n \n \n Seen\n \n \n "
## [469] "\n 8.0\n "
## [470] "\n \n \n \n 12345678910\n \n \n \n NOT YET RELEASED\n \n \n Seen\n \n \n "
## [471] "\n 8.0\n "
## [472] "\n \n \n \n 12345678910\n \n \n \n NOT YET RELEASED\n \n \n Seen\n \n \n "
## [473] "\n 8.0\n "
## [474] "\n \n \n \n 12345678910\n \n \n \n NOT YET RELEASED\n \n \n Seen\n \n \n "
## [475] "\n 8.0\n "
## [476] "\n \n \n \n 12345678910\n \n \n \n NOT YET RELEASED\n \n \n Seen\n \n \n "
## [477] "\n 8.0\n "
## [478] "\n \n \n \n 12345678910\n \n \n \n NOT YET RELEASED\n \n \n Seen\n \n \n "
## [479] "\n 8.0\n "
## [480] "\n \n \n \n 12345678910\n \n \n \n NOT YET RELEASED\n \n \n Seen\n \n \n "
## [481] "\n 8.0\n "
## [482] "\n \n \n \n 12345678910\n \n \n \n NOT YET RELEASED\n \n \n Seen\n \n \n "
## [483] "\n 8.0\n "
## [484] "\n \n \n \n 12345678910\n \n \n \n NOT YET RELEASED\n \n \n Seen\n \n \n "
## [485] "\n 8.0\n "
## [486] "\n \n \n \n 12345678910\n \n \n \n NOT YET RELEASED\n \n \n Seen\n \n \n "
## [487] "\n 8.0\n "
## [488] "\n \n \n \n 12345678910\n \n \n \n NOT YET RELEASED\n \n \n Seen\n \n \n "
## [489] "\n 8.0\n "
## [490] "\n \n \n \n 12345678910\n \n \n \n NOT YET RELEASED\n \n \n Seen\n \n \n "
## [491] "\n 8.0\n "
## [492] "\n \n \n \n 12345678910\n \n \n \n NOT YET RELEASED\n \n \n Seen\n \n \n "
## [493] "\n 8.0\n "
## [494] "\n \n \n \n 12345678910\n \n \n \n NOT YET RELEASED\n \n \n Seen\n \n \n "
## [495] "\n 8.0\n "
## [496] "\n \n \n \n 12345678910\n \n \n \n NOT YET RELEASED\n \n \n Seen\n \n \n "
## [497] "\n 8.0\n "
## [498] "\n \n \n \n 12345678910\n \n \n \n NOT YET RELEASED\n \n \n Seen\n \n \n "
## [499] "\n 8.0\n "
## [500] "\n \n \n \n 12345678910\n \n \n \n NOT YET RELEASED\n \n \n Seen\n \n \n "
## Warning: Expected 2 pieces. Additional pieces discarded in 8 rows [65, 70, 73,
## 124, 172, 194, 205, 232].
Now we can compare each movie’s release year, rank, and rating. My hypothesis was that there may be a reverse recency bias, where older movies rank higher overall. A quick scatterplot shows no clear trend.
ggplot(data=movie_data,aes(rank,year)) + geom_point(aes(color=rating))
Let’s make this data a little more decipherable by adding some buckets around year and rank.
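# to create some buckets, I'm adding a 'decade' field
movie_data <- movie_data %>%
  mutate(decade = year - (year %% 10))
# and bucketing ranking into groups of 50 (ranks 1-50 -> 0, 51-100 -> 50, ...);
# subtracting 1 first keeps ranks 50, 100, ... in the group they close
movie_data <- movie_data %>%
  mutate(rank_50 = ((rank - 1) %/% 50) * 50)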
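How the bucket boundaries fall depends on the arithmetic used. The sketch below uses made-up years and ranks (not values from the scraped table) to compare a plain modulo bucket with one that shifts the rank by one first. The variable names are illustrative only.

```r
# Hypothetical sample values, chosen to sit on the bucket boundaries
year <- c(1957, 1972, 1994, 2008)
rank <- c(1, 49, 50, 250)

# Decade: drop the last digit of the year
decade <- year - (year %% 10)

# Plain modulo: rank 50 starts a new bucket, and rank 250 gets a bucket of its own
bucket_mod <- rank - (rank %% 50)

# Shifted version: ranks 1-50 share bucket 0, ranks 201-250 share bucket 200
bucket_shift <- ((rank - 1) %/% 50) * 50

print(data.frame(year, decade, rank, bucket_mod, bucket_shift))
```

With 250 ranked movies, the plain modulo form yields six groups (the first with 49 movies, the last with one), while the shifted form yields five groups of 50.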
We can compare the distribution of decades in histograms, one per rank bucket. The top tier (the 0 bucket) has more movies from the 1990s and 2000s than from earlier decades, suggesting there isn’t a reverse recency bias, but perhaps a true recency bias.
ggplot(data = movie_data, aes(x = decade)) +
geom_histogram() + facet_wrap(~rank_50)
## `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.
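The `stat_bin()` message appears because no bin width was given, so ggplot2 falls back to 30 bins. Since `decade` only takes values ten years apart, one option is to set `binwidth = 10` so that each bar covers one decade. The sketch below assumes a small stand-in data frame (`toy_movies` is hypothetical, not the scraped data):

```r
library(ggplot2)

# Hypothetical stand-in for movie_data, with the two bucket columns already added
toy_movies <- data.frame(
  decade  = c(1950, 1970, 1990, 1990, 2000, 2010),
  rank_50 = c(0, 0, 0, 50, 50, 100)
)

# binwidth = 10 gives one bar per decade instead of the default 30 bins
p <- ggplot(data = toy_movies, aes(x = decade)) +
  geom_histogram(binwidth = 10) +
  facet_wrap(~rank_50)
```

Because the decade values are discrete, `geom_bar()` would be another reasonable choice here.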
Joseph Connolly shared this:
On the City of New York’s website, there is a fantastic dataset about squirrels in Central Park. From this, I can tidy it to perform an analysis on how squirrels in the big city live. From here, I can break down the demographics and possibly mimic the 2019 Squirrel Report with an October 2018 version, as the dataset indicates these observations took place during that time.
The data is available for download in CSV format, which I’ve made available on GitHub.
Citation: “2018 Central Park Squirrel Census - Squirrel Data.” NYC Open Data, data.cityofnewyork.us/Environment/2018-Central-Park-Squirrel-Census-Squirrel-Data/vfnx-vebw/data.
First, we read in the CSV.
x <- getURL("http://raw.githubusercontent.com/cmm6/data607-project2/main/squirrel_census.csv",.opts=curlOptions(followlocation = TRUE))
squirrels <- read.csv(text = x, header=TRUE)
Then we can begin constructing a more useful dataset. First, I want to drop some columns without clear value and focus on squirrel demographics per the prompt. I’m also curious about squirrel actions, so I’ll keep those in a separate table.
head(squirrels)
## X Y Unique.Squirrel.ID Hectare Shift Date
## 1 -73.95613 40.79408 37F-PM-1014-03 37F PM 10142018
## 2 -73.96886 40.78378 21B-AM-1019-04 21B AM 10192018
## 3 -73.97428 40.77553 11B-PM-1014-08 11B PM 10142018
## 4 -73.95964 40.79031 32E-PM-1017-14 32E PM 10172018
## 5 -73.97027 40.77621 13E-AM-1017-05 13E AM 10172018
## 6 -73.96836 40.77259 11H-AM-1010-03 11H AM 10102018
## Hectare.Squirrel.Number Age Primary.Fur.Color Highlight.Fur.Color
## 1 3
## 2 4
## 3 8 Gray
## 4 14 Adult Gray
## 5 5 Adult Gray Cinnamon
## 6 3 Adult Cinnamon White
## Combination.of.Primary.and.Highlight.Color
## 1 +
## 2 +
## 3 Gray+
## 4 Gray+
## 5 Gray+Cinnamon
## 6 Cinnamon+White
## Color.notes
## 1
## 2
## 3
## 4 Nothing selected as Primary. Gray selected as Highlights. Made executive adjustments.
## 5
## 6
## Location Above.Ground.Sighter.Measurement Specific.Location Running
## 1 false
## 2 false
## 3 Above Ground 10 false
## 4 false
## 5 Above Ground on tree stump false
## 6 false
## Chasing Climbing Eating Foraging Other.Activities Kuks Quaas Moans
## 1 false false false false false false false
## 2 false false false false false false false
## 3 true false false false false false false
## 4 false false true true false false false
## 5 false false false true false false false
## 6 false false false true false false false
## Tail.flags Tail.twitches Approaches Indifferent Runs.from Other.Interactions
## 1 false false false false false
## 2 false false false false false
## 3 false false false false false
## 4 false false false false true
## 5 false false false false false
## 6 false true false true false
## Lat.Long
## 1 POINT (-73.9561344937861 40.7940823884086)
## 2 POINT (-73.9688574691102 40.7837825208444)
## 3 POINT (-73.97428114848522 40.775533619083)
## 4 POINT (-73.9596413903948 40.7903128889029)
## 5 POINT (-73.9702676472613 40.7762126854894)
## 6 POINT (-73.9683613516225 40.7725908847499)
squirrels_wide <- squirrels %>%
select(Unique.Squirrel.ID, Hectare, Age, Primary.Fur.Color, Highlight.Fur.Color,
Location, Running, Chasing, Climbing, Eating, Foraging
)
colnames(squirrels_wide) <- c('id', 'hectare', 'age', 'primary_color', 'highlight_color', 'location', 'Running', 'Chasing', 'Climbing', 'Eating', 'Foraging')
# Clean up true and false to numeric values: https://stackoverflow.com/questions/14737773/replacing-occurrences-of-a-number-in-multiple-columns-of-data-frame-with-another
squirrels_wide[squirrels_wide == "true" ] <- 1
squirrels_wide[squirrels_wide == "false" ] <- 0
# Create a tidier dataframe of squirrels and their actions
squirrel_actions <- squirrels_wide %>%
pivot_longer(c(`Running`, `Chasing`, `Climbing`, `Eating`, `Foraging`), names_to = "actions", values_to = "num_squirrels") %>%
filter(num_squirrels >0) %>%
select(id, actions)
# Drop action columns for Squirrels dataframe to just have demographic data
squirrels_wide <- squirrels_wide %>%
select(id, hectare, age, primary_color, highlight_color, location)
# Write to CSV
write.csv(squirrels_wide,'squirrels_data.csv',row.names = TRUE)
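A note on the recoding step above: when the flag columns hold the text values "true" and "false", assigning `1` and `0` into them leaves the columns as character ("1" and "0"), not numeric. The later `> 0` filter still works because R compares the values as strings. The sketch below uses a hypothetical two-row data frame to show this, along with one possible alternative that produces real integers:

```r
# Hypothetical flag columns stored as text, as in the head() output above
flags <- data.frame(
  Running = c("true", "false"),
  Eating  = c("false", "true"),
  stringsAsFactors = FALSE
)

# Same replacement pattern as above: the columns stay character
recoded <- flags
recoded[recoded == "true"] <- 1
recoded[recoded == "false"] <- 0
print(class(recoded$Running))

# Alternative (an assumption, not the author's approach): convert each column to 0/1 integers
numeric_flags <- flags
numeric_flags[] <- lapply(flags, function(col) as.integer(col == "true"))
print(colSums(numeric_flags))
```

With integer columns, sums and means per action can be computed directly without a further conversion step.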
First let’s look at the distribution of different squirrel features:
ggplot(data = squirrels_wide, aes(x = age)) +
geom_bar(aes(color=primary_color))
ggplot(data = squirrels_wide, aes(x = location)) +
geom_bar(aes(color=primary_color))
ggplot(data = squirrels_wide, aes(x = primary_color)) +
geom_bar(aes(color=highlight_color))
We can also summarize those with group-bys, learning that most observed squirrels are Adults, with Gray the most common primary color. Cinnamon is the most common recorded highlight, although 1,086 squirrels have no highlight color recorded at all. Most squirrels were observed on the Ground Plane.
squirrels_wide %>%
group_by(age) %>%
summarize(n_squirrels = n())
## `summarise()` ungrouping output (override with `.groups` argument)
## # A tibble: 4 x 2
## age n_squirrels
## <chr> <int>
## 1 "" 121
## 2 "?" 4
## 3 "Adult" 2568
## 4 "Juvenile" 330
squirrels_wide %>%
group_by(location) %>%
summarize(n_squirrels = n())
## `summarise()` ungrouping output (override with `.groups` argument)
## # A tibble: 3 x 2
## location n_squirrels
## <chr> <int>
## 1 "" 64
## 2 "Above Ground" 843
## 3 "Ground Plane" 2116
squirrels_wide %>%
group_by(primary_color) %>%
summarize(n_squirrels = n())
## `summarise()` ungrouping output (override with `.groups` argument)
## # A tibble: 4 x 2
## primary_color n_squirrels
## <chr> <int>
## 1 "" 55
## 2 "Black" 103
## 3 "Cinnamon" 392
## 4 "Gray" 2473
squirrels_wide %>%
group_by(highlight_color) %>%
summarize(n_squirrels = n())
## `summarise()` ungrouping output (override with `.groups` argument)
## # A tibble: 11 x 2
## highlight_color n_squirrels
## <chr> <int>
## 1 "" 1086
## 2 "Black" 34
## 3 "Black, Cinnamon" 9
## 4 "Black, Cinnamon, White" 32
## 5 "Black, White" 10
## 6 "Cinnamon" 767
## 7 "Cinnamon, White" 268
## 8 "Gray" 170
## 9 "Gray, Black" 3
## 10 "Gray, White" 59
## 11 "White" 585
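The raw counts are easier to compare as shares. As a rough sketch, the age and location counts printed above can be copied into named vectors and passed to `prop.table()`. The labels `blank` and `unknown` are my own stand-ins for the empty and "?" categories:

```r
# Counts copied from the group-by output above;
# "blank" and "unknown" are stand-in labels for the "" and "?" categories
age_counts <- c(blank = 121, unknown = 4, Adult = 2568, Juvenile = 330)
location_counts <- c(blank = 64, `Above Ground` = 843, `Ground Plane` = 2116)

# prop.table() divides each count by the total
age_share <- round(prop.table(age_counts), 3)
location_share <- round(prop.table(location_counts), 3)

print(age_share)
print(location_share)
```

Both vectors sum to the same number of observations, which is a quick check that no rows were lost between summaries.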
We can do the same thing easily on our tidier actions dataset, finding that the most common action is Foraging!
squirrel_actions %>%
group_by(actions) %>%
summarize(n_squirrels = n())
## `summarise()` ungrouping output (override with `.groups` argument)
## # A tibble: 5 x 2
## actions n_squirrels
## <chr> <int>
## 1 Chasing 279
## 2 Climbing 658
## 3 Eating 760
## 4 Foraging 1435
## 5 Running 730
ggplot(data = squirrel_actions, aes(x = actions)) +
geom_bar()
Richard Zheng shared this:
One way you could work with this type of data is to transform it into a familiar tabular structure first, using relevant key values. However, you could also conduct the analysis while keeping the data’s format, then dump the results into a tidy, tabular structure.
There wasn’t an explicit analysis question, but I explored my own: for those with violations, how many are critical, and does that differ by establishment type?
Citation: State of New York, health.data.ny.gov/resource/cnih-y5dw.json.
# https://stackoverflow.com/questions/2061897/parse-json-with-r
url <- 'https://health.data.ny.gov/resource/cnih-y5dw.json'
# read url and convert to data.frame
sanitation <- fromJSON(txt=url)
First, we’ll familiarize ourselves with the data and begin tidying. There are a lot of computed columns we don’t need, so we’ll select only the columns we want and tidy from there.
head(sanitation)
## facility address
## 1 DUNKIN DONUTS 1-3-5 PARK STREET, OWEGO
## 2 Tim Hortons 154 Elm Street, Potsdam
## 3 LITTLE GIGGLES 11 QUEEN STREET, LYONS
## 4 WHAT'S THE SCOOP 20 MATTHEWS STREET, GOSHEN
## 5 LITTLE JAVA COFFEE 3425 WINTON PLACE, ROCHESTER
## 6 CHRIST CHURCH SED SITE 20 CARROLL STREET, POUGHKEEPSIE
## date violations total_critical_violations
## 1 2020-03-13T00:00:00.000 No violations found. 0
## 2 2020-08-13T00:00:00.000 No violations found. 0
## 3 2014-01-10T00:00:00.000 No violations found. 0
## 4 2020-08-25T00:00:00.000 No violations found. 0
## 5 2020-10-20T00:00:00.000 No violations found. 0
## 6 2019-07-15T00:00:00.000 No violations found. 0
## total_crit_not_corrected total_noncritical_violations
## 1 0 0
## 2 0 0
## 3 0 0
## 4 0 0
## 5 0 0
## 6 0 0
## description
## 1 Food Service Establishment - Frozen Desserts
## 2 Food Service Establishment - Restaurant
## 3 Institutional Food Service - Day Care Center Food Service
## 4 Food Service Establishment - Food Service Establishment
## 5 Food Service Establishment - Restaurant
## 6 SED Summer Feeding Prog. - SED Satellite Feeding Site
## local_health_department county facility_address city zip_code
## 1 Tioga County TIOGA 1-3-5 PARK STREET OWEGO 13827
## 2 Canton District Office ST LAWRENCE 154 Elm Street Potsdam 13676
## 3 Geneva District Office WAYNE 11 QUEEN STREET LYONS 14489
## 4 Orange County ORANGE 20 MATTHEWS STREET GOSHEN 10924
## 5 Monroe County MONROE 3425 WINTON PLACE ROCHESTER 14623
## 6 Dutchess County DUTCHESS 20 CARROLL STREET POUGHKEEPSIE 12603
## nysdoh_gazetteer_1980 municipality operation_name
## 1 532400 OWEGO DUNKIN DONUTS Frozen Desserts
## 2 442900 POTSDAM Parkway Express Potsdam
## 3 585400 LYONS LITTLE GIGGLES
## 4 352300 GOSHEN What's the Scoop
## 5 270104 ROCHESTER LITTLE JAVA COFFEE
## 6 130200 POUGHKEEPSIE CHRIST CHURCH SED SITE
## permit_expiration_date nys_health_operation_id inspection_type
## 1 2020-12-31T00:00:00.000 496351 Inspection
## 2 2021-10-31T00:00:00.000 1044784 Inspection
## 3 <NA> 845658 Inspection
## 4 2020-09-30T00:00:00.000 952934 Inspection
## 5 2019-12-31T00:00:00.000 669112 Inspection
## 6 <NA> 714485 Inspection
## food_service_facility_state location1.latitude location1.longitude
## 1 NY 42.101826 -76.262027
## 2 NY 44.669995 -74.9624
## 3 NY 43.06576 -76.992244
## 4 NY 41.39624 -74.334203
## 5 NY 43.09676 -77.578123
## 6 NY 41.698336 -73.92675
## :@computed_region_43an_4dx5 :@computed_region_9yqb_tdyd
## 1 450 630
## 2 1597 2140
## 3 315 631
## 4 1504 2134
## 5 1682 2093
## 6 1021 2040
## :@computed_region_assa_msit :@computed_region_5edz_4hdv permitted_corp_name
## 1 45 343 <NA>
## 2 40 918 <NA>
## 3 54 783 <NA>
## 4 32 47 Goshen Scoop LLC
## 5 56 785 <NA>
## 6 30 287 <NA>
## perm_operator_last_name perm_operator_first_name inspection_comments
## 1 <NA> <NA> <NA>
## 2 <NA> <NA> <NA>
## 3 <NA> <NA> <NA>
## 4 Mulroe James <NA>
## 5 <NA> <NA> No violations.
## 6 <NA> <NA> <NA>
## :@computed_region_8ire_itmf permitted_d_b_a
## 1 <NA> <NA>
## 2 <NA> <NA>
## 3 <NA> <NA>
## 4 <NA> <NA>
## 5 <NA> <NA>
## 6 <NA> <NA>
sanitation_wide <- sanitation %>%
select(nys_health_operation_id, facility, date, city, description, inspection_type, total_noncritical_violations, total_crit_not_corrected, total_critical_violations)
# Let's make our violations numeric
sanitation_wide$total_noncritical_violations <- as.numeric(sanitation_wide$total_noncritical_violations)
sanitation_wide$total_crit_not_corrected <- as.numeric(sanitation_wide$total_crit_not_corrected)
sanitation_wide$total_critical_violations <- as.numeric(sanitation_wide$total_critical_violations)
# Write to CSV
write.csv(sanitation_wide,'sanitation_wide.csv',row.names = TRUE)
# We can pivot into a longer format, breaking violations into type and summing from there
sanitation_long <- sanitation_wide %>%
pivot_longer(c(`total_critical_violations`,`total_crit_not_corrected`,`total_noncritical_violations`), names_to = 'violation_type', values_to = 'num_violations')
# But I'm going to keep the violation types in their own columns, and sum the total here:
sanitation_all <- sanitation_wide %>%
mutate(total_violations = rowSums(cbind(total_critical_violations,total_crit_not_corrected,total_noncritical_violations)))
# I'm also going to filter for those with violations in a separate dataframe:
sanitation_violations <- sanitation_all %>%
filter(total_violations > 0)
My primary analysis question is how often the violations in this data are critical, and how that differs by type of establishment.
critical_violation_rate <- sanitation_violations %>%
mutate(critical_rate = total_critical_violations/total_violations)
Looking at the distribution by type of establishment, we see very different counts and little activity (many at 0) for several types.
ggplot(data = critical_violation_rate, aes(x = critical_rate)) +
geom_histogram() +
facet_wrap(~description)
## `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.
To get a clearer picture of overall critical rates, let’s group by Type and summarize.
We find that the largest rate of critical violations comes from SED Summer Feeding Prog. - SED Self Preparation Feeding Site, at 43% of its violations. That category has only 7 violations from 2 establishments, though, so the rate rests on very little data.
description_summary <-critical_violation_rate %>%
group_by(description) %>%
summarize(n_establishments = n(), critical_rate = sum(total_critical_violations)/sum(total_violations), total_violations = sum(total_violations))
## `summarise()` ungrouping output (override with `.groups` argument)
print(description_summary)
## # A tibble: 22 x 4
## description n_establishments critical_rate total_violations
## <chr> <int> <dbl> <dbl>
## 1 Food Service Establishment -… 8 0.04 25
## 2 Food Service Establishment -… 3 0 11
## 3 Food Service Establishment -… 14 0.109 64
## 4 Food Service Establishment -… 207 0.100 836
## 5 Food Service Establishment -… 1 0 9
## 6 Food Service Establishment -… 5 0.143 7
## 7 Food Service Establishment -… 241 0.103 967
## 8 Food Service Establishment -… 8 0.0455 22
## 9 Food Service Establishment -… 11 0.0625 48
## 10 Institutional Food Service -… 1 0 2
## # … with 12 more rows
summary(description_summary)
## description n_establishments critical_rate total_violations
## Length:22 Min. : 1.0 Min. :0.00000 Min. : 1.00
## Class :character 1st Qu.: 2.0 1st Qu.:0.00000 1st Qu.: 5.25
## Mode :character Median : 4.0 Median :0.07292 Median : 9.50
## Mean : 26.5 Mean :0.09142 Mean : 97.73
## 3rd Qu.: 9.5 3rd Qu.:0.10788 3rd Qu.: 24.25
## Max. :241.0 Max. :0.42857 Max. :967.00
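The `critical_rate` in this summary is a pooled rate: all critical violations for a type divided by all violations for that type. That can differ from the average of the per-row rates computed earlier, because rows with many violations carry more weight. The sketch below uses made-up rows (types "A" and "B" are hypothetical) to show the difference in base R:

```r
# Hypothetical inspection rows, not taken from the sanitation data
inspections <- data.frame(
  description = c("A", "A", "B"),
  total_critical_violations = c(1, 0, 2),
  total_violations = c(1, 9, 4)
)

# Pooled rate per type: sum of critical violations / sum of all violations
pooled <- tapply(inspections$total_critical_violations, inspections$description, sum) /
  tapply(inspections$total_violations, inspections$description, sum)

# Mean of per-row rates: each row counts equally, whatever its size
row_rate <- inspections$total_critical_violations / inspections$total_violations
mean_of_rates <- tapply(row_rate, inspections$description, mean)

print(pooled)
print(mean_of_rates)
```

For type "A" the two measures disagree sharply, since a single row with one critical violation out of one pulls the per-row average up.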
Finally, let’s check whether the number of violations increases with the number of establishments in each type, to see if there are any over-contributors or under-contributors:
ggplot(data=description_summary,aes(total_violations,n_establishments)) + geom_point(aes(color=critical_rate))
There are a couple of large values making it a challenge to see, so we can quickly filter those out and look at the smaller values:
smaller_summary <- description_summary %>%
filter(n_establishments < 50)
ggplot(data=smaller_summary,aes(total_violations,n_establishments)) + geom_point(aes(color=critical_rate))
This plot suggests the relationship roughly holds: description types with more establishments also have more total violations.
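A correlation coefficient can back up the visual impression. As a rough sketch, the code below uses only the ten rows visible in the printed tibble above (the remaining twelve rows are not shown, so this is a partial check). It computes both Pearson and Spearman correlations, the latter because the two large categories dominate a linear measure:

```r
# First ten rows of description_summary, copied from the printed tibble above
n_establishments <- c(8, 3, 14, 207, 1, 5, 241, 8, 11, 1)
total_violations <- c(25, 11, 64, 836, 9, 7, 967, 22, 48, 2)

# Pearson measures linear association and is sensitive to the two large types
pearson <- cor(n_establishments, total_violations)

# Spearman works on ranks, so the large types have less influence
spearman <- cor(n_establishments, total_violations, method = "spearman")

print(c(pearson = pearson, spearman = spearman))
```

On the full data frame, the same check would be `cor(description_summary$n_establishments, description_summary$total_violations)`.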
Through analysis we discovered: