Computing summary statistics and presenting them with visualizations
knitr::opts_chunk$set(echo = TRUE)
library(tidyverse)
library(dplyr)
setwd("C:/Users/steph/OneDrive/Documents/R")
getwd()
[1] "C:/Users/steph/OneDrive/Documents/R"
[1] "Museum ID"
[2] "Museum Name"
[3] "Legal Name"
[4] "Alternate Name"
[5] "Museum Type"
[6] "Institution Name"
[7] "Street Address (Administrative Location)"
[8] "City (Administrative Location)"
[9] "State (Administrative Location)"
[10] "Zip Code (Administrative Location)"
[11] "Street Address (Physical Location)"
[12] "City (Physical Location)"
[13] "State (Physical Location)"
[14] "Zip Code (Physical Location)"
[15] "Phone Number"
[16] "Latitude"
[17] "Longitude"
[18] "Locale Code (NCES)"
[19] "County Code (FIPS)"
[20] "State Code (FIPS)"
[21] "Region Code (AAM)"
[22] "Employer ID Number"
[23] "Tax Period"
[24] "Income"
[25] "Revenue"
#Rename columns
colnames(mus) <- c("museum.id", "museum.name", "legal.name", "alt.name", "museum.type", "inst.name", "admin.address", "admin.city", "admin.state", "admin.zip.code", "physical.address", "physical.city", "physical.state", "physical.zip.code", "phone.number", "latitude", "longitude", "locale.code", "county.code", "state.code", "region.code", "employer.id", "tax.period", "income", "revenue")
colnames(mus)
[1] "museum.id" "museum.name" "legal.name"
[4] "alt.name" "museum.type" "inst.name"
[7] "admin.address" "admin.city" "admin.state"
[10] "admin.zip.code" "physical.address" "physical.city"
[13] "physical.state" "physical.zip.code" "phone.number"
[16] "latitude" "longitude" "locale.code"
[19] "county.code" "state.code" "region.code"
[22] "employer.id" "tax.period" "income"
[25] "revenue"
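Assigning names positionally with `colnames<-` only works if the replacement vector is in exactly the same order as the original columns. As a minimal sketch of a name-based alternative, the example below uses `dplyr::rename()` on a hypothetical three-column stand-in (`toy`) for `mus`. The toy values are made up for illustration.

```r
library(dplyr)

# Hypothetical stand-in for three of the original columns of `mus`
toy <- tibble(
  `Museum ID`   = c(1, 2),
  `Museum Name` = c("A", "B"),
  `Museum Type` = c("ART MUSEUM", "HISTORY MUSEUM")
)

# rename() maps new_name = old_name, so column order does not matter
toy <- toy %>%
  rename(
    museum.id   = `Museum ID`,
    museum.name = `Museum Name`,
    museum.type = `Museum Type`
  )

colnames(toy)
```

Because each new name is tied to an old name, a mistyped or misplaced entry produces an error instead of silently mislabeling a column.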
#Create table of types of museums
type <- table(mus$museum.type)
type
ARBORETUM, BOTANICAL GARDEN, OR NATURE CENTER
1484
ART MUSEUM
3241
CHILDREN'S MUSEUM
512
GENERAL MUSEUM
8699
HISTORIC PRESERVATION
14861
HISTORY MUSEUM
2284
NATURAL HISTORY MUSEUM
346
SCIENCE & TECHNOLOGY MUSEUM OR PLANETARIUM
1081
ZOO, AQUARIUM, OR WILDLIFE CONSERVATION
564
#Create a proportional table of Museum Type
prop.table(type)
ARBORETUM, BOTANICAL GARDEN, OR NATURE CENTER
0.04487179
ART MUSEUM
0.09799831
CHILDREN'S MUSEUM
0.01548137
GENERAL MUSEUM
0.26303217
HISTORIC PRESERVATION
0.44935293
HISTORY MUSEUM
0.06906144
NATURAL HISTORY MUSEUM
0.01046202
SCIENCE & TECHNOLOGY MUSEUM OR PLANETARIUM
0.03268626
ZOO, AQUARIUM, OR WILDLIFE CONSERVATION
0.01705370
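The proportions are easier to compare when sorted and shown as percentages. The sketch below re-enters the counts from the frequency table above as a named vector so that it runs on its own. With the real data, `table(mus$museum.type)` could be passed in place of `type_counts`, which is a name introduced here for illustration.

```r
# Counts copied from the frequency table above
type_counts <- c(
  "ARBORETUM, BOTANICAL GARDEN, OR NATURE CENTER" = 1484,
  "ART MUSEUM" = 3241,
  "CHILDREN'S MUSEUM" = 512,
  "GENERAL MUSEUM" = 8699,
  "HISTORIC PRESERVATION" = 14861,
  "HISTORY MUSEUM" = 2284,
  "NATURAL HISTORY MUSEUM" = 346,
  "SCIENCE & TECHNOLOGY MUSEUM OR PLANETARIUM" = 1081,
  "ZOO, AQUARIUM, OR WILDLIFE CONSERVATION" = 564
)

# Sort from most to least common, convert to proportions, then to percentages
pct <- round(100 * prop.table(sort(type_counts, decreasing = TRUE)), 1)
pct
```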
#Create a table to view museum type by state
table(mus$museum.type, mus$admin.state)
AK AL AR AZ
ARBORETUM, BOTANICAL GARDEN, OR NATURE CENTER 10 16 13 21
ART MUSEUM 11 48 34 62
CHILDREN'S MUSEUM 2 4 3 9
GENERAL MUSEUM 43 127 104 114
HISTORIC PRESERVATION 72 192 120 176
HISTORY MUSEUM 14 40 28 26
NATURAL HISTORY MUSEUM 2 7 1 7
SCIENCE & TECHNOLOGY MUSEUM OR PLANETARIUM 4 23 10 21
ZOO, AQUARIUM, OR WILDLIFE CONSERVATION 4 6 4 13
CA CO CT DC
ARBORETUM, BOTANICAL GARDEN, OR NATURE CENTER 136 27 21 15
ART MUSEUM 343 66 40 23
CHILDREN'S MUSEUM 66 12 14 1
GENERAL MUSEUM 787 191 119 95
HISTORIC PRESERVATION 949 259 291 40
HISTORY MUSEUM 197 42 24 10
NATURAL HISTORY MUSEUM 49 12 8 0
SCIENCE & TECHNOLOGY MUSEUM OR PLANETARIUM 96 28 23 5
ZOO, AQUARIUM, OR WILDLIFE CONSERVATION 47 12 4 1
DE FL GA HI
ARBORETUM, BOTANICAL GARDEN, OR NATURE CENTER 13 71 32 21
ART MUSEUM 13 135 71 18
CHILDREN'S MUSEUM 1 22 8 3
GENERAL MUSEUM 27 312 185 55
HISTORIC PRESERVATION 64 421 277 48
HISTORY MUSEUM 4 73 44 10
NATURAL HISTORY MUSEUM 3 16 9 3
SCIENCE & TECHNOLOGY MUSEUM OR PLANETARIUM 3 57 33 8
ZOO, AQUARIUM, OR WILDLIFE CONSERVATION 4 42 9 10
IA ID IL IN
ARBORETUM, BOTANICAL GARDEN, OR NATURE CENTER 22 9 64 25
ART MUSEUM 67 22 121 54
CHILDREN'S MUSEUM 5 2 26 11
GENERAL MUSEUM 149 46 338 188
HISTORIC PRESERVATION 354 106 622 321
HISTORY MUSEUM 43 11 83 53
NATURAL HISTORY MUSEUM 5 3 10 7
SCIENCE & TECHNOLOGY MUSEUM OR PLANETARIUM 11 8 29 17
ZOO, AQUARIUM, OR WILDLIFE CONSERVATION 5 7 17 9
KS KY LA MA
ARBORETUM, BOTANICAL GARDEN, OR NATURE CENTER 15 29 21 34
ART MUSEUM 31 41 40 123
CHILDREN'S MUSEUM 6 5 11 17
GENERAL MUSEUM 137 123 130 229
HISTORIC PRESERVATION 275 203 128 522
HISTORY MUSEUM 34 34 34 55
NATURAL HISTORY MUSEUM 10 6 4 11
SCIENCE & TECHNOLOGY MUSEUM OR PLANETARIUM 16 20 16 26
ZOO, AQUARIUM, OR WILDLIFE CONSERVATION 13 5 11 11
MD ME MI MN
ARBORETUM, BOTANICAL GARDEN, OR NATURE CENTER 28 16 63 28
ART MUSEUM 66 28 80 74
CHILDREN'S MUSEUM 6 5 18 8
GENERAL MUSEUM 176 105 215 136
HISTORIC PRESERVATION 272 323 541 365
HISTORY MUSEUM 44 22 63 42
NATURAL HISTORY MUSEUM 1 4 10 7
SCIENCE & TECHNOLOGY MUSEUM OR PLANETARIUM 19 13 30 11
ZOO, AQUARIUM, OR WILDLIFE CONSERVATION 6 5 19 10
MO MS MT NC
ARBORETUM, BOTANICAL GARDEN, OR NATURE CENTER 16 5 7 39
ART MUSEUM 69 29 29 81
CHILDREN'S MUSEUM 5 6 7 18
GENERAL MUSEUM 206 72 70 192
HISTORIC PRESERVATION 370 104 105 330
HISTORY MUSEUM 43 19 29 65
NATURAL HISTORY MUSEUM 8 1 6 6
SCIENCE & TECHNOLOGY MUSEUM OR PLANETARIUM 11 5 11 39
ZOO, AQUARIUM, OR WILDLIFE CONSERVATION 14 6 6 9
ND NE NH NJ
ARBORETUM, BOTANICAL GARDEN, OR NATURE CENTER 3 18 8 37
ART MUSEUM 20 26 19 65
CHILDREN'S MUSEUM 3 6 6 9
GENERAL MUSEUM 67 89 60 149
HISTORIC PRESERVATION 157 163 238 424
HISTORY MUSEUM 8 21 18 42
NATURAL HISTORY MUSEUM 1 3 0 6
SCIENCE & TECHNOLOGY MUSEUM OR PLANETARIUM 4 11 11 17
ZOO, AQUARIUM, OR WILDLIFE CONSERVATION 6 10 3 7
NM NV NY OH
ARBORETUM, BOTANICAL GARDEN, OR NATURE CENTER 7 9 96 62
ART MUSEUM 37 17 248 118
CHILDREN'S MUSEUM 4 2 33 12
GENERAL MUSEUM 116 60 547 292
HISTORIC PRESERVATION 95 57 1058 727
HISTORY MUSEUM 25 14 140 79
NATURAL HISTORY MUSEUM 10 1 15 6
SCIENCE & TECHNOLOGY MUSEUM OR PLANETARIUM 19 8 64 48
ZOO, AQUARIUM, OR WILDLIFE CONSERVATION 7 6 38 19
OK OR PA RI
ARBORETUM, BOTANICAL GARDEN, OR NATURE CENTER 10 27 77 7
ART MUSEUM 36 53 159 21
CHILDREN'S MUSEUM 11 8 10 1
GENERAL MUSEUM 206 161 358 35
HISTORIC PRESERVATION 214 238 861 91
HISTORY MUSEUM 38 35 113 14
NATURAL HISTORY MUSEUM 7 8 9 3
SCIENCE & TECHNOLOGY MUSEUM OR PLANETARIUM 16 14 45 4
ZOO, AQUARIUM, OR WILDLIFE CONSERVATION 9 6 21 5
SC SD TN TX
ARBORETUM, BOTANICAL GARDEN, OR NATURE CENTER 17 4 29 82
ART MUSEUM 37 25 51 168
CHILDREN'S MUSEUM 9 1 11 26
GENERAL MUSEUM 113 72 172 646
HISTORIC PRESERVATION 154 96 217 649
HISTORY MUSEUM 30 18 47 198
NATURAL HISTORY MUSEUM 3 3 4 17
SCIENCE & TECHNOLOGY MUSEUM OR PLANETARIUM 16 7 20 61
ZOO, AQUARIUM, OR WILDLIFE CONSERVATION 12 8 8 39
UT VA VT WA
ARBORETUM, BOTANICAL GARDEN, OR NATURE CENTER 18 45 10 40
ART MUSEUM 20 88 17 81
CHILDREN'S MUSEUM 4 14 2 17
GENERAL MUSEUM 55 258 51 186
HISTORIC PRESERVATION 35 410 182 298
HISTORY MUSEUM 11 94 15 57
NATURAL HISTORY MUSEUM 10 5 4 7
SCIENCE & TECHNOLOGY MUSEUM OR PLANETARIUM 11 34 10 24
ZOO, AQUARIUM, OR WILDLIFE CONSERVATION 6 10 1 11
WI WV WY
ARBORETUM, BOTANICAL GARDEN, OR NATURE CENTER 44 5 12
ART MUSEUM 76 24 16
CHILDREN'S MUSEUM 14 3 5
GENERAL MUSEUM 188 69 78
HISTORIC PRESERVATION 476 121 50
HISTORY MUSEUM 41 21 19
NATURAL HISTORY MUSEUM 4 1 3
SCIENCE & TECHNOLOGY MUSEUM OR PLANETARIUM 24 13 7
ZOO, AQUARIUM, OR WILDLIFE CONSERVATION 20 1 2
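Raw counts in a two-way table are dominated by the larger states, so within-state shares can be easier to compare. The following is a small sketch using `prop.table()` with `margin = 2` on a hypothetical toy data frame. The rows are invented for illustration and are not taken from `mus`.

```r
# Hypothetical toy data standing in for mus$museum.type and mus$admin.state
toy <- data.frame(
  museum.type = c("ART MUSEUM", "ART MUSEUM", "GENERAL MUSEUM",
                  "GENERAL MUSEUM", "GENERAL MUSEUM", "HISTORY MUSEUM"),
  admin.state = c("AK", "AL", "AK", "AL", "AL", "AK")
)

tab <- table(toy$museum.type, toy$admin.state)

# margin = 2 divides each cell by its column total,
# giving the share of each museum type within a state
col_share <- prop.table(tab, margin = 2)
round(col_share, 2)
```

Each column of `col_share` sums to 1, so states of very different sizes can be compared on the same scale.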
#Find median revenue of each state
mus %>%
  group_by(admin.state) %>%
  summarize(revenue = median(revenue, na.rm = TRUE))
# A tibble: 51 × 2
admin.state revenue
<chr> <dbl>
1 AK 32716
2 AL 0
3 AR 0
4 AZ 300
5 CA 29661
6 CO 5200
7 CT 31295
8 DC 677091
9 DE 24696
10 FL 18774.
# … with 41 more rows
#Find mean revenue by museum type
mus %>%
  group_by(museum.type) %>%
  summarize(revenue = mean(revenue, na.rm = TRUE))
# A tibble: 9 × 2
museum.type revenue
<chr> <dbl>
1 ARBORETUM, BOTANICAL GARDEN, OR NATURE CENTER 77879222.
2 ART MUSEUM 100787437.
3 CHILDREN'S MUSEUM 1355364.
4 GENERAL MUSEUM 27402842.
5 HISTORIC PRESERVATION 2035602.
6 HISTORY MUSEUM 9402442.
7 NATURAL HISTORY MUSEUM 85498324.
8 SCIENCE & TECHNOLOGY MUSEUM OR PLANETARIUM 114309309.
9 ZOO, AQUARIUM, OR WILDLIFE CONSERVATION 5483602.
#Find most common museum type by state
mus %>%
group_by(admin.state) %>%
count(museum.type) %>%
slice(which.max(n)) %>%
select(-n)
# A tibble: 51 × 2
# Groups: admin.state [51]
admin.state museum.type
<chr> <chr>
1 AK HISTORIC PRESERVATION
2 AL HISTORIC PRESERVATION
3 AR HISTORIC PRESERVATION
4 AZ HISTORIC PRESERVATION
5 CA HISTORIC PRESERVATION
6 CO HISTORIC PRESERVATION
7 CT HISTORIC PRESERVATION
8 DC GENERAL MUSEUM
9 DE HISTORIC PRESERVATION
10 FL HISTORIC PRESERVATION
# … with 41 more rows
#Find mean revenue by state
mus %>%
  group_by(admin.state) %>%
  summarize(revenue = mean(revenue, na.rm = TRUE))
# A tibble: 51 × 2
admin.state revenue
<chr> <dbl>
1 AK 1004114.
2 AL 2023465.
3 AR 3234783.
4 AZ 23835153.
5 CA 27900663.
6 CO 9701295.
7 CT 72082992.
8 DC 222933794.
9 DE 46325609.
10 FL 12537256.
# … with 41 more rows
#Find Standard Deviation of Revenue
sd(mus$revenue, na.rm=TRUE)
[1] 248519659
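Several states show a median revenue of 0 alongside a mean in the millions, which suggests a strongly right-skewed distribution. Viewing the median, mean, and standard deviation side by side makes that gap easier to see. The sketch below does this in a single `summarize()` call on a hypothetical toy tibble. The values are made up, and the output column names are chosen here for illustration.

```r
library(dplyr)

# Hypothetical toy data standing in for mus (values are invented)
toy <- tibble(
  admin.state = c("AK", "AK", "AK", "AL", "AL", "AL"),
  revenue     = c(0, 100, 2000, 0, 0, NA)
)

rev_summary <- toy %>%
  group_by(admin.state) %>%
  summarize(
    median.revenue = median(revenue, na.rm = TRUE),
    mean.revenue   = mean(revenue, na.rm = TRUE),
    sd.revenue     = sd(revenue, na.rm = TRUE),
    n.missing      = sum(is.na(revenue))  # how many values na.rm dropped
  )

rev_summary
```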
#What are the most common types of museums in the United States?
mt <- mus %>%
  mutate(mus.type = case_when(
    museum.type == "ARBORETUM, BOTANICAL GARDEN, OR NATURE CENTER" ~ "Garden",
    museum.type == "ART MUSEUM" ~ "Art",
    museum.type == "CHILDREN'S MUSEUM" ~ "Children",
    museum.type == "GENERAL MUSEUM" ~ "General",
    museum.type == "HISTORIC PRESERVATION" ~ "Preservation",
    museum.type == "HISTORY MUSEUM" ~ "History",
    museum.type == "NATURAL HISTORY MUSEUM" ~ "Natural History",
    museum.type == "SCIENCE & TECHNOLOGY MUSEUM OR PLANETARIUM" ~ "Science",
    # No trailing space in the string, otherwise zoos never match and become NA
    museum.type == "ZOO, AQUARIUM, OR WILDLIFE CONSERVATION" ~ "Zoo"
  ))
mt %>%
  drop_na(mus.type) %>%
  ggplot(mapping = aes(x = mus.type)) +
  geom_bar() +
  coord_flip()
The bar graph, which counts the recorded museums of each type, indicates that historic preservation is the most common museum type in the US, accounting for 14,861 records (about 45% of the total).
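A bar chart of counts is usually easier to read when the bars are ordered by frequency. One possible sketch is to count the types first and then order the axis with `reorder()`. The toy tibble below is hypothetical and only reuses the short `mus.type` labels for illustration.

```r
library(dplyr)
library(ggplot2)

# Hypothetical toy data using the short mus.type labels
toy <- tibble(
  mus.type = c("Preservation", "Preservation", "Preservation",
               "General", "General", "Art")
)

# One row per type with its count in column n
counts <- toy %>% count(mus.type)

# reorder() sorts the axis by n, so the most common type ends up on top after the flip
p <- ggplot(counts, aes(x = reorder(mus.type, n), y = n)) +
  geom_col() +
  coord_flip() +
  labs(x = "Museum type", y = "Number of museums")

p
```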
#How does revenue relate to museum type and state?
ggplot(mt, aes(x= museum.type, y= revenue, fill = admin.state))+
geom_bar(stat="identity")
The bivariate bar chart shows that art museums have the highest total reported revenue. The fill color shows how the revenue of each museum type is distributed across states. No clear state-level conclusions can be drawn from the current visual.
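With `stat = "identity"`, the revenue values of all records in a category are stacked, so each bar's height is the total revenue for that museum type. A sketch of computing those totals explicitly and plotting them without the state fill is shown below. The toy tibble and the `total.revenue` column name are assumptions made for illustration.

```r
library(dplyr)
library(ggplot2)

# Hypothetical toy data standing in for mt (values are invented)
toy <- tibble(
  museum.type = c("ART MUSEUM", "ART MUSEUM", "HISTORY MUSEUM", "HISTORY MUSEUM"),
  admin.state = c("AK", "AL", "AK", "AL"),
  revenue     = c(100, 300, 50, NA)
)

# Total revenue per type; missing revenues are dropped from the sum
totals <- toy %>%
  group_by(museum.type) %>%
  summarize(total.revenue = sum(revenue, na.rm = TRUE))

# geom_col() draws bars whose heights are the values in the data
p <- ggplot(totals, aes(x = museum.type, y = total.revenue)) +
  geom_col() +
  coord_flip()

p
```

Having the totals in a table also makes it possible to check the chart's ranking against the numbers directly.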
The limitations include the use of the administrative location for the state variable, which was chosen because the physical-location state variable has many missing values. This may skew the location data, since some administrative offices are not in the same state as the museums they run. Another limitation is that some museums, across various types, are free to the public and therefore do not generate revenue. Attendance therefore should not be inferred from the total revenue of museums in this data.
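One way to gauge how much the revenue limitation matters is to measure the share of zero and missing revenue values within each museum type. The following is a minimal sketch on a hypothetical toy tibble. The values and the `share.zero` and `share.missing` column names are introduced here for illustration.

```r
library(dplyr)

# Hypothetical toy data standing in for mus (values are invented)
toy <- tibble(
  museum.type = c("ART MUSEUM", "ART MUSEUM", "HISTORY MUSEUM", "HISTORY MUSEUM"),
  revenue     = c(0, 5000, NA, 0)
)

rev_check <- toy %>%
  group_by(museum.type) %>%
  summarize(
    n             = n(),
    # Share of reported (non-missing) revenues that are exactly zero
    share.zero    = mean(revenue == 0, na.rm = TRUE),
    # Share of records with no revenue value at all
    share.missing = mean(is.na(revenue))
  )

rev_check
```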
Different types of visualizations are still being explored with this data. Thus far, bar charts have presented the discrete versus continuous data in the most visually appealing manner. I will continue to experiment with different charts and variables.