Lab 2 Assignment

Complete ALL of the essentials below correctly to earn an ‘S’ on the lab.
Complete the Depth portion successfully to earn credit toward a depth boost (every 2 lab depth assignments completed earns a 1/3 letter grade boost to your final grade).

Render your document as a .pdf or .html and submit it to the Google folder on Moodle for grading.

Load Packages

library(tidyverse)
── Attaching packages ─────────────────────────────────────── tidyverse 1.3.2 ──
✔ ggplot2 3.4.0      ✔ purrr   1.0.1 
✔ tibble  3.1.8      ✔ dplyr   1.0.10
✔ tidyr   1.3.0      ✔ stringr 1.5.0 
✔ readr   2.1.3      ✔ forcats 0.5.2 
── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
✖ dplyr::filter() masks stats::filter()
✖ dplyr::lag()    masks stats::lag()
library(patchwork)
library(ggsci)

Load data to use!

We can use this fun Ferris wheel data I found online to make some practice graphs!

wheels <- read.csv('https://raw.githubusercontent.com/rfordatascience/tidytuesday/master/data/2022/2022-08-09/wheels.csv')

head(wheels)
  X                name height diameter     opened     closed country
1 1 360 Pensacola Beach 200.00       NA 2012-07-03 2013-01-01     USA
2 2              Amuran 303.00    199.8 2004-01-01       <NA>   Japan
3 3       Asiatique Sky 200.00    200.0 2012-12-15       <NA> Tailand
4 4        Aurora Wheel 295.00    272.0       <NA>       <NA>   Japan
5 5         Baghdad Eye 180.00       NA 2011-01-01       <NA>    Iraq
6 6 Beijing Great Wheel 692.64    642.7       <NA>       <NA>   China
                         location number_of_cabins passengers_per_cabin
1        Pensacola Beach; Florida               42                    6
2               Kagoshima; Kyushu               36                   NA
3        Asiatique the Riverfront               42                   NA
4 Nagashima Spa Land; Mie; Honshu               NA                   NA
5         Al-Zawraa Park; Baghdad               40                    6
6          Chaoyang Park; Beijing               48                   40
  seating_capacity hourly_capacity ride_duration_minutes climate_controlled
1              252            1260                  12.0                Yes
2               NA              NA                  14.5                Yes
3               NA              NA                    NA                Yes
4               NA              NA                    NA               <NA>
5              240             960                  15.0               <NA>
6             1920            5760                  20.0                yes
  construction_cost    status         design_manufacturer          type
1           Unknown     Moved        Realty Masters of FL Transportable
2           Unknown Operating                        <NA>          <NA>
3           Unknown Operating       Dutch Wheels (Vekoma)          <NA>
4           Unknown Operating                        <NA>         Fixed
5    $6 million USD Operating                        <NA>          <NA>
6  $290 million USD   Delayed The Great Wheel Corporation         Fixed
  vip_area ticket_cost_to_ride                  official_website turns
1      Yes                <NA>                              <NA>     4
2     <NA>                <NA>                              <NA>     1
3     <NA>                <NA>      http://www.asiatiquesky.com/    NA
4     <NA>                <NA> http://www.nagashima-onsen.co.jp/    NA
5     <NA>                 3.5                              <NA>    NA
6     <NA>                <NA>                              <NA>     1

Essentials

1.) What’s in a graph?
Write a paragraph explaining some tenets of good vs. bad graphics. Be specific!

A good graph accurately represents the data you want to visualize. The center and spread should be shown in whatever way makes the most sense for the data, and the reader should be able to read the actual values off the plot. A good graph also reveals any patterns that are present in the data. The axes and any other graphical elements should be clearly drawn and labelled, and the graph as a whole should be easy to understand. That means not every graph type suits every dataset: choose the type that shows your data most clearly. A bad graph might be poorly labelled or present the data in a way that is hard to read. A bad graph might also display the magnitude of the data in a dishonest way, for example by exaggerating a difference to make it look more shocking or exciting.
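To make the point about dishonest magnitude concrete, here is a minimal sketch of the same two bars drawn with a truncated y-axis and with an axis that starts at zero. The `toy` values are invented purely for illustration and are not part of the lab data.

```r
library(ggplot2)

# Hypothetical toy values, invented only to illustrate axis truncation
toy <- data.frame(group = c("A", "B"), value = c(98, 100))

# Misleading: zooming the y-axis to 97-100 makes a 2% difference look huge
p_misleading <- ggplot(toy, aes(x = group, y = value)) +
  geom_col() +
  coord_cartesian(ylim = c(97, 100)) +
  labs(title = "Truncated axis", x = "Group", y = "Value")

# More honest: bars start at zero, so bar length matches magnitude
p_honest <- ggplot(toy, aes(x = group, y = value)) +
  geom_col() +
  labs(title = "Axis starts at zero", x = "Group", y = "Value")
```

Printing `p_misleading` and `p_honest` side by side shows how the same numbers can tell two very different stories.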

2.) Make the following plots with the ferris wheel data: histogram, boxplot, bar graph, line graph, scatterplot

wheels2<-wheels %>%
  drop_na(country, height, climate_controlled,status,diameter)
ggplot(data=wheels, aes(number_of_cabins))+
  geom_histogram(binwidth=3)+
  theme_bw()
Warning: Removed 11 rows containing non-finite values (`stat_bin()`).

ggplot(data=wheels2,aes(x=status,y=height))+
  geom_boxplot()+
  theme(axis.text.x=element_text(angle=90, vjust=0.5,size=6))

ggplot(data=wheels2,aes(country))+
  geom_bar()+
  theme_bw()+
  theme(axis.text.x=element_text(angle=90,vjust=0.5,size=10))

ggplot(data=wheels2, aes(x=height,y=diameter))+
  geom_line()+
  theme_bw()

ggplot(data=wheels2,aes(x=climate_controlled,y=ride_duration_minutes))+
  geom_point()+
  theme_bw()
Warning: Removed 2 rows containing missing values (`geom_point()`).

3.) Using your scatterplot from #2, remove the grey background from the plot. Continue using this same plot for 3-6.

ggplot(data=wheels,aes(x=climate_controlled,y=ride_duration_minutes))+
  geom_point()+
  theme_bw()
Warning: Removed 12 rows containing missing values (`geom_point()`).

4.) Change the colors away from default colors. Show me an example of manually changing the colors and an example of you using ggsci to change the colors.

#Conditionally#
ggplot(data=wheels2,aes(x=climate_controlled,y=ride_duration_minutes,color=country))+
  geom_point()+
  theme_bw()
Warning: Removed 2 rows containing missing values (`geom_point()`).

#With ggsci#
ggplot(data=wheels2,aes(x=status,y=height))+
  geom_boxplot(aes(fill=status))+
  scale_fill_aaas()+
  theme(axis.text.x=element_text(angle=90, vjust=0.5,size=6))
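
The first plot above lets ggplot pick the colors once `country` is mapped to `color`. To choose the colors by hand, one option is `scale_color_manual()` with a named vector. This is a minimal sketch on made-up stand-in rows (`toy_wheels` and `status_colors` are hypothetical names); in the lab you would pass `wheels2` instead.

```r
library(ggplot2)

# Hypothetical stand-in rows; in the lab this would be `wheels2`
toy_wheels <- data.frame(
  climate_controlled    = c("Yes", "Yes", "No", "No"),
  ride_duration_minutes = c(12, 20, 15, 30),
  status                = c("Operating", "Moved", "Operating", "Moved")
)

# Named vector: each status level gets an explicitly chosen color
status_colors <- c(Operating = "darkorange", Moved = "steelblue")

p_manual <- ggplot(
  toy_wheels,
  aes(x = climate_controlled, y = ride_duration_minutes, color = status)
) +
  geom_point(size = 3) +
  scale_color_manual(values = status_colors) +
  theme_bw()
```

Every level of the mapped variable needs an entry in the vector; levels without one are drawn in the scale's `na.value` color.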

5.) Change the shape and size of your points!

ggplot(data=wheels2,aes(x=climate_controlled,y=ride_duration_minutes,size=status))+
  geom_point(shape=8)+
  theme_bw()
Warning: Using size for a discrete variable is not advised.
Warning: Removed 2 rows containing missing values (`geom_point()`).
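
The first warning appears because `size` implies an ordering that a category like `status` does not have. A possible alternative, sketched here on made-up stand-in rows (`toy_wheels` is hypothetical), is to map the discrete variable to `shape` and keep `size` for a numeric variable such as `height`.

```r
library(ggplot2)

# Hypothetical stand-in rows; in the lab this would be `wheels2`
toy_wheels <- data.frame(
  climate_controlled    = c("Yes", "Yes", "No", "No"),
  ride_duration_minutes = c(12, 20, 15, 30),
  status                = c("Operating", "Moved", "Operating", "Moved"),
  height                = c(200, 303, 180, 295)
)

# Discrete variable -> shape, numeric variable -> size
p_shape <- ggplot(
  toy_wheels,
  aes(x = climate_controlled, y = ride_duration_minutes,
      shape = status, size = height)
) +
  geom_point() +
  scale_shape_manual(values = c(Operating = 16, Moved = 8)) +
  theme_bw()
```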

6.) Add lines to connect your points to one another.

ggplot(data=wheels2,aes(x=climate_controlled,y=ride_duration_minutes))+
  geom_line()+
  geom_point()
Warning: Removed 2 rows containing missing values (`geom_point()`).

7.) Using your bar graph, change both the color and fill and see how those are different.

ggplot(data=wheels2,aes(country,color=country))+
  geom_bar()+
  theme_bw()+
  theme(axis.text.x=element_text(angle=90,vjust=0.5,size=10))

ggplot(data=wheels2,aes(country,fill=country))+
  geom_bar()+
  theme_bw()+
  theme(axis.text.x=element_text(angle=90,vjust=0.5,size=10))

8.) Use facet_wrap to alter one of your graphs!

ggplot(data=wheels2, aes(x=height,y=diameter))+
  geom_line()+
  facet_wrap(~climate_controlled)+
  theme_bw()
`geom_line()`: Each group consists of only one observation.
ℹ Do you need to adjust the group aesthetic?
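
The `geom_line()` message above shows up when a panel has only a single point to connect. One possible cause here, judging from the `head(wheels)` output, is that `climate_controlled` contains both `Yes` and `yes`, which `facet_wrap()` treats as separate panels. A minimal sketch of normalizing the case before faceting, using made-up stand-in rows (`toy_wheels` is hypothetical) rather than the real data:

```r
library(ggplot2)

# Hypothetical stand-in rows with inconsistent capitalization
toy_wheels <- data.frame(
  height             = c(200, 303, 295, 692),
  diameter           = c(150, 200, 272, 643),
  climate_controlled = c("Yes", "Yes", "Yes", "yes")
)

# Capitalize the first letter and lowercase the rest so "yes" becomes "Yes"
cc <- toy_wheels$climate_controlled
toy_wheels$climate_controlled <- paste0(
  toupper(substr(cc, 1, 1)),
  tolower(substring(cc, 2))
)

# All rows now fall into one panel, so the line has points to connect
p_facet <- ggplot(toy_wheels, aes(x = height, y = diameter)) +
  geom_line() +
  facet_wrap(~climate_controlled) +
  theme_bw()
```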

Depth

1.) Using patchwork, put all of your graphs from Essentials #2 on the same output page!

p1<-ggplot(data=wheels, aes(number_of_cabins))+
  geom_histogram(binwidth=3)+
  theme_bw()
p2<-ggplot(data=wheels2,aes(x=status,y=height))+
  geom_boxplot()+
  theme(axis.text.x=element_text(angle=90, vjust=0.5,size=6))
p3<-ggplot(data=wheels2,aes(country))+
  geom_bar()+
  theme_bw()+
  theme(axis.text.x=element_text(angle=90,vjust=0.5,size=10))
p4<-ggplot(data=wheels2, aes(x=height,y=diameter))+
  geom_line()+
  theme_bw()
p5<- ggplot(data=wheels2,aes(x=climate_controlled,y=ride_duration_minutes))+
  geom_point()+
  theme_bw()
library(patchwork)
(p1+p2+p3)/(p4+p5)
Warning: Removed 11 rows containing non-finite values (`stat_bin()`).
Warning: Removed 2 rows containing missing values (`geom_point()`).

2.) Using the wheels data, group the data, calculate an average, and plot it with error bars! Ask for help if you need it. We may not have learned all of this in class just yet.

heightmean<-wheels2 %>% 
  group_by(country) %>% 
  drop_na(height,country) %>%
  summarize(heightmean = mean(height), sd=sd(height),n=n(),se=sd/sqrt(n)) 
heightmean
# A tibble: 11 × 5
   country      heightmean    sd     n    se
   <chr>             <dbl> <dbl> <int> <dbl>
 1 Canada             175   NA       1  NA  
 2 China              466. 123.      6  50.2
 3 Japan              317.  74.0     6  30.2
 4 Mexico             262   NA       1  NA  
 5 Russia             240   NA       1  NA  
 6 Singapore          541   NA       1  NA  
 7 Tailand            200   NA       1  NA  
 8 Taiwan             316   NA       1  NA  
 9 Turkmenistan       190   NA       1  NA  
10 UK                 304. 142.      3  81.7
11 USA                272  227.      6  92.9
ggplot(data=heightmean, aes(x=country, y=heightmean, color=country))+
  geom_point()+
  theme(axis.text.x=element_text(angle=90,vjust=0.5,size=10))+
  geom_errorbar(data=heightmean, aes(x=country, ymin=heightmean-se, ymax=heightmean+se), width=0.2)
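
In the summary table, countries with a single wheel have `NA` for `sd` and `se`, so they get a point but no error bar. If you would rather plot only the groups where a spread can be estimated, one option is to filter on `n` first. This is a minimal sketch on invented toy values (`toy`, `toy_summary`, and `plot_data` are hypothetical names), not the lab data.

```r
library(dplyr)
library(ggplot2)

# Hypothetical toy values; country "C" has only one observation
toy <- data.frame(
  country = c("A", "A", "A", "B", "B", "C"),
  height  = c(100, 120, 140, 200, 220, 300)
)

# Mean, standard deviation, count, and standard error per country
toy_summary <- toy %>%
  group_by(country) %>%
  summarize(
    mean_height = mean(height),
    sd_height   = sd(height),
    n           = n(),
    se          = sd_height / sqrt(n)
  )

# Keep only groups with more than one observation (se is NA otherwise)
plot_data <- filter(toy_summary, n > 1)

p_error <- ggplot(plot_data, aes(x = country, y = mean_height)) +
  geom_point() +
  geom_errorbar(aes(ymin = mean_height - se, ymax = mean_height + se),
                width = 0.2)
```

Whether to drop single-observation groups is a judgment call; keeping them as bare points is also reasonable as long as the missing error bars are explained.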

3.) Using theme() and labs(), add custom labels to your x- and y-axes. Add a title. Change the size of the text on both axes. This may be beyond our tutorial in class, so ask Justin and/or use Google or the resources linked above. theme() is extremely powerful and will always be useful for us!

ggplot(data=wheels2,aes(country,fill=country))+
  geom_bar()+
  theme_bw()+
  theme(axis.text.x=element_text(angle=90,vjust=0.5,size=10))+
  theme(axis.text=element_text(size=10))+
  labs(x='Country',y='Count',title='Count of Ferris Wheels by Country')


When finished, render as HTML or PDF and confirm that your file looks the way it should. Then submit on Moodle (via the Google form).