ggplot2 basics
During ANLY 512 we will be studying the theory and practice of
data visualization. We will be using R and the
packages within R to assemble data and construct many
different types of visualizations. We begin by studying some of the
theoretical aspects of visualization. To do that we must appreciate the
basic steps in the process of making a visualization.
The objective of this assignment is to complete and explain basic plots before moving on to more complicated ways to graph data.
A couple of tips: remember that your graphics may require preprocessing, so you may have to compute summaries or other calculations to prepare the data. Include those steps in your work.
To ensure accuracy, pay close attention to axes and labels. You will be evaluated on the accuracy and expository nature of your graphics. Make sure your axis labels are easy to understand and are composed of full words, with units if necessary.
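As an illustration of the two tips above, here is a minimal sketch of a summary step followed by a labeled plot. It assumes the built-in mtcars data as a stand-in, since it is not one of the assignment data sets, and the names mpg_by_cyl and summary_plot are made up for this example.

```r
library(ggplot2)

# Assumption: mtcars (built into R) stands in for the assignment data.
# Preprocessing step: average fuel economy for each cylinder count.
mpg_by_cyl <- aggregate(mpg ~ cyl, data = mtcars, FUN = mean)

# Plot the summary with full-word axis labels that include units.
summary_plot <- ggplot(mpg_by_cyl, aes(x = factor(cyl), y = mpg)) +
  geom_col() +
  labs(title = "Average Fuel Economy by Number of Cylinders",
       x = "Number of Cylinders",
       y = "Average Fuel Economy (miles per gallon)")
summary_plot
```

Keeping the summary in its own named object makes the preprocessing visible in the knitted document alongside the plot code.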
Each question is worth 5 points.
To submit this homework, create the document in RStudio using the knitr package (button included in RStudio), then publish it to your RPubs account. Once it is uploaded, submit the link to that document on Canvas. Please make sure that the link is hyperlinked and that I can see both the visualization and the code required to create it.
Using data from the nasaweather package, create a
scatterplot between wind and pressure, with color being used to
distinguish the type of storm.
library(ggplot2)
# storms also exists in dplyr, so name the package explicitly
nasa <- nasaweather::storms
plot1 <- ggplot(nasa, aes(x = wind, y = pressure, color = type)) +
  geom_point() +
  labs(title = 'Wind Vs. Pressure')
plot1
Use the MLB_teams data in the mdsr package
to create an informative data graphic that illustrates the relationship
between winning percentage and payroll in context.
library(mdsr)
mlb <- MLB_teams
# scale_fill_brewer() was dropped: no fill aesthetic is mapped, so it had no effect
plot2 <- ggplot(mlb, aes(x = WPct, y = payroll)) +
  theme_bw() +
  geom_point() +
  geom_smooth(method = 'lm') +
  labs(title = 'Relationship between Winning Percentage and Payroll',
       x = 'Winning Percentage',
       y = 'Payroll')
plot2
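The RailTrail data set from the mosaicData
package describes the usage of a rail trail in Western Massachusetts.
Use these data to answer the following questions.
a. Create a scatterplot of the number of crossings per day, volume, against the high temperature that day.
b. Separate your plot into facets by weekday (an indicator
of weekend/holiday vs. weekday).
c. Add regression lines to the two facets.
library(mosaicData)
rail <- RailTrail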
# Volume is plotted against temperature, so volume goes on the y-axis
plot3 <- ggplot(rail, aes(x = hightemp, y = volume)) +
  geom_point() +
  labs(title = 'Volume Vs. High Temp',
       x = 'High Temp (F)',
       y = 'Number Of Crossings per Day')
plot4 <- ggplot(rail, aes(x = hightemp, y = volume)) +
  geom_point() +
  geom_smooth(method = 'lm') +
  facet_wrap(~dayType, ncol = 2) +
  labs(title = 'Volume Vs. High Temp by weekday/weekend',
       x = "High Temp (F)",
       y = "Number Of Crossings per Day")
#a.
plot3
#b & c.
plot4
Using data from the nasaweather package, use the
geom_path function to plot the path of each tropical storm
in the storms data table. Use color to distinguish the
storms from one another, and use faceting to plot each year in its own
panel.
plot5 <- ggplot(nasa, aes(x = long, y = lat)) +
  geom_path(aes(color = name)) +
  facet_wrap(~year) +
  labs(title = "Path of Tropical Storms",
       color = "Storm Names",
       x = "Longitude",
       y = "Latitude")
plot5
Use the penguins data set from the
palmerpenguins package.
library(palmerpenguins)
penguins
## # A tibble: 344 x 8
## species island bill_length_mm bill_depth_mm flipper_length_mm body_mass_g
## <fct> <fct> <dbl> <dbl> <int> <int>
## 1 Adelie Torgersen 39.1 18.7 181 3750
## 2 Adelie Torgersen 39.5 17.4 186 3800
## 3 Adelie Torgersen 40.3 18 195 3250
## 4 Adelie Torgersen NA NA NA NA
## 5 Adelie Torgersen 36.7 19.3 193 3450
## 6 Adelie Torgersen 39.3 20.6 190 3650
## 7 Adelie Torgersen 38.9 17.8 181 3625
## 8 Adelie Torgersen 39.2 19.6 195 4675
## 9 Adelie Torgersen 34.1 18.1 193 3475
## 10 Adelie Torgersen 42 20.2 190 4250
## # ... with 334 more rows, and 2 more variables: sex <fct>, year <int>
#a. Bill depth and bill length are positively correlated within each of the three species.
plot6 <- ggplot(penguins, aes(x = bill_length_mm, y = bill_depth_mm, color = species)) +
  geom_point() +
  geom_smooth(method = 'lm') +
  labs(title = "Scatterplot of Bill Length and Bill Depth by Species",
       x = "Bill Length (mm)",
       y = "Bill Depth (mm)")
plot6
#b.
plot7 <- ggplot(penguins, aes(x = bill_length_mm, y = bill_depth_mm)) +
  geom_point() +
  facet_wrap(~species) +
  geom_smooth(method = 'lm') +
  labs(title = "Scatterplot of Bill Length and Bill Depth faceted by Species",
       x = "Bill Length (mm)",
       y = "Bill Depth (mm)")
plot7
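The association noted in part a can also be checked numerically. The sketch below is one possible way to do so and is not part of the original answer. The names complete, overall_cor, and species_cor are made up for this example. It computes the correlation for all penguins pooled together and then within each species, so the pooled trend can be compared with the per-species regression lines.

```r
library(palmerpenguins)

# Name the package explicitly in case another penguins object is on the search path.
pen <- palmerpenguins::penguins

# Keep only rows where both bill measurements are present.
complete <- pen[!is.na(pen$bill_length_mm) & !is.na(pen$bill_depth_mm), ]

# Correlation across all penguins, ignoring species.
overall_cor <- cor(complete$bill_length_mm, complete$bill_depth_mm)

# Correlation within each species, matching the per-species regression lines.
species_cor <- sapply(split(complete, complete$species), function(d) {
  cor(d$bill_length_mm, d$bill_depth_mm)
})

overall_cor
species_cor
```

If the pooled value and the per-species values differ in sign, that is a reason to prefer the colored or faceted plots over a single regression line fitted to all penguins.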