Introduced by Ronald Fisher in 1936
The data set consists of 50 samples from each of three species of Iris (Iris setosa, Iris virginica, and Iris versicolor).
Four features were measured from each sample: the length and the width of the sepals and petals, in centimetres. Based on the combination of these four features, Fisher developed a model to distinguish the species from each other.
Source: Wikipedia
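The description above can be checked directly against the `iris` data frame that ships with base R. The following is a minimal sketch, not Fisher's model: it only confirms the 50 samples per species and summarises the four measurements by species. The object names `species_counts` and `feature_means` are illustrative and do not appear in the original slides.

```r
# iris is included in the base R 'datasets' package, so no library() call is needed.

# Number of samples recorded for each species
species_counts <- table(iris$Species)
print(species_counts)

# Mean of the four measured features (sepal/petal length and width, in cm)
# computed separately for each species
feature_means <- aggregate(. ~ Species, data = iris, FUN = mean)
print(feature_means)
```

Comparing the per-species means gives a rough sense of which of the four features separate the species most clearly before any plotting is done.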
with(iris, plot(x = Petal.Width, y = Sepal.Length))
qplot(Petal.Width, Sepal.Length, data = iris)
plotly
ggiris <- qplot(Petal.Width, Sepal.Length, data = iris)
ggplotly(ggiris)
plotly
ggiris_colored <- qplot(Petal.Width, Sepal.Length, data = iris,
                        color = Species)
ggplotly(ggiris_colored)
iris %>% plot_ly(x = Petal.Width, y = Sepal.Length,
                 type = "scatter", color = Species, mode = "markers")
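The `plot_ly()` call above uses bare column names, which matches the plotly 2.3.0 release listed in the session information. As a hedged sketch, assuming plotly 4.0 or later is installed instead, columns are referenced with one-sided formulas (`~`). The name `iris_plotly` is illustrative only.

```r
library(plotly)

# Same scatter plot as above, written for plotly >= 4.0 (an assumption;
# the original slides were built with plotly 2.3.0).
# Columns of the data frame are referenced with ~ formulas.
iris_plotly <- plot_ly(
  data  = iris,
  x     = ~Petal.Width,
  y     = ~Sepal.Length,
  color = ~Species,
  type  = "scatter",
  mode  = "markers"
)

iris_plotly
```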
Based on analysis done by Rich Majerus in 2014 using the googleVis package
Data does not include 143 interdisciplinary majors and 9 undecided majors.
Majors like Bio/Chem are split between the two departments
General Lit/Lit majors are included with English
Dance majors and faculty are included with Theatre
major_data %>% ggplot(aes(x = Majors, y = FTE)) +
  geom_point() +
  ggtitle("Reed College Majors and FTE by Department")
Left-click and drag to select an area of the chart to zoom on. Right-click to zoom back out.
The pnwflights14 package contains information about all flights that departed from SEA in Seattle and PDX in Portland in 2014: 162,049 flights in total.
We can use this data and the dplyr package to look at daily maximum departure delays throughout the year for Alaska Airlines.
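The slides do not show how the `alaskan` data frame was built, so the following is only a sketch of one way to compute a daily maximum departure delay with dplyr. It assumes the flights table has `year`, `month`, `day`, `carrier`, and `dep_delay` columns and that Alaska Airlines uses the carrier code "AS". The small `flights_demo` data frame is a made-up stand-in so the example runs without pnwflights14 installed, and `alaskan_demo` is an illustrative name.

```r
library(dplyr)

# Hypothetical stand-in for the pnwflights14 flights table (made-up values)
flights_demo <- data.frame(
  year      = 2014L,
  month     = c(1L, 1L, 1L, 1L, 1L, 2L),
  day       = c(1L, 1L, 2L, 2L, 2L, 1L),
  carrier   = c("AS", "AS", "AS", "AS", "UA", "AS"),
  dep_delay = c(5, 42, NA, 3, 90, 17),
  stringsAsFactors = FALSE
)

alaskan_demo <- flights_demo %>%
  # keep Alaska Airlines flights with a recorded departure delay
  filter(carrier == "AS", !is.na(dep_delay)) %>%
  # build a Date column from the year/month/day parts
  mutate(date2014 = as.Date(paste(year, month, day, sep = "-"))) %>%
  # one row per day, holding that day's largest departure delay
  group_by(date2014) %>%
  summarise(max_dep_delay = max(dep_delay))

alaskan_demo
```

The result has one row per date with `date2014` and `max_dep_delay` columns, which is the shape the plotting code expects.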
alaskan %>% ggplot(aes(x = date2014, y = max_dep_delay)) +
  geom_line() +
  scale_x_date(date_breaks = "1 month", date_labels = "%b %y") +
  xlab("Date") +
  ylab("Maximum Departure Delay")
ggplotly()
dygraph (converting to time series format using xts)
alaskan_ts <- xts(alaskan$max_dep_delay, alaskan$date2014)
colnames(alaskan_ts) <- "Max Departure Delay"
dygraph(alaskan_ts) %>% dyRangeSelector()
Canada has an extremely large land mass (it is the 2nd largest country in the world by area), but it ranks only 37th in population
The US ranks 4th in land mass and 3rd in population
We can use data in the maps package to better visualize why these rankings exist
data(canada.cities, package = "maps")
canada_plot <- ggplot(canada.cities, aes(x = long, y = lat)) +
  coord_equal() +
  # text is used by ggplotly() for the hover label: "<city>, Pop: <population>"
  geom_point(aes(size = pop, text = paste0(name, ", ",
                 "Pop: ", prettyNum(pop, big.mark = ",", scientific = FALSE))),
             colour = "red", alpha = 1/2) +
  borders(regions = "canada")
canada_plot
ggplotly(canada_plot)
data(us.cities, package = "maps")
us_plot <- ggplot(us.cities, aes(x = long, y = lat)) +
  coord_equal() +
  # text is used by ggplotly() for the hover label: "<city>, Pop: <population>"
  geom_point(aes(size = pop, text = paste0(name, ", ",
                 "Pop: ", prettyNum(pop, big.mark = ",", scientific = FALSE))),
             colour = "red", alpha = 1/2) +
  borders(regions = "usa", xlim = c(-200, -60), ylim = c(20, 80))
us_plot
ggplotly(us_plot)
plot_ly(z = volcano, type = "surface")
datatable(iris, options = list(pageLength = 5))
Plotting maps in R with ggplot2
sessionInfo()
## R version 3.2.3 (2015-12-10)
## Platform: x86_64-apple-darwin13.4.0 (64-bit)
## Running under: OS X 10.11.2 (El Capitan)
##
## locale:
## [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
##
## attached base packages:
## [1] stats graphics grDevices utils datasets methods base
##
## other attached packages:
## [1] DT_0.1 googleVis_0.5.10 maps_3.0.2 xts_0.9-7
## [5] zoo_1.7-12 readr_0.2.2 knitr_1.12 dplyr_0.4.3
## [9] dygraphs_0.6 pnwflights14_0.1.0.9000 plotly_2.3.0 ggplot2_2.0.0
## [13] revealjs_0.5
##
## loaded via a namespace (and not attached):
## [1] Rcpp_0.12.3 RColorBrewer_1.1-2 formatR_1.2.1 plyr_1.8.3 base64enc_0.1-3
## [6] viridis_0.3.2 tools_3.2.3 digest_0.6.9 jsonlite_0.9.19 evaluate_0.8
## [11] gtable_0.1.2 lattice_0.20-33 DBI_0.3.1 yaml_2.1.13 parallel_3.2.3
## [16] gridExtra_2.0.0 httr_1.0.0 stringr_1.0.0 htmlwidgets_0.5 grid_3.2.3
## [21] R6_2.1.1 rmarkdown_0.9.5 RJSONIO_1.3-0 magrittr_1.5 scales_0.3.0
## [26] htmltools_0.3 assertthat_0.1 colorspace_1.2-6 labeling_0.3 stringi_1.0-1
## [31] lazyeval_0.1.10 munsell_0.4.2