library(treemap)
Registered S3 method overwritten by 'data.table':
method from
print.data.table
library(tidyverse)
Registered S3 methods overwritten by 'dbplyr':
method from
print.tbl_lazy
print.tbl_sql
── Attaching packages ──────────────────────────────────────────────────────────────────────────────── tidyverse 1.3.0 ──
✓ ggplot2 3.3.3 ✓ purrr 0.3.4
✓ tibble 3.1.0 ✓ dplyr 1.0.4
✓ tidyr 1.1.2 ✓ stringr 1.4.0
✓ readr 1.4.0 ✓ forcats 0.5.0
── Conflicts ─────────────────────────────────────────────────────────────────────────────────── tidyverse_conflicts() ──
x tidyr::expand() masks reshape::expand()
x dplyr::filter() masks stats::filter()
x dplyr::lag() masks stats::lag()
x dplyr::rename() masks reshape::rename()
library(RColorBrewer)
data <- read.csv("http://datasets.flowingdata.com/post-data.txt")
head(data)
treemap(data, index="category", vSize="views",
vColor="comments", type="value",
palette="RdYlBu")
treemap(data, index="category", vSize="views",
vColor="comments", type="manual",
palette="RdYlBu")
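nba <- read.csv("http://datasets.flowingdata.com/ppg2008.csv") # assumed source: the FlowingData NBA dataset (nba is not loaded elsewhere in this section)
nba <- nba[order(nba$PTS),]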
row.names(nba) <- nba$Name
nba <- nba[,2:19]
nba_matrix <- data.matrix(nba)
nba_heatmap <- heatmap(nba_matrix, Rowv=NA, Colv=NA,
col = cm.colors(256), scale="column", margins=c(5,10),
xlab = "NBA Player Stats",
ylab = "NBA Players",
main = "NBA Player Stats in 2008")
library(viridis)
Loading required package: viridisLite
nba_heatmap <- heatmap(nba_matrix, Rowv=NA, col = viridis(25, direction = -1),
scale="column", margins=c(5,10),
xlab = "NBA Player Stats",
ylab = "NBA Players",
main = "NBA Player Stats in 2008")
library(nycflights13)
library(RColorBrewer)
#view(flights)
flights_nona <- na.omit (flights) # remove observations with NA values
delays <- flights_nona %>% # create a delays dataframe by:
group_by (dest) %>% # grouping by point of destination
summarize (count = n(), # create these variables: number of flights to each destination,
dist = mean (distance), # mean distance flown to each destination,
delay = mean (arr_delay), # mean delay of arrival to each destination,
delaycost= mean(count*delay/distance)) # delay cost index defined as:
# [(number of flights)*delay/distance] for a destination
top100 <- delays %>% # keep the 100 destinations with the highest delay cost index
arrange(desc(delaycost)) %>%
head(100)
top100 <- as.data.frame(top100) # convert to a plain data frame, since setting row names on a tibble is deprecated
row.names(top100) <- top100$dest
delays_mat <- data.matrix(top100) # convert delays dataframe to a matrix (required by heatmap)
delays_matrix <- delays_mat[,2:5] # remove redundant column of destination airport codes
varcols <- setNames(colorRampPalette(brewer.pal(9, "YlGnBu"))(nrow(delays_matrix)), rownames(delays_matrix)) # one colour per row for RowSideColors; YlGnBu has at most 9 base colours, so interpolate from those
heatmap(delays_matrix, Rowv = NA,
Colv = NA,
col = colorRampPalette(brewer.pal(9, "YlGnBu"))(nrow(delays_matrix)),
scale="column",
margins=c(7,10),
main = "Cost of Late Arrivals",
xlab =" Flight Characteristics",
ylab="Arrival Airport",
labCol = c("Flights","Distance","Delay","Cost Index"),
cexCol=1, cexRow =1, RowSideColors = varcols)
layout: widths = 0.05 0.2 4 , heights = 0.25 4 ; lmat=
[,1] [,2] [,3]
[1,] 0 0 4
[2,] 3 1 2
The cost index measures how much frequent arrival delays add to the cost of flying to an airport. It is proportional to the number of flights and to the mean arrival delay, and inversely proportional to distance, on the assumption that delays affect shorter flights more than longer ones and that profit per seat increases with distance. Differences in delay between airports are likely due to airport congestion and regional weather.
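To make the index concrete, here is a minimal sketch on made-up numbers. The three airports and their values are hypothetical and are not taken from nycflights13; the formula is the one described above (number of flights × mean delay / mean distance), computed with base R only.

```r
# Hypothetical per-destination summaries (illustrative values only)
toy <- data.frame(
  dest  = c("AAA", "BBB", "CCC"),
  count = c(1000, 1000, 200),   # number of flights
  dist  = c(200, 2000, 200),    # mean distance flown (miles)
  delay = c(10, 10, 10)         # mean arrival delay (minutes)
)

# Delay cost index: (number of flights) * delay / distance
toy$delaycost <- toy$count * toy$delay / toy$dist

# Rank destinations from highest to lowest index
toy[order(-toy$delaycost), ]
```

With equal delays, the short, busy route (AAA) scores 50, ten times the long route with the same traffic (BBB, 5) and five times the short but quiet route (CCC, 10).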
Streamgraphs display how the values of different categories change over time. The thickness of each stream is proportional to the value of its category at that point in time. Colors can be used to distinguish streams or to represent an additional dimension in the data. Streamgraphs are well suited to high-volume datasets. Their downside is that they can be cluttered and difficult to read, and categories with small values may be drowned out. It is also not possible to read exact values from the plot.
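As a rough illustration of how stream thickness encodes value, the sketch below stacks a small made-up series around a centred baseline in base R. This shows the general "silhouette" stacking idea behind stream graphs, not the internals of the streamgraph package; the matrix `vals` and its values are hypothetical.

```r
# Hypothetical values: rows are time points, columns are categories
vals <- matrix(c(1, 2, 3,
                 2, 2, 4,
                 4, 1, 5),
               nrow = 3, byrow = TRUE,
               dimnames = list(c("t1", "t2", "t3"), c("a", "b", "c")))

# Centre the stack: the baseline sits at minus half the total at each time point
baseline <- -rowSums(vals) / 2

# Upper edge of each stream = baseline + cumulative sum across categories
upper <- baseline + t(apply(vals, 1, cumsum))

# Lower edge = baseline for the first stream, previous upper edge for the rest
lower <- cbind(baseline, upper[, -ncol(upper), drop = FALSE])

# The thickness of every stream equals its original value
thickness <- upper - lower
```

Because only the thickness carries the value, the vertical position of a stream depends on everything stacked beneath it, which is why exact values are hard to read off the plot.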
devtools::install_github("hrbrmstr/streamgraph")
Skipping install of 'streamgraph' from a github remote, the SHA1 (76f7173e) has not changed since last install.
Use `force = TRUE` to force installation
library(dplyr)
library(streamgraph)
Registered S3 method overwritten by 'htmlwidgets':
method from
print.htmlwidget tools:rstudio
library(babynames)
# Create data:
year=rep(seq(1990,2016) , each=10)
name=rep(letters[1:10] , 27)
value=sample( seq(0,1,0.0001) , length(year))
data=data.frame(year, name, value)
# Basic stream graph: just give the 3 arguments
streamgraph(data, key="name", value="value", date="year")
streamgraph_html returned an object of class `list` instead of a `shiny.tag`.
Let's look at the babynames dataset:
ncol(babynames)
[1] 5
str(babynames)
tibble [1,924,665 × 5] (S3: tbl_df/tbl/data.frame)
$ year: num [1:1924665] 1880 1880 1880 1880 1880 1880 1880 1880 1880 1880 ...
$ sex : chr [1:1924665] "F" "F" "F" "F" ...
$ name: chr [1:1924665] "Mary" "Anna" "Emma" "Elizabeth" ...
$ n : int [1:1924665] 7065 2604 2003 1939 1746 1578 1472 1414 1320 1288 ...
$ prop: num [1:1924665] 0.0724 0.0267 0.0205 0.0199 0.0179 ...
babynames %>%
filter(grepl("^Kr", name)) %>%
group_by(year, name) %>%
tally(wt=n) %>%
streamgraph("name", "n", "year")
# Streamgraphing Commercial Real Estate Transaction Volume by Asset Class Since 2001
dat <- read.csv("http://asbcllc.com/blog/2015/february/cre_stream_graph_test/data/cre_transaction-data.csv")
dat %>%
streamgraph("asset_class", "volume_billions", "year", interpolate="cardinal") %>%
sg_axis_x(1, "year", "%Y") %>%
sg_fill_brewer("PuOr")
library(alluvial)
set.seed(39) # for nice colours
cols <- hsv(h = sample(1:10/10), s = sample(3:12)/15, v = sample(3:12)/15) # creates the vector of 10 colors
alluvial_ts(Refugees, wave = .3, ygap = 5, col = cols, plotdir = 'centred', alpha=.9,
grid = TRUE, grid.lwd = 5, xmargin = 0.2, lab.cex = .7, xlab = '',
ylab = '', border = NA, axis.cex = .8, leg.cex = .7,
leg.col='white',
title = "UNHCR-recognised refugees\nTop 10 countries (2003-2013)\n")