This assignment is for practicing ggplot2 and gaining experience with visualization. The objective is to write the code necessary to exactly recreate the provided graphics.
—“Experts from datascienceDojo”
ggplot2 has become the de facto standard for visualization in R. It lets you design print-quality graphs on the fly, and gives you fine-grained control over layering graphical elements while building a visualization.
Every visualization in ggplot2 is composed of: 1. Data 2. Layers 3. Scales 4. Coordinates 5. Faceting 6. Themes
Data - The main ingredient of a visualization (required)
Layers - What you see on the plot
Scales - Mapping between data and output
Coordinates - Visualization perspective
Faceting - Visual drill-down into the data
Themes - Controls the details of display
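To make the six components concrete, here is a minimal sketch that touches each of them in a single plot. The choice of columns (displ, hwy, drv, year), the colour palette, and the name p_components are illustrative assumptions and are not part of the assignment graphics.

```r
library(ggplot2)

# One plot that touches each of the six components listed above.
p_components <- ggplot(mpg, aes(x = displ, y = hwy)) +  # Data (with aesthetic mappings)
  geom_point(aes(colour = drv)) +                        # Layer: the marks that get drawn
  scale_colour_brewer(palette = "Set1") +                # Scale: maps drv values to colours
  coord_cartesian(xlim = c(1, 7)) +                      # Coordinates: the viewing window
  facet_wrap(~ year) +                                   # Faceting: one panel per model year
  theme_minimal()                                        # Theme: non-data display details

p_components
```

Removing any of the last four lines still gives a valid plot, because ggplot2 supplies defaults for scales, coordinates, faceting, and theme.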
Working with the grammar of graphics:
A ggplot2 visualization has three required components: data, aesthetic mappings, and at least one layer (a geom).
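To see why a layer is needed alongside the data and aesthetic mappings, here is a minimal sketch. It assumes the mpg columns displ and hwy as an arbitrary illustration; the names p_empty and p_points are hypothetical.

```r
library(ggplot2)

# Data and aesthetic mappings only: ggplot2 draws the axes but no marks.
p_empty <- ggplot(data = mpg, mapping = aes(x = displ, y = hwy))

# Adding one layer gives ggplot2 something to draw.
p_points <- p_empty + geom_point()

p_points
```

Printing p_empty on its own shows an empty panel with labelled axes, which is a quick way to confirm that the mappings are what you intended before adding layers.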
if (!require(ggplot2)) { install.packages("ggplot2"); library(ggplot2) } # install on first use, then load
## Loading required package: ggplot2
if (!require(ggthemes)) { install.packages("ggthemes"); library(ggthemes) } # needed for theme_economist()
## Loading required package: ggthemes
str(mpg)
## Classes 'tbl_df', 'tbl' and 'data.frame': 234 obs. of 11 variables:
## $ manufacturer: chr "audi" "audi" "audi" "audi" ...
## $ model : chr "a4" "a4" "a4" "a4" ...
## $ displ : num 1.8 1.8 2 2 2.8 2.8 3.1 1.8 1.8 2 ...
## $ year : int 1999 1999 2008 2008 1999 1999 2008 1999 1999 2008 ...
## $ cyl : int 4 4 4 4 6 6 6 4 4 4 ...
## $ trans : chr "auto(l5)" "manual(m5)" "manual(m6)" "auto(av)" ...
## $ drv : chr "f" "f" "f" "f" ...
## $ cty : int 18 21 20 21 16 18 18 18 16 20 ...
## $ hwy : int 29 29 31 30 26 26 27 26 25 28 ...
## $ fl : chr "p" "p" "p" "p" ...
## $ class : chr "compact" "compact" "compact" "compact" ...
This graphic is a traditional stacked bar chart. It uses the mpg dataset, which is built into the ggplot2 package.
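mpg$trans <- as.factor(mpg$trans) # Treat transmission as a categorical variable
ggplot(mpg, aes(class, fill = trans)) + # Data and aesthetics: class on the x-axis, fill by transmission
  geom_bar() + # Layer: bar chart of counts, stacked by default
  labs(fill = "Transmission") # Label: legend title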
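The bar heights in this chart are counts that geom_bar() computes from the data, so they can be checked against a plain cross-tabulation. A small sketch using base R; the names segment_counts and bar_heights are illustrative only.

```r
library(ggplot2)  # provides the mpg dataset

# Cross-tabulate class against transmission: each cell is one coloured segment.
segment_counts <- table(mpg$class, mpg$trans)

# Row sums give the total height of each bar (number of cars per class).
bar_heights <- rowSums(segment_counts)
bar_heights
```

If a recreated chart disagrees with these totals, the data was probably filtered or modified before plotting.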
This is a boxplot built using the mpg dataset.
mpg$manufacturer <- as.factor(mpg$manufacturer) # Treat manufacturer as a categorical variable
ggplot(mpg, aes(manufacturer, hwy)) + # Data and aesthetics
  theme_classic() + # Theme
  geom_boxplot() + # Layer
  coord_flip() + # Coordinates: swap axes so manufacturers read top to bottom
  labs(y = "Highway Fuel Efficiency(miles/gallon) ", x = "Vehicle Manufacturer") # Labels
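The line inside each box is the group median, so the position of each centre line can be sanity-checked numerically. A small sketch using base R's tapply(); the name hwy_medians is illustrative only.

```r
library(ggplot2)  # provides the mpg dataset

# Median highway mileage per manufacturer: the centre line of each box.
hwy_medians <- tapply(mpg$hwy, mpg$manufacturer, median)

# Sort so the most fuel-efficient manufacturers come first.
sort(hwy_medians, decreasing = TRUE)
```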
ggplot(diamonds, aes(x = price, color = cut, fill = cut)) + # Data and aesthetics
  theme_economist() + # Economist-style theme from ggthemes
  geom_density(aes(fill = factor(cut)), alpha = 0.25) + # Layer: semi-transparent density curves
  labs(x = "Diamond Price (USD)", y = "Density", title = "Diamond Price Density ") # Labels
ggplot(iris, aes(x = Sepal.Length, y = Petal.Length)) + # Data and aesthetics
  theme_bw() + # White background theme
  theme(panel.border = element_blank(), # Remove the border, recolour the grid lines
        panel.grid.major.x = element_line('lightgrey'),
        panel.grid.minor.x = element_line('grey'),
        panel.grid.major.y = element_line('lightgrey'),
        panel.grid.minor.y = element_line('grey')) +
  geom_point() + # Layer: scatterplot
  geom_smooth(method = "lm") + # Layer: linear regression fit with confidence band
  labs(x = "Iris Sepal Length", y = "Iris Petal Length", title = "Relationship between Petal and Sepal Length") # Labels
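The line drawn by geom_smooth(method = "lm") is an ordinary least-squares fit of y on x, so the same model can be fitted directly to inspect its coefficients. A minimal sketch in base R; the name fit is illustrative only.

```r
# Fit the same straight line that geom_smooth(method = "lm") draws.
fit <- lm(Petal.Length ~ Sepal.Length, data = iris)

# Intercept and slope of the fitted line.
coef(fit)
```

A positive slope here corresponds to the upward-sloping line in the plot.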
ggplot(data = iris, aes(x = Sepal.Length, y = Petal.Length, color = Species)) + # Data and aesthetics
  theme_bw() + # White background
  theme(legend.position = "bottom", panel.border = element_blank(), panel.grid.minor = element_blank(), panel.grid.major = element_blank()) + # Legend at the bottom; remove background grid and border
  geom_point() + # Layer: scatterplot
  geom_smooth(method = "lm", se = FALSE) + # Layer: one linear fit per species, no confidence band
  labs(x = "Iris Sepal Length", y = "Iris Petal Length", title = "Relationship between Petal and Sepal Length", subtitle = "Species level comparison") # Labels
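Because color = Species is mapped in the top-level aes(), geom_smooth() fits a separate line for each species. The per-species slopes can be reproduced in base R; this is a sketch, and the name species_slopes is illustrative only.

```r
# Fit one linear model per species and keep only the slope.
species_slopes <- sapply(split(iris, iris$Species), function(d) {
  coef(lm(Petal.Length ~ Sepal.Length, data = d))[["Sepal.Length"]]
})

species_slopes
```

Comparing these slopes with the single-line fit shows how much of the overall trend comes from differences between species rather than within them.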