This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.
You are free to:
- Share - copy and redistribute the material in any medium or format
- Adapt - remix, transform, and build upon the material
Version 1.1.1 - March 2024

Data Visualization: ggplot2 is a powerful data visualization package in R that allows you to create elegant and informative plots. It uses a grammar of graphics approach, which means that you can build up a plot in layers.
Aesthetic Mapping: The first step in creating a plot with ggplot2 is to specify the data and the aesthetic mappings. This is done with the ggplot() function together with aes(), which maps variables to aesthetics such as x and y position, color, size, and shape.
Geometric Objects (Geoms): After specifying the data and aesthetics, you can add geometric objects (geoms) to the plot. Geoms are the types of plot elements, such as points, lines, bars, and boxes. For example, geom_point() adds a scatter plot, geom_line() adds a line plot, and geom_bar() adds a bar plot.
Statistical Transformations (Stats): ggplot2 also provides statistical transformations (stats) that can be added to the plot. Stats calculate new variables based on the data and aesthetics. For example, stat_smooth() adds a smoothed line to the plot, and stat_bin() creates a histogram.
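The layered approach described above can be sketched as follows. This is a minimal illustration, not part of the original exercise: the data frame `demo` and its columns are hypothetical names chosen here, and the linear smoother (method = "lm") is just one possible choice.

```r
library(ggplot2)

# Hypothetical demo data (not from the original exercise)
demo <- data.frame(x = 1:10, y = (1:10)^2)

# Data and aesthetic mappings first, then one geom layer and one stat layer
p <- ggplot(demo, aes(x = x, y = y)) +
  geom_point() +                              # layer 1: raw observations
  stat_smooth(method = "lm", formula = y ~ x) # layer 2: fitted straight line

p
```

Each `+` adds one element to the plot, so the same base object can be reused with different layers.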
Any ggplot2 plot has three key components: the data, a set of aesthetic mappings, and at least one layer (geom).
Here is the data used for this exercise:
series <- data.frame(
  i = 1:10,
  linear = 1:10,
  fibonacci = c(1, 1, 2, 3, 5, 8, 13, 21, 34, 55),
  square = (1:10)^2,
  log = log(1:10)
)
The packages used are ggplot2 and tidyverse (the latter provides mutate() and the %>% pipe used in later examples).
ggplot(series, aes(x=i,y=fibonacci))+geom_point()
- series : defines the data to be used
- aes(x=i,y=fibonacci) : maps data to visual characteristics, here i and fibonacci to the x and y coordinates respectively
- geom_point() : defines a layer that maps data to points
Both scales and coordinates have (implicit) defaults. For coordinates:
- coord_cartesian() : Cartesian coordinates (the default)
- coord_polar() : polar coordinates
ggplot(series, aes(x=factor(i),y=fibonacci))+geom_point()
A factor is mapped to equidistant slots along the axis
ggplot(series, aes(x=i,y=linear))+geom_point()+
coord_polar()
x maps to \(\theta\) (with max(x) \(\rightarrow 2\pi\)) and y maps to \(\rho\) (distance from center)
ggplot(series, aes(x=i,y=square))+geom_point()+
scale_y_log10(minor_breaks=c(1:10,1:10*10))
A log scale is applied to the y position
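To see what the log scale does to the values, the data computed for the layer can be inspected. The sketch below assumes that layer_data() returns the y values after the scale transformation; the variable name `built` is chosen here for illustration.

```r
library(ggplot2)

series <- data.frame(i = 1:10, square = (1:10)^2)

p <- ggplot(series, aes(x = i, y = square)) +
  geom_point() +
  scale_y_log10()

# Data of the first layer as ggplot2 sees it, after the scale transformation
built <- layer_data(p, 1)

# The y column should hold log10(square) rather than the raw values
data.frame(raw = series$square, plotted = built$y)
```

The axis labels still show the original values; only the positions are computed on the log scale.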
Aesthetics include:
- x, y : position
- group : grouping of observations
- color : line or symbol color
- fill : area fill color
- shape : type of shape
- size : size of the object
ggplot(series,
       aes(x=i, y=fibonacci, color=fibonacci)) + geom_point()
A gradient scale is used for a continuous (numeric) variable
ggplot(series%>%mutate( mag = fibonacci %/% 10),
aes(x=i, y=fibonacci, color=factor(mag)))+ geom_point()
A discrete color scale is used for a factor variable
For each type of aesthetic, several scale functions are provided:
- scale_x_.., scale_y_..
- scale_color_..
- scale_fill_..
- scale_shape_..
- scale_size_..
ggplot(series %>% mutate(mag = fibonacci %/% 10),
       aes(x=i, y=fibonacci, color=mag)) +
  scale_color_gradient(low="blue", high="gold") +
  geom_point()
Geometry functions add new layers:
- geom_point() : draw points
- geom_col() : draw a bar/column
- geom_line() : draw lines connecting positions
- geom_text() and geom_label() : write a text or label
- geom_area() : draw a filled area
Layers are drawn in order of declaration, with the latest on top.
The order of most other statements (such as scales and coordinates) does not affect the result.
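The effect of layer order can be sketched with two plots that differ only in the order of their geoms. The colors, the point size and the object names `points_on_top` and `line_on_top` are illustrative choices, not part of the original exercise.

```r
library(ggplot2)

series <- data.frame(i = 1:10,
                     fibonacci = c(1, 1, 2, 3, 5, 8, 13, 21, 34, 55))

# Points declared last: they are drawn over the line
points_on_top <- ggplot(series, aes(x = i, y = fibonacci)) +
  geom_line(color = "gray40") +
  geom_point(color = "red", size = 3)

# Line declared last: it is drawn over the points
line_on_top <- ggplot(series, aes(x = i, y = fibonacci)) +
  geom_point(color = "red", size = 3) +
  geom_line(color = "gray40")

# Layers are stored in declaration order
layer_order <- sapply(points_on_top$layers, function(l) class(l$geom)[1])
layer_order
```

Printing the two plots side by side shows the line crossing either under or over the red points.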
ggplot(series, aes(x=i, y=fibonacci))+
geom_col()
ggplot(series, aes(x=i, y=fibonacci))+
geom_line()
ggplot(series, aes(x=i, y=fibonacci, label=fibonacci))+
geom_line() + geom_label()
A few geometries perform a statistical transformation before mapping the data to an object:
- geom_bar() : compute frequencies of discrete variables
- geom_histogram() : compute frequencies of bins of continuous variables
- geom_boxplot() : compute a boxplot
- geom_violin() : compute a violin plot
data7 <- data.frame(category = c("A", "B", "C", "D"),
                    frequency = c(25, 20, 15, 30))
# stat = "identity" uses the frequencies as given instead of counting rows
ggplot(data7, aes(x = category, y = frequency)) +
  geom_bar(stat = "identity")
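For contrast, here is a sketch of geom_bar() doing the counting itself. It assumes a hypothetical data frame `raw` with one row per observation, built so that the counts match the frequencies of data7; layer_data() is used only to inspect the computed counts.

```r
library(ggplot2)

# Hypothetical raw data: one row per observation
raw <- data.frame(
  category = rep(c("A", "B", "C", "D"), times = c(25, 20, 15, 30))
)

# With the default stat, geom_bar() counts the rows in each category
p_counts <- ggplot(raw, aes(x = category)) +
  geom_bar()

# The computed frequencies are available in the layer data
counts <- layer_data(p_counts, 1)$count
counts
```

When the frequencies are already computed, as in data7, geom_col() gives the same result as geom_bar(stat = "identity").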
data6 = data.frame(age=c(8, 8, 12, 9, 11, 10, 12, 10, 9, 12, 11, 9, 10, 10, 11, 12, 10, 11, 12, 8, 9, 11, 10, 11, 11, 11, 9, 10, 11, 11, 10, 9, 10, 11, 10, 12, 10, 12, 10, 9, 10, 12, 11, 10, 9, 11, 11, 10, 9, 10))
ggplot(data6, aes(x = age)) + geom_histogram(binwidth = 1,
fill = "skyblue", color = "black") +
labs(title = "Histogram of Values",x = "Age",y = "Frequency") +
theme_minimal()
ggplot(series, aes(x=fibonacci))+
geom_boxplot()
The supporting elements and default visual features are defined by a theme:
- theme_classic() : similar to base R graphics
- theme_gray() : the default theme (gray background)
- theme_bw() : same as default but with white background
- theme_light() : same as bw but with lighter lines
- theme_dark() : dark gray background
- theme_minimal() : minimalistic theme
- theme_void() : no supporting elements
ggplot(series, aes(x=factor(i),y=fibonacci))+geom_point()+
  theme_minimal()
The default theme can be changed with theme_set().
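A minimal sketch of changing the default theme for a session and restoring it afterwards. It relies on theme_set() returning the previously active theme; the variable name `old` is chosen here for illustration.

```r
library(ggplot2)

series <- data.frame(i = 1:10,
                     fibonacci = c(1, 1, 2, 3, 5, 8, 13, 21, 34, 55))

# Switch the default theme and keep the previous one
old <- theme_set(theme_minimal())

# Plots created from now on use theme_minimal() without naming it
p <- ggplot(series, aes(x = factor(i), y = fibonacci)) + geom_point()

# Restore the previous default when done
theme_set(old)
```

Restoring the old theme keeps the change from leaking into unrelated plots later in the same session.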