Reference book - ggplot2: Elegant Graphics for Data Analysis

These geoms are the fundamental building blocks of ggplot2. They are useful in their own right, but are also used to construct more complex geoms. Most of these geoms are associated with a named plot: when that geom is used by itself in a plot, that plot has a special name.

Each of these geoms is two dimensional and requires both x and y aesthetics. All of them understand colour (or color) and size aesthetics, and the filled geoms (bar, tile and polygon) also understand fill.

geom_area() draws an area plot, which is a line plot with the region between the line and the x-axis (y = 0) filled in. Multiple groups will be stacked on top of each other.

geom_bar(stat = "identity") makes a bar chart. We need stat = "identity" because the default stat automatically counts values (so it is essentially a 1d geom; see Section 5.4). The identity stat leaves the data unchanged. Multiple bars in the same location will be stacked on top of one another.
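The difference between the identity stat and the default counting stat can be sketched with a small, hypothetical data frame (sales, product and units are made-up names, not from the book). The sketch also uses geom_col(), which recent ggplot2 versions provide as shorthand for geom_bar(stat = "identity").

```r
library(ggplot2)

# Hypothetical pre-summarised data: one row per bar
sales <- data.frame(
  product = c("A", "B", "C"),
  units   = c(10, 25, 15)
)

# stat = "identity": bar heights are taken directly from the y aesthetic
p_identity <- ggplot(sales, aes(product, units)) +
  geom_bar(stat = "identity")

# geom_col() draws the same chart without spelling out the stat
p_col <- ggplot(sales, aes(product, units)) +
  geom_col()

# Default stat: counts rows per x value, so no y aesthetic is supplied
p_count <- ggplot(sales, aes(product)) +
  geom_bar()

# Bar heights as computed by ggplot2 for each plot
identity_heights <- ggplot_build(p_identity)$data[[1]]$y
count_heights    <- ggplot_build(p_count)$data[[1]]$y
```

Here identity_heights holds the original units values, while count_heights holds the number of rows per product.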

geom_line() makes a line plot. The group aesthetic determines which observations are connected; see Chapter 4 for more detail. geom_line() connects points from left to right; geom_path() is similar but connects points in the order they appear in the data. Both geom_line() and geom_path() also understand the aesthetic linetype, which maps a categorical variable to solid, dotted and dashed lines.
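A minimal sketch of the ordering difference, using hypothetical data (walk and growth are invented for illustration) whose rows are deliberately not sorted by x:

```r
library(ggplot2)

# Hypothetical data: rows are NOT in increasing x order
walk <- data.frame(
  x = c(3, 1, 2, 4),
  y = c(1, 2, 4, 3)
)

p_line <- ggplot(walk, aes(x, y)) + geom_line()  # connects points left to right
p_path <- ggplot(walk, aes(x, y)) + geom_path()  # connects points in row order

# Order in which each geom will connect the points
line_x <- ggplot_build(p_line)$data[[1]]$x
path_x <- ggplot_build(p_path)$data[[1]]$x

# Hypothetical grouped data: one line per plant, distinguished by linetype
growth <- data.frame(
  day    = rep(1:3, times = 2),
  height = c(1, 2, 3, 2, 3, 5),
  plant  = rep(c("a", "b"), each = 3)
)

p_groups <- ggplot(growth, aes(day, height, group = plant, linetype = plant)) +
  geom_line()
```

Comparing line_x with path_x shows that geom_line() reorders the rows by x before drawing, whereas geom_path() keeps the order of the data.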

geom_point() produces a scatterplot. geom_point() also understands the shape aesthetic.

geom_polygon() draws polygons, which are filled paths. Each vertex of the polygon requires a separate row in the data. It is often useful to merge a data frame of polygon coordinates with the data just prior to plotting. Chapter 6 illustrates this concept in more detail for map data.
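The merge step might look like the following sketch. The data frames coords and values, and the explicit vertex column, are assumptions made for illustration; the vertex column is there because merge() does not promise to preserve row order, and polygons are drawn in row order.

```r
library(ggplot2)

# Hypothetical polygon coordinates: one row per vertex, keyed by id
coords <- data.frame(
  id     = rep(c("tri", "sq"), times = c(3, 4)),
  vertex = c(1:3, 1:4),
  x      = c(0, 1, 0.5,  2, 3, 3, 2),
  y      = c(0, 0, 1,    0, 0, 1, 1)
)

# Hypothetical per-polygon data: one row per polygon
values <- data.frame(
  id    = c("tri", "sq"),
  value = c(10, 20)
)

# Merge so that every vertex row carries its polygon's value,
# then restore the vertex order within each polygon
poly <- merge(coords, values, by = "id")
poly <- poly[order(poly$id, poly$vertex), ]

p_poly <- ggplot(poly, aes(x, y, group = id, fill = value)) +
  geom_polygon()
```

Keeping coordinates and values in separate tables until just before plotting avoids repeating the per-polygon value on every vertex row in the stored data.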

geom_rect(), geom_tile() and geom_raster() draw rectangles. geom_rect() is parameterised by the positions of the rectangle's four edges: xmin, ymin, xmax and ymax. geom_tile() draws the same rectangles, but is parameterised by the centre of the rectangle and its size: x, y, width and height. geom_raster() is a fast special case of geom_tile() used when all the tiles are the same size.
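The two parameterisations can be converted into each other. The sketch below, built on a hypothetical tiles data frame, derives the geom_rect() edges from the geom_tile() centre and size so that both plots draw the same rectangles.

```r
library(ggplot2)

# Hypothetical tiles described by centre (x, y) and size (width, height)
tiles <- data.frame(
  x      = c(1, 3),
  y      = c(1, 2),
  width  = c(2, 1),
  height = c(1, 2)
)

# The same rectangles described by their edges
rects <- data.frame(
  xmin = tiles$x - tiles$width / 2,
  xmax = tiles$x + tiles$width / 2,
  ymin = tiles$y - tiles$height / 2,
  ymax = tiles$y + tiles$height / 2
)

p_tile <- ggplot(tiles, aes(x, y, width = width, height = height)) +
  geom_tile()

p_rect <- ggplot(rects, aes(xmin = xmin, xmax = xmax, ymin = ymin, ymax = ymax)) +
  geom_rect()

# Edges that geom_tile() computes internally from centre and size
tile_edges <- ggplot_build(p_tile)$data[[1]][, c("xmin", "xmax", "ymin", "ymax")]
```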

geom_text() adds text to a plot. It requires a label aesthetic that provides the text to display, and has a number of parameters (angle, family, fontface, hjust and vjust) that control the appearance of the text.
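As a rough illustration of these parameters, the sketch below labels three hypothetical points (the labels data frame and its txt column are invented). The "serif" family is assumed to be available on the graphics device in use.

```r
library(ggplot2)

# Hypothetical points with a text label for each
labels <- data.frame(
  x   = c(1, 2, 3),
  y   = c(1, 2, 1),
  txt = c("left", "middle", "right")
)

p_text <- ggplot(labels, aes(x, y, label = txt)) +
  geom_point() +
  geom_text(
    angle    = 30,      # rotate the text by 30 degrees
    family   = "serif", # font family (assumed to exist on the device)
    fontface = "bold",  # plain, bold, italic or bold.italic
    hjust    = 0,       # 0 = left-align the text at the point
    vjust    = -0.5     # negative values push the text above the point
  )

# Data for the text layer (layer 2) after the parameters are applied
text_layer <- ggplot_build(p_text)$data[[2]]
```

Because angle, family and the other settings are given outside aes(), they apply as constants to every label rather than being mapped from the data.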

Each geom is shown in the code below. Observe the different axis ranges for the bar, area and tile plots: these geoms take up space outside the range of the data, and so push the axes out.
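The effect on the axes can also be checked numerically. This sketch uses a small hypothetical data frame (obs) and layer_scales() to read the trained scale limits, comparing a point layer with a bar layer drawn from the same data.

```r
library(ggplot2)

# Hypothetical data: three observations
obs  <- data.frame(x = c(1, 2, 3), y = c(2, 4, 3))
base <- ggplot(obs, aes(x, y))

p_point <- base + geom_point()
p_bar   <- base + geom_bar(stat = "identity")

# Limits of the position scales after training on each layer
point_y <- layer_scales(p_point)$y$get_limits()  # spans only the data
bar_y   <- layer_scales(p_bar)$y$get_limits()    # bars extend down to zero
bar_x   <- layer_scales(p_bar)$x$get_limits()    # bars have width around each x
```

The point plot's y limits match the range of the data, whereas the bar plot's limits start at zero and its x limits extend past the outermost x values by half a bar width.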

library(ggplot2)

Create dataset

# Create vectors for each column
name <- c("Alice", "Bob", "Charlie", "David", "Eva", "Frank", "Grace", "Helen", "Ian", "Julia", "Kevin", "Lily", "Mike", "Nina", "Oscar")
age <- c(23, 25, 22, 24, 21, 26, 23, 22, 25, 24, 23, 22, 24, 25, 23)
sex <- c("F", "M", "M", "M", "F", "M", "F", "F", "M", "F", "M", "F", "M", "F", "M")
score <- c(88, 92, 85, 90, 87, 91, 89, 86, 93, 88, 90, 87, 92, 85, 89)
iq <- c(110, 115, 108, 112, 109, 117, 113, 111, 116, 110, 114, 109, 115, 108, 112)
Hobby <- c("Reading", "Football", "Music", "Chess", "Dancing", "Swimming", "Painting", "Cycling", "Gaming", "Cooking", "Hiking", "Photography", "Running", "Writing", "Fishing")
fav_game <- c("Chess", "Soccer", "Tennis", "Basketball", "Badminton", "Cricket", "Volleyball", "Baseball", "Hockey", "Golf", "Rugby", "Table Tennis", "Snooker", "Squash", "Poker")
fav_flower <- c("Rose", "Lily", "Tulip", "Daisy", "Orchid", "Sunflower", "Jasmine", "Lavender", "Peony", "Marigold", "Daffodil", "Iris", "Carnation", "Violet", "Lotus")

# Combine into a dataframe
df <- data.frame(
  name = name,
  age = age,
  sex = sex,
  score = score,
  iq = iq,
  Hobby = Hobby,
  fav_game = fav_game,
  fav_flower = fav_flower,
  stringsAsFactors = FALSE
)

# View the dataframe
print(df)
##       name age sex score  iq       Hobby     fav_game fav_flower
## 1    Alice  23   F    88 110     Reading        Chess       Rose
## 2      Bob  25   M    92 115    Football       Soccer       Lily
## 3  Charlie  22   M    85 108       Music       Tennis      Tulip
## 4    David  24   M    90 112       Chess   Basketball      Daisy
## 5      Eva  21   F    87 109     Dancing    Badminton     Orchid
## 6    Frank  26   M    91 117    Swimming      Cricket  Sunflower
## 7    Grace  23   F    89 113    Painting   Volleyball    Jasmine
## 8    Helen  22   F    86 111     Cycling     Baseball   Lavender
## 9      Ian  25   M    93 116      Gaming       Hockey      Peony
## 10   Julia  24   F    88 110     Cooking         Golf   Marigold
## 11   Kevin  23   M    90 114      Hiking        Rugby   Daffodil
## 12    Lily  22   F    87 109 Photography Table Tennis       Iris
## 13    Mike  24   M    92 115     Running      Snooker  Carnation
## 14    Nina  25   F    85 108     Writing       Squash     Violet
## 15   Oscar  23   M    89 112     Fishing        Poker      Lotus
s <- ggplot(df, aes(score, iq, label = name)) +
  labs(x = NULL, y = NULL) +
  theme(plot.title = element_text(size = 12))

s + geom_point() + ggtitle("Scatter Plot")

s + geom_text(aes(label = name), vjust = -1) + ggtitle("Text")

s + geom_bar(stat = "identity") + ggtitle("Bar Chart")

s + geom_tile() + ggtitle("Tile")

s + geom_line() + ggtitle("Line Plot")

s + geom_area() + ggtitle("Area Plot")

s + geom_path() + ggtitle("Path")

s + geom_polygon() + ggtitle("Polygon")

# Rows that share the same sex are stacked, so each bar's height is the sum of ages
ggplot(df, aes(sex, age)) + geom_bar(stat = "identity", fill = "blue")

ggplot(mpg, aes(class, fill = hwy)) + 
  geom_bar()
## Warning: The following aesthetics were dropped during statistical transformation: fill.
## ℹ This can happen when ggplot fails to infer the correct grouping structure in
##   the data.
## ℹ Did you forget to specify a `group` aesthetic or to convert a numerical
##   variable into a factor?

ggplot(mpg, aes(class, fill = hwy, group = hwy)) + 
  geom_bar()

df$height <- 1  # Add a column for bar height

ggplot(df, aes(x = sex, y = height, fill = Hobby)) +
  geom_bar(stat = "identity") +
  geom_text(aes(label = name), position = position_stack(vjust = 0.5))

ggplot(df, aes(score)) +
  geom_histogram(fill = "green", bins = 10) +
  labs(title = "Histogram of Scores", x = "Score", y = "Count")

Box and Violin Plot

j <- ggplot(df, aes(sex, age))
j + geom_boxplot(fill = "skyblue")

j + geom_violin(fill = "skyblue")