ggplot2 is one of the most widely used R packages for data visualization. It provides a powerful, flexible, and consistent way to create a wide range of static graphics. At its core, ggplot2 is based on the Grammar of Graphics, which treats a plot as a combination of layers built on top of each other. This makes ggplot2 both intuitive and extremely flexible: you can start with a simple plot and add complexity as needed.
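As a minimal sketch of this layered approach, the snippet below builds one plot in three steps. The data frame `toy` and its columns `x` and `y` are hypothetical values made up for illustration; they are not part of the analysis that follows.

```r
library(ggplot2)

# Hypothetical toy data, made up purely for illustration
toy <- data.frame(x = 1:5, y = c(2, 4, 3, 6, 5))

# Step 1: data and aesthetic mappings only (draws an empty panel with axes)
p <- ggplot(toy, aes(x = x, y = y))

# Step 2: add a layer of points
p <- p + geom_point()

# Step 3: add a second layer (a connecting line) and axis labels
p <- p + geom_line() + labs(x = "x value", y = "y value")

print(p)
```

Each `+` adds one component to the plot object, so the same base `p` can be reused with different layers.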
This analysis and its visualizations revolve around the economics dataset from the ggplot2 package. The dataset tracks monthly U.S. economic indicators starting in July 1967: personal consumption expenditures (pce), total population (pop), the personal savings rate (psavert), the median duration of unemployment (uempmed), and the number of unemployed persons (unemploy). We’ll break down the analysis into key points and follow a story-like progression to explain what the data visualizations reveal.
# Load ggplot2 quietly (suppresses startup messages and warnings)
suppressWarnings(suppressMessages(library(ggplot2)))
# List the data sets that ship with ggplot2
data(package = "ggplot2")
# Load the economics data and inspect its structure
economics <- ggplot2::economics
str(economics)
## spc_tbl_ [574 × 6] (S3: spec_tbl_df/tbl_df/tbl/data.frame)
## $ date : Date[1:574], format: "1967-07-01" "1967-08-01" ...
## $ pce : num [1:574] 507 510 516 512 517 ...
## $ pop : num [1:574] 198712 198911 199113 199311 199498 ...
## $ psavert : num [1:574] 12.6 12.6 11.9 12.9 12.8 11.8 11.7 12.3 11.7 12.3 ...
## $ uempmed : num [1:574] 4.5 4.7 4.6 4.9 4.7 4.8 5.1 4.5 4.1 4.6 ...
## $ unemploy: num [1:574] 2944 2945 2958 3143 3066 ...
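# uempmed (median unemployment duration, in weeks) is already numeric,
# so it is kept as-is; converting it to a factor would discard its numeric scale
n <- nrow(economics)
area <- seq_len(n)  # row index 1..n; seq_len(n) is a safe alternative to 1:n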
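Because `unemploy` is a count (in thousands) rather than a rate, it can help to express it relative to population. The sketch below adds a hypothetical column, `unemploy_share`, defined here as unemployed persons per 100 residents. This is an illustrative assumption, not the official unemployment rate, which is measured against the labor force rather than the total population.

```r
library(ggplot2)

econ <- ggplot2::economics

# unemploy and pop are both recorded in thousands, so the units cancel.
# unemploy_share is a hypothetical helper column: unemployed per 100 residents.
econ$unemploy_share <- 100 * econ$unemploy / econ$pop

summary(econ$unemploy_share)

ggplot(econ, aes(x = date, y = unemploy_share)) +
  geom_line(color = "blue") +
  theme_minimal() +
  labs(title = "Unemployed Persons as a Share of Population",
       x = "Date",
       y = "Unemployed per 100 Residents")
```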
# Univariate (analyzing a single variable)
# Histogram of the number of unemployed persons (unemploy, in thousands)
ggplot(economics, aes(x = unemploy)) +
  geom_histogram(binwidth = 1000, fill = "skyblue", color = "black") +
  labs(title = "Distribution of Unemployed Persons",
       x = "Unemployed Persons (thousands)",
       y = "Frequency")
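The `binwidth = 1000` above is a manual choice. One common data-driven alternative is the Freedman-Diaconis rule, which sets the bin width to 2 * IQR / n^(1/3). The sketch below is only an illustration of that rule; the name `fd_width` is introduced here and does not appear in the original analysis.

```r
library(ggplot2)

x <- ggplot2::economics$unemploy

# Freedman-Diaconis rule: bin width = 2 * IQR / n^(1/3)
fd_width <- 2 * IQR(x) / length(x)^(1/3)
fd_width

ggplot(ggplot2::economics, aes(x = unemploy)) +
  geom_histogram(binwidth = fd_width, fill = "skyblue", color = "black") +
  labs(title = "Unemployed Persons, Freedman-Diaconis Bin Width",
       x = "Unemployed Persons (thousands)",
       y = "Frequency")
```

Comparing this histogram with the fixed-width one is a quick way to see whether the apparent shape of the distribution depends on the bin choice.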
# Density plot of the personal savings rate
ggplot(economics, aes(x = psavert)) +
  geom_density(fill = "green", alpha = 0.3) +
  theme_minimal() +
  labs(title = "Smoothed Distribution of Personal Savings Rate",
       x = "Savings Rate",
       y = "Density")
# Bivariate (analyzing the relationship between two variables)
# Scatter plot of personal savings rate vs. number of unemployed persons
ggplot(economics, aes(x = psavert, y = unemploy)) +
  geom_point(color = "red", alpha = 0.5) +
  theme_minimal() +
  labs(title = "Personal Savings Rate vs Unemployment",
       x = "Savings Rate",
       y = "Unemployed Persons (thousands)")
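A scatter plot shows the shape of a relationship but not its strength. As a rough check, the sketch below computes the Pearson correlation with `cor()` and overlays a least-squares line with `geom_smooth(method = "lm")`. Since both series drift over the decades, a correlation here describes co-movement only and says nothing about cause.

```r
library(ggplot2)

econ <- ggplot2::economics

# Pearson correlation between the savings rate and the number of unemployed
r <- cor(econ$psavert, econ$unemploy)
r

ggplot(econ, aes(x = psavert, y = unemploy)) +
  geom_point(color = "red", alpha = 0.5) +
  # Straight-line fit with a confidence band
  geom_smooth(method = "lm", formula = y ~ x, se = TRUE, color = "black") +
  theme_minimal() +
  labs(title = "Personal Savings Rate vs Unemployment, with Linear Fit",
       x = "Savings Rate",
       y = "Unemployed Persons (thousands)")
```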
# Line plot of personal consumption expenditures over time
ggplot(economics, aes(x = date, y = pce)) +
  geom_line(color = "blue") +
  theme_minimal() +
  labs(title = "Personal Consumption Expenditures Over Time",
       x = "Date",
       y = "Personal Consumption Expenditures (billions of dollars)")
# Basic time series plot of the number of unemployed persons over time
ggplot(economics, aes(x = date, y = unemploy)) +
  geom_line(color = "blue") +
  theme_minimal() +
  labs(title = "Unemployed Persons Over Time",
       x = "Date",
       y = "Unemployed Persons (thousands)")
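To compare several indicators over time in one figure, ggplot2 also ships `economics_long`, a long-format version of the same data with `date`, `variable`, and `value` columns. The sketch below facets by `variable`; using free y scales is a design choice made here because the series are in different units.

```r
library(ggplot2)

econ_long <- ggplot2::economics_long

# One panel per indicator; each panel gets its own y-axis range
ggplot(econ_long, aes(x = date, y = value)) +
  geom_line(color = "blue") +
  facet_wrap(~ variable, scales = "free_y", ncol = 1) +
  theme_minimal() +
  labs(title = "U.S. Economic Indicators Over Time",
       x = "Date",
       y = "Value")
```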
# Scatter plot of personal savings rate vs. number of unemployed persons
ggplot(economics, aes(x = psavert, y = unemploy)) +
  geom_point(color = "red", alpha = 0.5) +
  theme_minimal() +
  labs(title = "Personal Savings Rate vs. Unemployment",
       x = "Personal Savings Rate",
       y = "Unemployed Persons")
# Histogram of the number of unemployed persons
ggplot(economics, aes(x = unemploy)) +
  geom_histogram(binwidth = 1000, fill = "green", color = "black") +
  theme_minimal() +
  labs(title = "Distribution of Unemployed Persons",
       x = "Unemployed Persons (thousands)",
       y = "Frequency")
# Density plot of the number of unemployed persons
ggplot(economics, aes(x = unemploy)) +
  geom_density(fill = "purple", alpha = 0.5) +
  theme_minimal() +
  labs(title = "Density of Unemployed Persons",
       x = "Unemployed Persons (thousands)",
       y = "Density")
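The histogram and density plot above describe the same variable on different y scales (counts vs. density). A sketch of one way to overlay them is to rescale the histogram to density with `after_stat(density)`, which assumes ggplot2 3.3.0 or later.

```r
library(ggplot2)

p <- ggplot(ggplot2::economics, aes(x = unemploy)) +
  # Histogram rescaled so the bar areas sum to 1, matching the density curve
  geom_histogram(aes(y = after_stat(density)),
                 binwidth = 1000, fill = "green", color = "black", alpha = 0.4) +
  # Kernel density estimate drawn on the same scale
  geom_density(color = "purple") +
  theme_minimal() +
  labs(title = "Histogram and Density of Unemployed Persons",
       x = "Unemployed Persons (thousands)",
       y = "Density")

print(p)
```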
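These visualizations show how the U.S. economy has changed over time, focusing on unemployment, savings, and spending. The time series reveal cycles of growth and recession, and the scatter plot suggests that periods with more unemployed persons tend to coincide with lower savings rates, although both series also trend over the decades, so this pattern should not be read as causal. Understanding these trends can help policymakers anticipate economic shifts and make better-informed decisions for growth and stability.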