library(ggplot2)
Program_04
Develop a script in R to produce a bar graph displaying the frequency distribution of categorical data in a given dataset, grouped by a specific variable, using ggplot2.
Step 1: Load the Dataset
We use the built-in mtcars dataset, which contains information about different car models.
data <- mtcars
head(data)
                   mpg cyl disp  hp drat    wt  qsec vs am gear carb
Mazda RX4         21.0   6  160 110 3.90 2.620 16.46  0  1    4    4
Mazda RX4 Wag     21.0   6  160 110 3.90 2.875 17.02  0  1    4    4
Datsun 710        22.8   4  108  93 3.85 2.320 18.61  1  1    4    1
Hornet 4 Drive    21.4   6  258 110 3.08 3.215 19.44  1  0    3    1
Hornet Sportabout 18.7   8  360 175 3.15 3.440 17.02  0  0    3    2
Valiant           18.1   6  225 105 2.76 3.460 20.22  1  0    3    1
Explanation
The mtcars dataset includes various car specifications. We will analyze the number of cylinders (cyl) and group by the number of gears (gear).
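Before plotting, it can help to see the counts the bar graph will display. The following is a minimal sketch using base R's table() on the built-in mtcars data; the variable name counts is only illustrative.

```r
# Cross-tabulate cylinders (rows) against gears (columns) in the built-in mtcars data
counts <- table(cyl = mtcars$cyl, gear = mtcars$gear)
print(counts)

# Row and column totals give the overall frequency of each category
print(addmargins(counts))
```

Each cell of this table corresponds to the height of one bar in the grouped bar graph.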
Step 2: Convert Numeric Data to Categorical
Since cyl (number of cylinders) and gear (number of gears) are numeric, we convert them into factors.
data$cyl <- as.factor(data$cyl)
data$gear <- as.factor(data$gear)
Why Convert to Factors?
ggplot2 treats factors as discrete categories, which makes the data easy to group and visualize.
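As a quick check of what the conversion does, the sketch below compares the original numeric column with its factor version. It works directly on mtcars, and cyl_factor is an illustrative name, not part of the program above.

```r
# The original column is numeric, so ggplot2 would map it to a continuous scale
is.numeric(mtcars$cyl)

# as.factor() turns each distinct value into a category (level)
cyl_factor <- as.factor(mtcars$cyl)
is.factor(cyl_factor)
levels(cyl_factor)

# Number of cars in each category
table(cyl_factor)
```

The three levels ("4", "6", "8") become the three positions on the x-axis; the same applies to gear, whose levels become the fill colors.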
Step 3: Create a Bar Graph
We now create a bar plot to show the frequency distribution of cyl, grouped by gear.
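# Map cylinders to the x-axis and gears to the bar fill color
ggplot(data, aes(x = cyl, fill = gear)) +
  # geom_bar() counts rows per category; "dodge" places the gear bars side by side
  geom_bar(position = "dodge") +
  labs(title = "Frequency of Cylinders Grouped by Gear Type",
       x = "Number of Cylinders",
       y = "Count",
       fill = "Gears") +
  theme_minimal()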
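One detail worth noting: mtcars has no 8-cylinder cars with 4 gears, so with plain "dodge" the two remaining bars in that group are drawn wider to fill the space. The following is a minimal sketch of one way to keep the bar widths uniform, assuming ggplot2 3.0.0 or later, where position_dodge() accepts a preserve argument. The names car_data and p are illustrative only.

```r
library(ggplot2)

# Work on a copy of the built-in dataset, with cyl and gear as factors
car_data <- mtcars
car_data$cyl <- as.factor(car_data$cyl)
car_data$gear <- as.factor(car_data$gear)

# preserve = "single" keeps every bar the width of a single slot,
# even when a cyl/gear combination has no cars
p <- ggplot(car_data, aes(x = cyl, fill = gear)) +
  geom_bar(position = position_dodge(preserve = "single")) +
  labs(title = "Frequency of Cylinders Grouped by Gear Type",
       x = "Number of Cylinders",
       y = "Count",
       fill = "Gears") +
  theme_minimal()

print(p)
```

Whether to use this variant is a matter of taste: the default "dodge" uses the full group width, while preserve = "single" makes the missing combination visible as a gap.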