Program 04

Author

Manjunath 1NT24IS412

Develop a script in R to produce a bar graph displaying the frequency distribution of categorical data in a given dataset, grouped by a specific variable, using ggplot2.

# Load necessary libraries
library(ggplot2)

Step 1: Load the Dataset.

We use the built-in mtcars dataset, which contains information about different car models.

# Load dataset
data <- mtcars

# Display the first few rows
head(data)
                   mpg cyl disp  hp drat    wt  qsec vs am gear carb
Mazda RX4         21.0   6  160 110 3.90 2.620 16.46  0  1    4    4
Mazda RX4 Wag     21.0   6  160 110 3.90 2.875 17.02  0  1    4    4
Datsun 710        22.8   4  108  93 3.85 2.320 18.61  1  1    4    1
Hornet 4 Drive    21.4   6  258 110 3.08 3.215 19.44  1  0    3    1
Hornet Sportabout 18.7   8  360 175 3.15 3.440 17.02  0  0    3    2
Valiant           18.1   6  225 105 2.76 3.460 20.22  1  0    3    1

Explanation

  • The mtcars dataset records 11 numeric variables (fuel consumption, engine, and transmission specifications) for 32 car models.

  • We will analyze the number of cylinders (cyl) and group by the number of gears (gear).

Step 2: Convert Numeric Data to Categorical.

Since cyl (number of cylinders) and gear (number of gears) are numerical, we convert them into factors.

data$cyl <- as.factor(data$cyl)
data$gear <- as.factor(data$gear)

Why Convert to Factors?

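  • ggplot2 treats factors as discrete categories, so each level gets its own bar and fill color. Left numeric, gear would be treated as a continuous variable and would not split the bars into groups.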
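
As a quick check of the conversion, the sketch below compares the column class before and after as.factor() and lists the resulting levels. It works on a fresh copy of mtcars named df_check, a name chosen here only for illustration.

```r
# Fresh copy of the built-in dataset ('df_check' is an illustrative name)
df_check <- mtcars

# Before conversion both columns are plain numeric vectors
class(df_check$cyl)    # "numeric"
class(df_check$gear)   # "numeric"

# Convert to factors, as in Step 2
df_check$cyl  <- as.factor(df_check$cyl)
df_check$gear <- as.factor(df_check$gear)

# After conversion the columns are factors with one level per distinct value
class(df_check$cyl)    # "factor"
levels(df_check$cyl)   # "4" "6" "8"
levels(df_check$gear)  # "3" "4" "5"
```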

Step 3: Create a Bar Graph

We now create a bar plot to show the frequency distribution of cyl, grouped by gear.

# Create a bar graph of cylinder counts, with one bar per gear value
ggplot(data, aes(x = cyl, fill = gear)) +
  geom_bar(position = "dodge") +
  labs(title="Frequency of Cylinder Counts Grouped by Number of Gears",
       x="Number of Cylinders",
       y="Count",
       fill="Gears") + # Legend title
  theme_minimal()

Explanation of the Plot

X-Axis (cyl)

  • Displays cylinder categories (4, 6, 8 cylinders).

Y-Axis (Frequency Count)

  • Represents the number of cars in each cylinder and gear combination.
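
The bar heights can be cross-checked against a plain frequency table. The sketch below uses base R's table() on the original mtcars columns; it is only a verification aid, separate from the plotting code, and the name counts is illustrative.

```r
# Cross-tabulate cylinders against gears ('counts' is an illustrative name)
counts <- table(cyl = mtcars$cyl, gear = mtcars$gear)
print(counts)
#    gear
# cyl  3  4  5
#   4  1  8  2
#   6  2  4  1
#   8 12  0  2

# Each cell corresponds to the height of one bar in the plot;
# the cells add up to the 32 cars in the dataset
sum(counts)
```

Note the zero for 8 cylinders and 4 gears: that combination has no bar in the plot.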

Color Fill (gear)

  • Differentiates cars based on number of gears (3, 4, 5 gears).

Grouped Bars (position = "dodge")

  • Places the bars for each gear value side by side within a cylinder group instead of stacking them, which is the geom_bar() default.
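
One side effect of position = "dodge" is worth knowing: mtcars has no 8-cylinder cars with 4 gears, so the two remaining bars in that group are drawn wider to fill the available space. If uniform bar widths are preferred, one option is position_dodge(preserve = "single"), sketched below. This is a variation on the plot above, not part of the original program, and the names plot_data and p are illustrative.

```r
library(ggplot2)

# Self-contained copy of the data with the Step 2 conversion applied
# ('plot_data' is an illustrative name)
plot_data <- mtcars
plot_data$cyl  <- as.factor(plot_data$cyl)
plot_data$gear <- as.factor(plot_data$gear)

# preserve = "single" keeps every bar the same width,
# even in groups where one gear value is missing
p <- ggplot(plot_data, aes(x = cyl, fill = gear)) +
  geom_bar(position = position_dodge(preserve = "single")) +
  labs(x = "Number of Cylinders",
       y = "Count",
       fill = "Gears") +
  theme_minimal()

print(p)
```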

Minimal Theme (theme_minimal())

  • Provides a clean and readable layout.