program 1

Author

Atharsh

1) Develop an R program to quickly explore a given data set, including categorical analysis using the group_by command, and visualize the findings using ggplot2 features.

Step 1: Load necessary libraries

library(tidyverse)
── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
✔ dplyr     1.1.4     ✔ readr     2.1.5
✔ forcats   1.0.0     ✔ stringr   1.5.1
✔ ggplot2   3.5.1     ✔ tibble    3.2.1
✔ lubridate 1.9.4     ✔ tidyr     1.3.1
✔ purrr     1.0.4     
── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
✖ dplyr::filter() masks stats::filter()
✖ dplyr::lag()    masks stats::lag()
ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
library(dplyr)
library(ggplot2)

Step 2: Load the dataset

data <- mtcars
# Convert cyl to a factor so it is treated as a categorical variable
data$cyl <- as.factor(data$cyl)

Step 3: Group by categorical variables

summary_data <- data %>%
  group_by(cyl) %>%
  summarise(avg_mpg = mean(mpg), .groups = 'drop')
print(summary_data)
# A tibble: 3 × 2
  cyl   avg_mpg
  <fct>   <dbl>
1 4        26.7
2 6        19.7
3 8        15.1
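
The summary above reports only the mean. As a rough sketch of what a slightly fuller quick exploration could look like, the code below inspects the structure of the data, counts the cars in each cylinder group, and computes several statistics per group. The names `cars`, `cyl_counts`, and `cyl_stats` are hypothetical and not part of the original program; the choice of median and standard deviation is an assumption about which statistics are useful here.

```r
library(dplyr)

# Hypothetical name `cars`: a fresh copy of mtcars so this sketch runs on its own
cars <- mtcars %>% mutate(cyl = factor(cyl))

# Structure: column names, types, and the first few values
glimpse(cars)

# Number of cars in each cylinder group
cyl_counts <- cars %>% count(cyl)
print(cyl_counts)

# Several statistics per group (assumed to be of interest)
cyl_stats <- cars %>%
  group_by(cyl) %>%
  summarise(
    n = n(),
    avg_mpg = mean(mpg),
    median_mpg = median(mpg),
    sd_mpg = sd(mpg),
    .groups = "drop"
  )
print(cyl_stats)
```

Checking the group sizes alongside the means helps show whether an average rests on only a handful of cars.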

Step 4: Visualize the findings

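# factor() keeps cyl discrete so each bar gets its own colour
ggplot(summary_data, aes(x = factor(cyl), y = avg_mpg, fill = factor(cyl))) +
  geom_bar(stat = "identity") +
  labs(title = "Average MPG by cylinder count",
       x = "Number of Cylinders",
       y = "Average MPG",
       fill = "Cylinders") +
  theme_minimal()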
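
The bar chart shows only the group averages. One possible complement, sketched below, is a box plot built from the raw `mtcars` rows, which also shows the spread of MPG within each cylinder group. The object name `p_box` and the choice of `geom_boxplot()` are assumptions for illustration, not part of the original program.

```r
library(ggplot2)

# Hypothetical object `p_box`: uses the raw mtcars rows, not the grouped summary
p_box <- ggplot(mtcars, aes(x = factor(cyl), y = mpg, fill = factor(cyl))) +
  geom_boxplot() +
  labs(title = "MPG distribution by cylinder count",
       x = "Number of Cylinders",
       y = "MPG",
       fill = "Cylinders") +
  theme_minimal()

print(p_box)
```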