p1

Author

Akhilesh N R

Develop an R program to quickly explore a given data set, including categorical analysis using the group_by() command, and visualize the findings using ggplot2.

Step-1 Load necessary libraries

library(tidyverse)
── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
✔ dplyr     1.1.4     ✔ readr     2.1.5
✔ forcats   1.0.0     ✔ stringr   1.5.1
✔ ggplot2   3.5.1     ✔ tibble    3.2.1
✔ lubridate 1.9.4     ✔ tidyr     1.3.1
✔ purrr     1.0.4     
── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
✖ dplyr::filter() masks stats::filter()
✖ dplyr::lag()    masks stats::lag()
ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
library(dplyr)
library(ggplot2)

Step-2 Load necessary dataset

# Load the built-in mtcars dataset
data <- mtcars
# Treat the cylinder count as a categorical variable
data$cyl <- as.factor(data$cyl)
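
Before grouping, it can help to take a quick look at the structure of the data. The following is a minimal sketch, assuming the same mtcars setup as above; the name cyl_counts is introduced here only for illustration.

```r
library(dplyr)

# Same setup as in Step-2
data <- mtcars
data$cyl <- as.factor(data$cyl)

# Column types and the first few values of each column
glimpse(data)

# Minimum, quartiles, median, mean and maximum of mpg
summary(data$mpg)

# Number of cars in each cylinder category
cyl_counts <- count(data, cyl)
print(cyl_counts)
```

The counts show how many observations stand behind each group, which is worth knowing before comparing group averages.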

Step-3 Group by categorical variables

# Average mpg per cylinder category; .groups = "drop" returns an ungrouped tibble
summary_data <- data %>%
  group_by(cyl) %>%
  summarize(avg_mpg = mean(mpg), .groups = "drop")
print(summary_data)
# A tibble: 3 × 2
  cyl   avg_mpg
  <fct>   <dbl>
1 4        26.7
2 6        19.7
3 8        15.1
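
The same group_by() pattern can return several statistics per category at once. This is a sketch under the same mtcars assumptions; the name detailed_summary and the column names n_cars, median_mpg and sd_mpg are illustrative choices, not part of the original program.

```r
library(dplyr)

# Same setup as in Step-2
data <- mtcars
data$cyl <- as.factor(data$cyl)

# Several summary statistics per cylinder category
detailed_summary <- data %>%
  group_by(cyl) %>%
  summarize(
    n_cars     = n(),          # number of cars in the group
    avg_mpg    = mean(mpg),    # mean fuel economy
    median_mpg = median(mpg),  # median fuel economy
    sd_mpg     = sd(mpg),      # spread within the group
    .groups    = "drop"        # return an ungrouped tibble
  )
print(detailed_summary)
```

Reporting the group size and standard deviation next to the mean makes it easier to judge how reliable each average is.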

Step-4 Visualize the findings

# Bar chart of the average mpg for each cylinder category
ggplot(summary_data, aes(x = cyl, y = avg_mpg, fill = cyl)) +
  geom_col() +  # equivalent to geom_bar(stat = "identity")
  labs(title = "Average MPG by cylinder count",
       x = "Number of cylinders",
       y = "Average MPG") +
  theme_minimal()
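
A bar chart of averages hides the spread inside each group. As one possible complement, the sketch below draws a box plot from the raw data rather than from the summary table; the object name p and the label text are assumptions made for this example.

```r
library(ggplot2)

# Same setup as in Step-2
data <- mtcars
data$cyl <- as.factor(data$cyl)

# Box plot of the raw mpg values for each cylinder category
p <- ggplot(data, aes(x = cyl, y = mpg, fill = cyl)) +
  geom_boxplot() +
  labs(title = "MPG distribution by cylinder count",
       x = "Number of cylinders",
       y = "MPG") +
  theme_minimal()
print(p)
```

Each box shows the median and interquartile range of mpg for one cylinder category, so overlapping or well-separated groups are visible at a glance.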