One of the most popular tools among data scientists is the tidyverse. It combines many powerful tools that make calculations and analysis easier. It is a collection of R packages for preparing, wrangling, and visualizing data. It was created by Hadley Wickham and his team.
Some of the most popular packages are:
dplyr:
library(dplyr)
##
## Attaching package: 'dplyr'
## The following objects are masked from 'package:stats':
##
## filter, lag
## The following objects are masked from 'package:base':
##
## intersect, setdiff, setequal, union
head(women)
## height weight
## 1 58 115
## 2 59 117
## 3 60 120
## 4 61 123
## 5 62 126
## 6 63 129
women %>%
  select(weight) %>%
  summarise(avg_A = mean(weight))
## avg_A
## 1 136.7333
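The same pipe pattern extends to other dplyr verbs. Here is a minimal sketch, assuming we only care about the taller half of the `women` data; the 65-inch cutoff and the `ratio` column are hypothetical choices made just for illustration.

```r
library(dplyr)

# `women` ships with base R: 15 rows, height in inches, weight in pounds.
tall_summary <- women %>%
  filter(height >= 65) %>%               # keep rows at or above the assumed cutoff
  mutate(ratio = weight / height) %>%    # hypothetical derived column
  summarise(
    n = n(),                             # number of rows kept
    avg_weight = mean(weight),
    avg_ratio = mean(ratio)
  )

print(tall_summary)
```

Each verb takes a data frame and returns a data frame, which is why the steps can be chained with `%>%` in reading order.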
readr: Another great tool for parsing flat files is readr. It typically reads delimited files faster than base R functions such as read.csv(). Syntax: read_delim('filename.csv', delim = ",")
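To make the `read_delim()` syntax concrete, here is a small round-trip sketch. It writes the built-in `women` data to a temporary file and reads it back; the temporary file and the explicit `col_types` are my own assumptions for the example, not something a real project needs to copy.

```r
library(readr)

# Write a small sample file to a temporary path so the example is self-contained.
tmp <- tempfile(fileext = ".csv")
write_csv(women, tmp)

# Read it back. Declaring the column types up front skips readr's type guessing.
women_in <- read_delim(
  tmp,
  delim = ",",
  col_types = cols(height = col_double(), weight = col_double())
)

# Clean up the temporary file.
unlink(tmp)

print(head(women_in))
```

For comma-separated files, `read_csv()` is a shorthand for `read_delim()` with `delim = ","`.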
ggplot2:
library(ggplot2)
ggplot(data = women) +
  aes(y = height, x = weight) +
  geom_point(colour = 'blue', size = 2) +
  theme_minimal()
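Because a ggplot is built from layers, the scatter plot can be extended one piece at a time. The sketch below is only an illustration: the fitted straight line, the title, the axis labels, and the output file are all assumptions I added, not part of the original plot.

```r
library(ggplot2)

# Same scatter plot, stored in an object so more layers can be added.
p <- ggplot(data = women, aes(x = weight, y = height)) +
  geom_point(colour = "blue", size = 2) +
  # Assumption: a straight-line fit is a reasonable summary for this data.
  geom_smooth(method = "lm", formula = y ~ x, se = FALSE, colour = "grey40") +
  labs(
    title = "Height vs. weight in the women dataset",
    x = "Weight (lb)",
    y = "Height (in)"
  ) +
  theme_minimal()

# Save the plot to a temporary PNG; the path is only for illustration.
out_file <- tempfile(fileext = ".png")
ggsave(out_file, plot = p, width = 5, height = 4)
```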
There are many other packages, such as
tidyr, purrr, forcats, and tibble.
I haven’t used these much, but dplyr, readr, and ggplot2 come in handy. They helped me and my team members a lot with our weekly homework and projects for DATA 621.
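For anyone curious about the four packages I have used less, here is a rough sketch with one common function from each. The `sizes` vector is made-up sample data, and the function picked for each package is just one of many it offers.

```r
library(tibble)
library(tidyr)
library(purrr)
library(forcats)

# tibble: a data frame with stricter, more predictable printing and subsetting.
women_tbl <- as_tibble(women)

# tidyr: reshape from wide (one column per measure) to long (one row per measure).
women_long <- pivot_longer(
  women_tbl,
  cols = c(height, weight),
  names_to = "measure",
  values_to = "value"
)

# purrr: apply a function to every column and collect the results as numbers.
col_means <- map_dbl(women_tbl, mean)

# forcats: reorder factor levels by frequency, most common first.
sizes <- factor(c("small", "large", "large", "medium", "large", "medium"))
sizes_by_freq <- fct_infreq(sizes)

print(col_means)
print(levels(sizes_by_freq))
```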
Truly, the tidyverse is a very helpful collection of packages for data scientists.
library(knitr)
knitr::include_graphics('https://raw.githubusercontent.com/maharjansudhan/DATA621/master/0001.jpg')
References:
https://www.analyticsvidhya.com/blog/2019/05/beginner-guide-tidyverse-most-powerful-collection-r-packages-data-science/
https://www.tidyverse.org/packages/