Data manipulation is the process of reshaping data so that it is easier to read and better organized. For example, data analysts in social-science fields such as accounting often use data manipulation to determine the price of a product, sales trends, and potential tax liabilities.
library(readxl)
# Read the monthly outflow data for Sumatra from the Excel workbook
dataoutflow <- read_excel(path = "outflowperbulanSumatra.xlsx")
dataoutflow
library(tidyverse)
## -- Attaching packages --------------------------------------- tidyverse 1.3.1 --
## v ggplot2 3.3.5 v purrr 0.3.4
## v tibble 3.1.6 v dplyr 1.0.8
## v tidyr 1.2.0 v stringr 1.4.0
## v readr 2.1.2 v forcats 0.5.1
## -- Conflicts ------------------------------------------ tidyverse_conflicts() --
## x dplyr::filter() masks stats::filter()
## x dplyr::lag() masks stats::lag()
# Keep only the province and March columns
BulMaret <- select(dataoutflow, Provinsi, Maret)
BulMaret
# Drop the March column and keep everything else
notMaret <- select(dataoutflow, -Maret)
notMaret
# Select the first quarter (January to March) for every province
JanMar <- dataoutflow %>% select(Provinsi, Januari, Februari, Maret)
JanMar
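Because January, February, and March sit next to each other in the data, the same selection can also be written as a column range. A minimal sketch on a small made-up tibble (`toy`, `q1`, and all values are hypothetical, not taken from the outflow file):

```r
library(dplyr)

# Hypothetical wide table with the same column layout (values are made up)
toy <- tibble(
  Provinsi = c("A", "B"),
  Januari  = c(1, 2),
  Februari = c(3, 4),
  Maret    = c(5, 6),
  April    = c(7, 8)
)

# Januari:Maret selects every column from Januari through Maret
q1 <- toy %>% select(Provinsi, Januari:Maret)
names(q1)
```

The range form depends on column order, so listing the columns by name is the safer choice if the layout of the workbook might change.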
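# Rename April to "Bulan April"; the result is stored under a name
# that does not shadow dplyr::rename()
renamed <- dataoutflow %>% rename(`Bulan April` = April)
head(renamed)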
# Aceh's outflow for August to October
# (variable renamed: the filter selects Aceh, not Sulawesi Tenggara)
aceh <- dataoutflow %>%
  filter(Provinsi == "Aceh") %>%
  select(Provinsi, Agustus, September, Oktober)
aceh
# Jambi's outflow for January to April
sup1 <- dataoutflow %>%
  filter(Provinsi == "Jambi") %>%
  select(Provinsi, Januari, Februari, Maret, April)
sup1
str(dataoutflow)
## tibble [11 x 13] (S3: tbl_df/tbl/data.frame)
## $ Provinsi : chr [1:11] "Sumatera" "Aceh" "Sumatera Utara" "Sumatera Barat" ...
## $ Januari : num [1:11] 4694 182 1456 102 740 ...
## $ Februari : num [1:11] 6959 426 2150 308 832 ...
## $ Maret : num [1:11] 12668 1434 3244 782 1264 ...
## $ April : num [1:11] 11776 1432 3371 819 1775 ...
## $ Mei : num [1:11] 19645 1690 4148 2242 2926 ...
## $ Juni : num [1:11] 3971.8 436 1473.5 34.1 282.8 ...
## $ Juli : num [1:11] 12710 1769 3526 651 1530 ...
## $ Agustus : num [1:11] 9744 456 3054 566 1470 ...
## $ September: num [1:11] 9247 830 2142 343 1394 ...
## $ Oktober : num [1:11] 14432 1175 3857 793 2018 ...
## $ November : num [1:11] 9435 774 2151 484 1409 ...
## $ Desember : num [1:11] 25307 2270 9185 1638 3498 ...
str(dataoutflow %>% group_by(Provinsi))
## grouped_df [11 x 13] (S3: grouped_df/tbl_df/tbl/data.frame)
## $ Provinsi : chr [1:11] "Sumatera" "Aceh" "Sumatera Utara" "Sumatera Barat" ...
## $ Januari : num [1:11] 4694 182 1456 102 740 ...
## $ Februari : num [1:11] 6959 426 2150 308 832 ...
## $ Maret : num [1:11] 12668 1434 3244 782 1264 ...
## $ April : num [1:11] 11776 1432 3371 819 1775 ...
## $ Mei : num [1:11] 19645 1690 4148 2242 2926 ...
## $ Juni : num [1:11] 3971.8 436 1473.5 34.1 282.8 ...
## $ Juli : num [1:11] 12710 1769 3526 651 1530 ...
## $ Agustus : num [1:11] 9744 456 3054 566 1470 ...
## $ September: num [1:11] 9247 830 2142 343 1394 ...
## $ Oktober : num [1:11] 14432 1175 3857 793 2018 ...
## $ November : num [1:11] 9435 774 2151 484 1409 ...
## $ Desember : num [1:11] 25307 2270 9185 1638 3498 ...
## - attr(*, "groups")= tibble [11 x 2] (S3: tbl_df/tbl/data.frame)
## ..$ Provinsi: chr [1:11] "Aceh" "Bengkulu" "Jambi" "Kep. Bangka Belitung" ...
## ..$ .rows : list<int> [1:11]
## .. ..$ : int 2
## .. ..$ : int 9
## .. ..$ : int 7
## .. ..$ : int 11
## .. ..$ : int 6
## .. ..$ : int 10
## .. ..$ : int 5
## .. ..$ : int 1
## .. ..$ : int 4
## .. ..$ : int 8
## .. ..$ : int 3
## .. ..@ ptype: int(0)
## ..- attr(*, ".drop")= logi TRUE
# Group by province; each province is a single row, so every group has one member
sup2 <- dataoutflow %>%
  group_by(Provinsi)
sup2
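As the `str()` output above shows, `group_by()` only attaches grouping metadata; it changes results once a verb such as `summarise()` follows. Since each province occupies a single row here, a per-province total needs the months in long form first. A minimal sketch on made-up data (`toy`, `Bulan`, `Outflow`, and `Total` are hypothetical names, and the values are invented):

```r
library(dplyr)
library(tidyr)

# Hypothetical toy data in the same wide layout (values are made up)
toy <- tibble(
  Provinsi = c("A", "B"),
  Januari  = c(10, 20),
  Februari = c(30, 40)
)

# Reshape to one row per province and month, then total per province
totals <- toy %>%
  pivot_longer(cols = -Provinsi, names_to = "Bulan", values_to = "Outflow") %>%
  group_by(Provinsi) %>%
  summarise(Total = sum(Outflow), .groups = "drop")

totals
```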
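# Count rows per distinct January value for Jambi
# (bare column name; a quoted string would be counted as a constant)
dataoutflow %>%
  filter(Provinsi == "Jambi") %>%
  count(Januari, sort = TRUE)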
# Overwrite January with half of April's value
up1 <- dataoutflow %>%
  mutate(Januari = April / 2)
up1
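The call above overwrites the existing `Januari` column. When the original values should be kept, `mutate()` can write to a new column instead. A minimal sketch, again on made-up data (`toy`, `toy_half`, and the column name `SetengahApril` are hypothetical):

```r
library(dplyr)

# Hypothetical toy data (values are made up)
toy <- tibble(
  Provinsi = c("A", "B"),
  Januari  = c(10, 20),
  April    = c(30, 50)
)

# Add half of April as a new column; Januari is left untouched
toy_half <- toy %>%
  mutate(SetengahApril = April / 2)

toy_half
```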
# Scatter plot of January outflow by province
ggplot(data = dataoutflow, mapping = aes(x = Provinsi, y = Januari)) +
  geom_point()