dplyrAssist is an RStudio addin for teaching and learning data manipulation with the 'dplyr' package. You can work through each step of data manipulation by clicking with the mouse, without writing code. You get both the resulting data (as a 'tibble') and the code that produced it.
You can install the dplyrAssist package from GitHub.
#install.packages("devtools")
devtools::install_github("cardiomoon/dplyrAssist")
require(tidyverse)
require(dplyrAssist)
You can run the dplyrAssist() function without data.
dplyrAssist()
Or you can run it as an RStudio addin.
By default, the tidyr::table1 data is displayed. Press the Show data structure radio button(1) and you can see the data(2).
You can reshape the data easily. Select “Reshaping Data”(1) and select the “gather” function(2). You can see the plot explaining “gather”(3). Select the cases and population columns(4). Enter the key column name(5) and the value column name(6). You can see the R code(7). Press the Add R code button(8).
You can see the Data Wrangling R codes(1) and the result(2).
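For reference, the gather step above corresponds roughly to the following code. This is a minimal sketch, not the exact text the addin generates: the column names "key" and "value" are assumptions standing in for whatever you typed in steps (5) and (6), and the object name long is hypothetical.

```r
library(tidyr)
library(dplyr)

# Stack the cases and population columns of table1 into key/value pairs.
# "key" and "value" are assumed names for the two new columns.
long <- table1 %>%
  gather(key = "key", value = "value", cases, population)

long
```

Each row of table1 becomes two rows in the result, one per gathered column. In tidyr 1.0.0 and later, gather() is superseded by pivot_longer(), but it still works.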
The reverse process of gather is spread. Now select the spread function(2). You can see the plot explaining “spread”(3). Select the key and value columns(4). You can see the R code(5). Press the Add R code button(6).
You can see the Data Wrangling R codes(1) and the result(2).
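The gather and spread steps can be sketched together as follows, to show that spread undoes gather. As before, this is an illustration under assumptions: the column names "key" and "value" and the object names long and wide are hypothetical, and the code the addin produces may be laid out differently.

```r
library(tidyr)
library(dplyr)

# Gather first, so there is a key/value pair to spread back out.
long <- table1 %>%
  gather(key = "key", value = "value", cases, population)

# Spread the key column back into separate cases and population columns.
wide <- long %>%
  spread(key = key, value = value)

wide
```

The result has one row per country and year again, with cases and population as separate columns.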
You can also run the dplyrAssist() function with the name of a data set.
result <- dplyrAssist(iris)
A Shiny app appears. Select Group Data(1), select the group_by function(2), and click the Species column(3).
You can see the R code(1). You can edit this code. Press the Add R code button(2).
You can see the Data Wrangling R Code(1) and the result of the R code(2).
You can add as many lines of R code as you want. Select Summarise Data(1) and the summarise_all function(2). Insert mean to complete the R code(3). Press the Add R code button(4).
You can see the Data Wrangling R codes(1) and the result(2). If you want to save the resulting data, press the Save & exit button(3).
In the R console, you can see the result with the following code.
result
# A tibble: 3 x 5
     Species Sepal.Length Sepal.Width Petal.Length Petal.Width
      <fctr>        <dbl>       <dbl>        <dbl>       <dbl>
1     setosa        5.006       3.428        1.462       0.246
2 versicolor        5.936       2.770        4.260       1.326
3  virginica        6.588       2.974        5.552       2.026
cat(attr(result,"code"))
iris %>%
group_by(Species) %>%
summarise_all(mean)
This is equivalent to the following code.
result <- iris %>%
  group_by(Species) %>%
  summarise_all(mean)
attr(result,"code") <- "iris %>%\n group_by(Species) %>%\n summarise_all(mean)"
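Because the code is stored as plain text in the "code" attribute, it can be parsed and run again later. The following is a minimal sketch using base R's parse() and eval(); the object name rerun is hypothetical, and the result is rebuilt by hand here so the example runs without opening the addin.

```r
library(dplyr)

# Rebuild the object as the addin would return it.
result <- iris %>%
  group_by(Species) %>%
  summarise_all(mean)
attr(result, "code") <- "iris %>%\n group_by(Species) %>%\n summarise_all(mean)"

# Parse the stored text into an expression and evaluate it.
rerun <- eval(parse(text = attr(result, "code")))

rerun
```

Evaluating stored text runs whatever the string contains, so this is only appropriate for code you produced yourself.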
You can join data sets easily with dplyr.
result<-dplyrAssist(band_members,band_instruments)
When you turn on the Show the 2nd Data switch(1), you can see the name of the second data set(2). You can edit the name of the data. Press the Show data structure radio button(3) and you can see the 2nd data(4).
Select Combine Data Sets(1) and select the left_join function(2). You can see the R code(3) (of course, you can edit it!) and you can add the R code by pressing the Add R code button(4).
You can see the R code(1) for data wrangling and the result(2).
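The join built above corresponds roughly to the code below. This is a sketch rather than the addin's exact output: the explicit by = "name" argument is an assumption (name is the only column the two data sets share, so dplyr would pick it by default), and the object name joined is hypothetical.

```r
library(dplyr)

# Keep every row of band_members and attach the matching instrument, if any.
joined <- band_members %>%
  left_join(band_instruments, by = "name")

joined
```

Members with no match in band_instruments are kept, with NA in the plays column.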
In this example, you need the flights and airports data from the nycflights13 package.
require(nycflights13)
result <- dplyrAssist()
Enter flights as the data name(1). You can see the data by pressing the data structure button(2). After turning on the switch(3), enter airports as the second data name(4). You can also see the 2nd data by pressing the data structure button(5).
The origin and destination airports are recorded in the origin and dest columns of the flights data. To join the flights and airports data by the dest column in flights and the faa column in airports, select Combine data sets(1) and the left_join function(2). Select the dest column(3) and select faa in the y.b selectbox(4). The R code for the join is ready for you(5). You can add the R code by pressing the Add R code button(6).
You can see the R code for data combining(1) and the result(2).
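When the key columns have different names, as dest and faa do here, the join needs a named by argument. The following is a minimal sketch of the equivalent dplyr call; the object name joined is hypothetical, and the code the addin generates may be formatted differently.

```r
library(dplyr)
library(nycflights13)

# Match flights$dest against airports$faa, keeping every flight.
joined <- flights %>%
  left_join(airports, by = c("dest" = "faa"))

joined
```

The left-hand name in by refers to the first data set and the right-hand name to the second. Each flight gains the columns of its destination airport, such as name, lat, and lon.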