package dplyrAssist

Keon-Woong Moon

2017-09-03

The 'dplyrAssist' is an RStudio addin for teaching and learning data manipulation using the 'dplyr' package. You can learn each steps of data manipulation by clicking your mouse without coding. You can get resultant data(as a 'tibble') and the code for data manipulation. 

Install package

You can install dplyrAssist package from github.

#install.packages("devtools")
devtools::install_github("cardiomoon/dplyrAssist")

The First Example: Reshaping Data

require(tidyverse)
require(dplyrAssist)

You can run dplyrAssist() function without data.

dplyrAssist()

Or you can run as an RStudio addin.

Bt default, tidyr::table1 data is displayed. Press Show data structure radio button(1) and you can see the data(2).

You can reshape the data easily. Select “Reshaping Data”(1) and select “gather” function(2). You can see the plot explaining “gather”(3). Select case and population(4). Enter the key column name(5) and value column name(6). You can see the R code(7). Press Add R code button(8).

You can see the Data Wrangling R codes(1) and the result(2).

The reverse prcoess of gather is spread. Now select spread function(2). You can see the plot explaining “spread”(3). Select key and value columns(4). You can see the R code(5). Press Add R code button(6).

You can see the Data Wrangling R codes(1) and the result(2).

The 2nd example: Summarise Data

You can run dplyrAssist function with data name.

result <- dplyrAssist(iris)

A shiny app appeared. Select Group Data(1) and select group_by function(2) and click the Species column(3).

You can see the R code(1). You can edit this code. Press Add R code(2) button.

You can the Data Wrangling R Code(1) and the result of R code(2).

You can add R code(s) as much as you want. Select Summarise Data(1) and summarise_all function(2). Insert mean to complete the R code(3). Press Add R code button(4).

You can see the Data Wrangling R codes(1) and the result(2). If you want to save the resultant data, Press Save & exit button(3).

In R console, you can see the result with the following code.

result
# A tibble: 3 x 5
     Species Sepal.Length Sepal.Width Petal.Length Petal.Width
      <fctr>        <dbl>       <dbl>        <dbl>       <dbl>
1     setosa        5.006       3.428        1.462       0.246
2 versicolor        5.936       2.770        4.260       1.326
3  virginica        6.588       2.974        5.552       2.026
cat(attr(result,"code"))
iris %>%
 group_by(Species) %>%
 summarise_all(mean)

This is identical with the following codes.

result<-iris %>% 
     group_by(Species) %>%
     summarise_all(mean)
attr(result,"code") <- "iris %>%\n group_by(Species) %>%\n summarise_all(mean)"

The 3rd example : Combine the data sets

You can join datas easily with dplyr language.

result<-dplyrAssist(band_members,band_instruments)

When you turn on the Show the 2nd Data switch(1), you can see the name of second data(2). You can edit the name of data. Press Show data structure radio button(3) and you can see the 2nd data(4).

Select Combine Data Sets(1) and select left_join function(2). You can see the R code(3) (Of course, you can edit it!) and you can add the R code by presssing the Add R code button(4).

You can see the R code(1) for data wrangling and the result(2).

The 4th Example

In this example, you need the flights and airpors data from the package nycflights13.

require(nycflights13)

result <- dplyrAssist()

Enter Data name as flights(1). You can see the data by pressing the data structure button(2). After turn on the switch(3), enter the second data nae as airports(4). You can see the 2nd data also by pressing the data structure button(5).

Because the origin and destination airport name is recorded in origin and dest column in flights data. If you want to join the flights and airports data by dest column in flights data and faa column in airports data, select Combine data sets(1) and left_join function(2). Select dest column(3) and select faa in the y.b selectbox(4). The R code for join is ready for you(5). You can add the R code by pressing the Add Rcode button(6).

You can see the R code for data combining(1) and the result(2).