2021-03-12

Introduction

This document describes the preliminary steps of reflexion related to the definition of a Shiny application for the Johns Hopkins course “building data products” final project.

As requested in the instructions, the application will be kept simple.
Using one of the “sand box” dataset available in R, the interface will be focussed on being a companion to EDA (Exploratory Data Analysis)

  • For performance & interactivity, the dataset will need to be kept simple (ie. 100 lines max).
  • As a standard view, a plot will display the relationship between 2 variables (x vs y) available in the dataset. The user will be invited which variables shall be selected as “x” and which variable shall be selected as “y” thanks to the selectInput() object in the UI.
  • A second step, the user will be invited to change the color of the plot with a set of radio buttons radioButtons().

Dataset selection

I decided to use one the pre-loaded data available in R (I used the data() command to get the list of the available datasets).
I decided to select the “trees” dataset which shows the relationship between Diameter, Height and Volume for Black Cherry Trees.
The structure of the “trees” dataset is given below :

'data.frame':   31 obs. of  3 variables:
 $ Girth : num  8.3 8.6 8.8 10.5 10.7 10.8 11 11 11.1 11.2 ...
 $ Height: num  70 65 63 72 81 83 66 75 80 75 ...
 $ Volume: num  10.3 10.3 10.2 16.4 18.8 19.7 15.6 18.2 22.6 19.9 ...

Example of scatter plot

The Shiny Application will allow the user to perform some EDA of the “trees” dataset.
A first list will allow the user to pick the variable to be displayed on the “x” axis ; a second list will offer the user to pick the variable to be selected on the “y” axis ; Girth, Height or Volume.
One example of plot which could be obtained is showed below

Instructions

In order to keep the ergonomy of the application “simple”, the user instructions (explaining how the app works) will be displayed on a dedicated tab (instructions tab), located on the sidebar.
User actions will be placed in the “Main” Tab of the application.
The plot will be shown on the main panel of the “Main” Tab.

Issues & improvements

I spent some time figuring out why my server code was crashing but I finally found that I wasn’t using properly some variables defined in the UI (forgot to used “input$” as a prefix).

Finally, I made some improvements in the interface by allowing the user to set the scatter plot points size according to his own preference (offering the possibility to select a value from 1 to 5 thanks to a slider object).

Further improvements could be possible, ie. add a linear regression line to the plot & display the equation of the linear model.