1 Algoritma

Algoritma is a data science education center based in Jakarta. We organize workshops and training programs to help working professionals and students gain mastery in various data science sub-fields: data visualization, machine learning, data modeling, statistical inference, etc. Visit our website for all upcoming workshops.

2 Data Visualisation Capstone Project

After learning and exploring appropriate techniques for visualizing data, students are required to deploy an interactive dashboard web application on a Shiny server. The dashboard should contain plotting objects, such as ggplot and/or leaflet, that display useful insights.

2.1 First Objective

Before you build the dashboard, answer the following questions first. They will help you create a dashboard with useful insights.

2.1.1 What

What is the dashboard about?

This question is self-explanatory: you should know what the dashboard is about, what problem you are trying to solve with it, and what story you are trying to tell your audience.

2.1.2 Who

Who is the user of your dashboard?

Knowing the user of your dashboard is very important. Which division or what kind of people will use this dashboard? Do they need a detailed dashboard or a more practical one? Users at the operational level need a detailed dashboard, while users at the managerial level need a simple, general dashboard that conveys insights quickly.

2.1.3 Why

Why did you choose that data?

How well do you understand the data? Can the data answer your question? Why did you choose those variables, and are they really correlated? It is important to know why you chose the data so that you do not create misleading insights, which can be very dangerous.

2.1.4 When

When is the data collected?

Is it still relevant? For example, you cannot use data from the 1980s to describe current traffic conditions. Trends keep changing, and so does the answer to your question: can very old data still answer it? Irrelevant data can create misleading insights.

2.1.5 Where

Where will you put your plots, value boxes, inputs, etc.?

Make a simple layout design so that you have a picture of how your end product will look. Is it tidy enough? Is it easy enough for your user to understand? Always follow the 5-second rule: your dashboard should provide the relevant information in about 5 seconds.

2.1.6 How

How does your dashboard answer your question, hypothesis, or the problem you are trying to solve?

Are you using the right plot? The right variables? Always start from your problem, and make sure you use the right plot for the problem. For example, which plot would you use to see the distribution of your data: a density plot or a line plot?
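As a minimal sketch of matching the plot to the question, assuming the ggplot2 package and R's built-in mtcars dataset (neither is prescribed by this brief), a density plot can show the distribution of a single numeric variable:

```r
library(ggplot2)

# Question: how is fuel efficiency (mpg) distributed across cars?
# A density plot answers this; a line plot would suit a trend over time instead.
p_density <- ggplot(mtcars, aes(x = mpg)) +
  geom_density(fill = "steelblue", alpha = 0.5) +
  labs(
    title = "Distribution of Fuel Efficiency",
    x = "Miles per Gallon",
    y = "Density"
  )

p_density
```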

2.2 Rubrics

Students are free to use their own dataset or datasets from previous classes. Below are the rubrics for assessment and grading. Students earn the point(s) for each of the following:

2.2.1 Input (reactivity)

  • (2 points) Using min. 2 different input types

The dashboard should contain at least 2 different input types. For example, there is a slider input and select input on the dashboard. If the dashboard only has 2 select inputs or 2 similar inputs, it will be counted as one.

  • (2 points) Choosing appropriate input type

All input should have an appropriate user interface. For example, a date should not use a select input but instead use the proper date input or date range input.

  • (2 points) Demonstrating useful input(s)

The dashboard should have input widgets that give the user the ability to explore the data. Useful inputs include filtering data with a slider input or selecting different categories with a select input. Less useful inputs include changing the color of the plot.
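The input criteria above could be sketched as follows. This is only an illustration: the input IDs, labels, and choices are hypothetical, and the three widgets are standard functions from the shiny package.

```r
library(shiny)

# Hypothetical input panel: three different input types,
# each matched to the kind of value it controls.
input_panel <- tagList(
  # Numeric range -> slider input (useful for filtering rows)
  sliderInput("year", "Year", min = 1952, max = 2007,
              value = c(1952, 2007), step = 5, sep = ""),
  # Category -> select input
  selectInput("continent", "Continent",
              choices = c("Africa", "Americas", "Asia", "Europe", "Oceania")),
  # Dates -> date range input rather than a select input
  dateRangeInput("period", "Period",
                 start = "2020-01-01", end = "2020-12-31")
)
```

Two select inputs would still count as a single input type, so the sketch mixes a slider, a select input, and a date range input.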

2.2.2 Tab (paging)

  • (2 points) Using min. 3 pages

The dashboard should contain at least 3 different tabs/pages that convey different information.
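One way to get three pages, shown here as a minimal sketch with base shiny's navbarPage() (the page titles and placeholder contents are hypothetical; a sidebar menu from a dashboard package would work equally well):

```r
library(shiny)

# Three pages, each meant to convey different information.
ui <- navbarPage(
  title = "Capstone Dashboard",
  tabPanel("Overview", h3("Key figures and summary plots")),
  tabPanel("Map", h3("Geographic view")),
  tabPanel("Data", h3("Raw dataset"))
)
```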

2.2.3 Render plot

  • (1 point) Using interactive plots

All plots have to be presented as interactive plots.

  • (2 points) Using min. 2 plot types

The dashboard should contain at least 2 different plot types. We expect you can explore different visualizations to convey different information. For example, a dashboard contains 2 different plots: a bar chart and a line plot. The number of plots itself is not limited.

  • (2 points) Choosing the appropriate plot type

All information should be presented with appropriate plots. For example, if you want to show categorical ranking, you can use bar chart or lollipop chart. You can refer to data-to-viz for guidance.

  • (2 points) Demonstrating reactivity from the input

The plot should react to changes made through the inputs.

  • (2 points) Creating plots that tell a clear story

All plots should present clear information and be easy to understand. There should be at least a plot title and clear axis titles. You can refer to these notes regarding this problem.
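The plot criteria above (interactive, reactive to input, clearly titled) could be combined roughly as in the sketch below. It assumes the shiny, ggplot2, and plotly packages and uses the built-in mtcars dataset; the input ID `min_hp` and output ID `scatter` are hypothetical names.

```r
library(shiny)
library(ggplot2)
library(plotly)

ui <- fluidPage(
  sliderInput("min_hp", "Minimum horsepower", min = 50, max = 300, value = 100),
  plotlyOutput("scatter")
)

server <- function(input, output) {
  output$scatter <- renderPlotly({
    # Reading input$min_hp here makes the plot re-render when the slider moves
    cars <- mtcars[mtcars$hp >= input$min_hp, ]

    p <- ggplot(cars, aes(x = wt, y = mpg)) +
      geom_point() +
      labs(
        title = "Heavier Cars Tend to Be Less Fuel Efficient",
        x = "Weight (1,000 lbs)",
        y = "Miles per Gallon"
      )

    # ggplotly() converts the static ggplot into an interactive plot
    ggplotly(p)
  })
}

# Build the app object; calling runApp(app) would launch it.
app <- shinyApp(ui, server)
```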

2.2.4 Deploy

  • (5 points) Successfully deploying to shinyapps.io

The dashboard should be deployed to shinyapps.io and contain no error after being deployed.
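Deployment is often done from the RStudio publish button, but it can also be scripted. The sketch below assumes the rsconnect package and an existing shinyapps.io account; the wrapper name `deploy_dashboard` and all argument values are placeholders, not real credentials.

```r
# Sketch only: requires the rsconnect package and a shinyapps.io account.
deploy_dashboard <- function(app_dir, account, token, secret) {
  # Register the account credentials (copied from the shinyapps.io dashboard)
  rsconnect::setAccountInfo(name = account, token = token, secret = secret)
  # Upload the app directory (the folder containing app.R or ui.R/server.R)
  rsconnect::deployApp(appDir = app_dir)
}

# Example call (not run):
# deploy_dashboard("path/to/app", "<ACCOUNT>", "<TOKEN>", "<SECRET>")
```

After deploying, open the resulting URL and click through every tab to confirm that no errors appear.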

2.2.5 User Interface Appearance

  • (2 points) Have a tidy page layout

We don’t expect you to be a great UI designer. However, your dashboard page should be tidy and clean enough to view comfortably. Some considerations to help you create a tidy page include:

  1. Each tab/page should not be too long and should have a distinct topic
  2. Be consistent, e.g. all text is in English and color themes are consistent
  3. Each page should be filled with content across its full width, without leaving blank spaces.

[Figure: Less Tidy Page]

[Figure: Tidy Page]

  • (2 points) Have a tidy plot layout

Some considerations to help you create a tidy plot include:

  1. Have a clear plot title and axis titles
  2. Text is readable and contrasts with the background (if the background is dark, the text should be light, and vice versa)
  3. Numeric axis text should be formatted so that every 3 digits are separated by a comma (20,000 instead of 20000)
  4. Ranking is presented in a clear order (ascending or descending), for example when you present the top 10 product names or top 10 customers
  5. Have consistent themes for all plots
  6. No overlapping axis text; long text should not be rotated by 90 degrees. Some tips
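Several of these considerations can be sketched in one ggplot. The example assumes the ggplot2 and scales packages; the `sales` data frame is made up purely for illustration.

```r
library(ggplot2)

# Hypothetical sales data, used only for illustration
sales <- data.frame(
  product = c("Alpha", "Bravo", "Charlie", "Delta"),
  revenue = c(20000, 45500, 12750, 31000)
)

# Horizontal bars keep category labels readable without rotating them;
# reorder() sorts the bars by revenue so the ranking is clear.
p_rank <- ggplot(sales, aes(x = revenue, y = reorder(product, revenue))) +
  geom_col() +
  # Thousands separator on the numeric axis: 20,000 instead of 20000
  scale_x_continuous(labels = scales::comma) +
  labs(title = "Revenue by Product", x = "Revenue", y = "Product") +
  theme_minimal()

scales::comma(20000)
# "20,000"
```

Applying the same theme (here `theme_minimal()`) to every plot is one simple way to keep the dashboard visually consistent.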

  • (1 point) Have an appropriate plot tooltip

The plot should have a customized tooltip that has a clear and easy to read popup text.

Bad Tooltip

A bad tooltip typically uses the original column names, which are sometimes hard to read. The numbers are also left as raw values; for example, the GDP per Capita for Gabon is shown as 13206.4845. The number is easier to read when presented as 13,206.4845, with a comma separator for every 3 digits.

gdpPercap: 13206.4845
lifeExp: 56.735

Good Tooltip

Gabon (name of the country)
GDP per Capita: 13,206.48
Life Expectancy: 56.735
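A tooltip like the good example above can be built as a plain text column using base R. The sketch below uses a hypothetical one-row data frame with the values from the example; the `<br>` separator is used because plotly tooltips render HTML line breaks.

```r
# Hypothetical one-row data frame mirroring the example above
gabon <- data.frame(country = "Gabon", gdpPercap = 13206.4845, lifeExp = 56.735)

# Readable labels, comma separator, and two decimals for GDP per capita
gabon$tooltip <- paste0(
  gabon$country, "<br>",
  "GDP per Capita: ",
  formatC(gabon$gdpPercap, format = "f", digits = 2, big.mark = ","), "<br>",
  "Life Expectancy: ", gabon$lifeExp
)

gabon$tooltip
# "Gabon<br>GDP per Capita: 13,206.48<br>Life Expectancy: 56.735"
```

With ggplot2 and plotly, a common pattern is to map this column with `aes(text = tooltip)` and then call `ggplotly(p, tooltip = "text")` so that only the custom text is shown.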

  • (1 point) Choosing the right color scheme

Color must be carefully chosen to serve a purpose; it must be clear and must not distract the reader. One of the most common pitfalls is using color for bar charts when a large number of categories are present. You can read more about issues of color usage in this book’s chapter.

2.2.6 Exploration

  • (2 points) Exploration

The Shiny Dashboard taught in this class is just the foundation, and there are many other aspects that can be explored. Here are some of them:

  1. Theme Customization: Implementing various themes using the shinythemes package to change the overall look of the dashboard.
  2. Use of HTML and CSS: Adding HTML and CSS elements to beautify and customize the appearance of the dashboard as needed.
  3. Use of Additional Widgets: Integrating various widgets from the shinyWidgets package to enrich the functionality of the dashboard.
  4. Use of Additional Visualization Packages: Other than plotly, you can use other interactive visualization packages. For example, you can use highcharter, echarts4r, or other interactive plot packages in R.

There are many other things to explore beyond those mentioned above.
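As a rough sketch of the first three exploration ideas, assuming the shiny, shinythemes, and shinyWidgets packages are installed (the theme name, CSS rule, and input ID below are arbitrary choices for illustration):

```r
library(shiny)

ui_themed <- fluidPage(
  # 1. Theme customization with shinythemes
  theme = shinythemes::shinytheme("flatly"),
  # 2. Custom CSS injected into the page head
  tags$head(tags$style(HTML("h2 { color: #2c3e50; font-weight: 600; }"))),
  titlePanel("Capstone Dashboard"),
  # 3. An additional widget from shinyWidgets (multi-select picker)
  shinyWidgets::pickerInput(
    inputId = "region",
    label = "Region",
    choices = c("Africa", "Asia", "Europe"),
    multiple = TRUE
  )
)
```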

2.2.7 Total Score

If you meet all of these criteria, you will get a total of 30 points for your capstone project.

NOTES

The scoring of each detail of the rubric is binary, meaning that each detail will either get full points or zero.

3 Submission

You are expected to submit the link to your deployed Shiny dashboard to Google Classroom via private comment. We will evaluate your dashboard and give feedback based on the submitted link.

NOTES

If you use confidential data, you still need to deploy your dashboard and take a screenshot of the deployed dashboard. After that, you can take down the deployed dashboard on shinyapps.io and submit your capstone project as a zip file.

4 Reference

Here are some references that you can use to complete your project work:

4.2 Shiny dashboard exploration

Here are some references that you can use to explore more about Shiny dashboards:

4.3 Demo from Algoritma

Here are some demos of building a dashboard from Algoritma:

4.4 Good Example

Here are some good examples of Data Visualization capstone projects from Algoritma students:

4.5 Data Source

Here are some reference dataset sources that you can use to work on your project:

5 Tips

  • Set your upload size limit. By default, Shiny limits file uploads to 5MB per file. You can modify this limit by using the shiny.maxRequestSize option. For example, adding this to the top of app.R would increase the limit to 200MB.

options(shiny.maxRequestSize=200*1024^2)

  • Use RDS to save your files.

saveRDS()

readRDS()
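A minimal round trip with these two base R functions might look like this. The file name is hypothetical, and the built-in mtcars dataset stands in for your cleaned data.

```r
# Save a prepared data frame once, then load it quickly in app.R
rds_path <- file.path(tempdir(), "cars_clean.rds")  # hypothetical file name

saveRDS(mtcars, rds_path)
cars_clean <- readRDS(rds_path)

identical(cars_clean, mtcars)
# TRUE
```

Unlike a CSV file, an RDS file preserves column types and factor levels, so the app does not need to repeat the data cleaning on every start.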


6 Planning Your Project

You can start working on and planning your project after the briefing ends. We expect you to finish answering the first objective today so that you can focus on building the Shiny dashboard for the rest of the week.

6.1 Estimated Time

Below is our estimated time to finish answering the first objective (5W+1H questions) during the briefing day.

6.2 Example

Below is our example of answering the 5W+1H questions above. The final shiny dashboard for the following example can be seen here.

6.2.1 What (ETC: 10 Mins)

I want to show how every country around the world manages its natural resources, shown by the value of Ecological Footprint and Biocapacity of these countries.

A country runs an Ecological Deficit when its Ecological Footprint (how many natural resources are used) exceeds its Biocapacity (how many natural resources are owned). Such a country imports Biocapacity through trade, liquidates national ecological assets, or emits large amounts of carbon dioxide into the air. Conversely, a country has an Ecological Reserve when its Biocapacity exceeds its Ecological Footprint. Countries whose Ecological Footprint is bigger than their Biocapacity may therefore suffer from various ecological impacts, such as natural disasters, land damage, loss of biodiversity, and other effects that harm the environment and the country’s economy.

For this project, I specifically want to:

  • Increase awareness among people toward the ecological status of their country
  • Show the relation between the human development index and the ecological footprint of each country, to see whether countries with a higher human development index also have a higher ecological footprint
  • Provide the ecological status and other metrics that show the ecological activity of each country in the world

6.2.2 Who (ETC: 10 Mins)

This dashboard is created as an educational medium for the general public regarding the usage and preservation of countries’ natural resources.

6.2.3 Why and When (ETC: 40 Mins)

The dataset suitable for this project is the ecological data acquired from the Global Footprint Network (http://data.footprintnetwork.org/). The data was updated on August 12, 2019, so it is still relevant to current conditions.

6.2.4 How (ETC: 30 Mins)

Explain how to achieve each goal or purpose stated in the What question.

  • Increase awareness among people toward the ecological status of their country

I will create a plot that shows the Ecological Footprint and Biocapacity for each region.

  • Show the relation between the human development index and the ecological footprint of each country, to see whether countries with a higher human development index also have a higher ecological footprint

I will create a scatter plot of the human development index against the ecological footprint.

  • Provide the ecological status and other metrics that show the ecological activity of each country in the world

I will create a leaflet map with relevant information in the popup.

6.2.5 Where (ETC: 30 Mins)

Determine the sketch or design for the layout of the dashboard.

6.2.5.1 Overview

In this section, the plots and information for each tab/page are listed. The final shiny dashboard of the following example can be seen here.

Menu 1:

  • Bar chart of ecological footprint
  • Bar chart of biocapacity of each region
  • Scatter plot between Human Development Index (HDI) and the Ecological Footprint

Menu 2:

  • Leaflet map with a popup that shows relevant information about the economic and ecological status
  • Proportion of biocapacity to ecological footprint among countries in the same region

Menu 3:

  • Raw Dataset

6.2.5.2 Detailed Layout

  • Menu 1: Overview

The first row shows the bar charts of ecological footprint and biocapacity of each region.

The second row shows the scatter plot between Human Development Index (HDI) and the Ecological Footprint.

  • Menu 2:

The first row shows the leaflet map with a popup that shows relevant information about the economic and ecological status.

The second row shows the proportion of biocapacity to ecological footprint among countries in the same region.

The third row shows a bar chart of each natural resource of a country and an info box regarding its ecological status.

  • Menu 3:

The third tab shows the raw data.
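The detailed layout above could be translated into a UI skeleton along these lines. This is only a sketch, assuming the shinydashboard, plotly, and leaflet packages; the menu names for Menu 2 and Menu 3, the tab names, and every output ID are hypothetical, and the server logic is omitted.

```r
library(shiny)
library(shinydashboard)

ui_layout <- dashboardPage(
  dashboardHeader(title = "Ecological Footprint"),
  dashboardSidebar(
    sidebarMenu(
      menuItem("Overview", tabName = "overview"),
      menuItem("Map", tabName = "map"),
      menuItem("Data", tabName = "data")
    )
  ),
  dashboardBody(
    tabItems(
      # Menu 1: bar charts in the first row, scatter plot in the second
      tabItem(
        tabName = "overview",
        fluidRow(
          box(width = 6, plotly::plotlyOutput("footprint_bar")),
          box(width = 6, plotly::plotlyOutput("biocapacity_bar"))
        ),
        fluidRow(box(width = 12, plotly::plotlyOutput("hdi_scatter")))
      ),
      # Menu 2: map, then proportion plot, then resource chart with info box
      tabItem(
        tabName = "map",
        fluidRow(box(width = 12, leaflet::leafletOutput("status_map"))),
        fluidRow(box(width = 12, plotly::plotlyOutput("proportion_plot"))),
        fluidRow(
          box(width = 8, plotly::plotlyOutput("resource_bar")),
          infoBoxOutput("eco_status", width = 4)
        )
      ),
      # Menu 3: raw data
      tabItem(
        tabName = "data",
        fluidRow(box(width = 12, tableOutput("raw_data")))
      )
    )
  )
)
```

Each row fills the full 12-column width, which matches the tidy page guideline of not leaving blank space across the page.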

7 Instructions Regarding Plagiarism

Plagiarism is strictly prohibited in this course, and we place a strong emphasis on academic integrity. This includes the use of pre-made visualizations that have been presented in class or obtained from other sources without proper attribution.

All visualizations submitted for assignments and projects must be your original work. Reusing visualizations created by others, whether from class or external sources, without acknowledgment is considered plagiarism and is a violation of our academic policies.