---
title: "Stat-805 Summer-2023 - Final Project"
output:
  flexdashboard::flex_dashboard:
    theme: united
    storyboard: true
    social: menu
    source_code: embed
    css: "su_styles.css"
---
## STAT 805 - Presenting Class Data using R Markdown, R Shiny and HTML Widgets
<body>
<div class="col-2-3">
This project showcases class data using R Markdown and R packages including flexdashboard, R Shiny and HTML Widgets.
<b>R Markdown</b>
- R Markdown is a simple, easy-to-use plain-text format for combining your R code, the results of your data analysis (including plots and tables) and written commentary into a single, nicely formatted and reproducible document (such as a report, publication, thesis chapter or a web page like this one).
Two great free online books on R Markdown are <b><a href="https://bookdown.org/yihui/rmarkdown/" target="_blank">R Markdown - The Definitive Guide</a></b> and <b><a href="https://bookdown.org/yihui/rmarkdown-cookbook/" target="_blank">R Markdown Cookbook</a></b>.
<img src="cookbook.png" alt="Cookbook" class="Photoright" />
<img src="definitive.png" alt="Definitive Guide" class="Photoleft" />
<b><a href="https://pkgs.rstudio.com/flexdashboard/index.html" target="_blank">Flexdashboard</a></b>
- Use R Markdown to publish a group of related data visualizations as a dashboard.
- Support for a wide variety of components including htmlwidgets; base, lattice, and grid graphics; tabular data; gauges and value boxes; and text annotations.
- Flexible and easy-to-specify row and column-based layouts. Components are intelligently resized to fill the browser and adapted for display on mobile devices.
- <b><a href="https://pkgs.rstudio.com/flexdashboard/articles/layouts.html" target="_blank">Storyboard layouts</a></b> for presenting sequences of visualizations and related commentary.
- Optionally use Shiny to drive visualizations dynamically.
- A variety of <b><a href="https://pkgs.rstudio.com/flexdashboard/articles/using.html" target="_blank">themes</a></b> are available to modify the base appearance of flexdashboard. Available themes include default, cosmo, bootstrap, cerulean, journal, flatly, readable, spacelab, <b>united</b> (used for this dashboard), lumen, paper, sandstone, simplex and yeti.
<b><a href="https://www.w3schools.com/css/css_intro.asp">CSS</a></b> - CSS stands for Cascading Style Sheets and describes how HTML elements are to be displayed on screen, on paper, or in other media. CSS saves a lot of work: external stylesheets stored as CSS files can control the layout of multiple web pages at once. A tutorial on applying custom CSS to R Markdown storyboards is available <b><a href="https://bookdown.org/yihui/rmarkdown-cookbook/html-css.html" target="_blank">here</a></b>.
<b><a href="https://www.rstudio.com/products/shiny/#:~:text=Shiny%20is%20an%20open%20source,opens%20in%20a%20new%20tab" target="_blank">R Shiny</a></b> - Shiny is an open-source R package that provides an elegant and powerful web framework for building web applications using R. It helps you create interactive and dynamic dashboards similar to Tableau / Power BI, and turns your analyses into interactive web applications without requiring HTML, CSS, or JavaScript knowledge.
</div>
<div class="col-1-3">
<img src="professor1.jpg" alt="Female Professor" class="center" />
<h3>Instructor Information</h3>
<h4 style="text-align:center;">Shana L. Palla, EdD, PStat</h4>
<p style="text-align:center;"><b>Email</b>: spalla@kumc.edu</p>
<b>Course Description:</b> This web-based course addresses issues in professionalism, leadership and ethics that are specific to students training to become statisticians, biostatisticians, and data scientists.
```{r out.width='100%', echo=FALSE}
library(leaflet)
leaflet() %>% addTiles() %>%
setView(-94.60914860650841, 39.05602617791458, zoom = 17) %>%
addPopups(
-94.60914860650841, 39.05602617791458,
'Location of <b>KU Medical Center</b>'
)
```
</div>
```{r setup, include=FALSE}
library(flexdashboard)
```
<div class="row">
### University of Kansas Undergrad Ranking, Tuition and Enrollment Stats compared to other Universities using Plotly
```{r}
library(readr)
library(plotly)
# Load the 2017 university rankings data
hsb2 <- read_csv("rankings_2017.csv")
# Scatter plot of Rank against Tuition; marker colour encodes enrollment
plot_ly(data = hsb2, x = ~Tuition, y = ~Rank, text = ~Name,
        type = "scatter", mode = "markers",
        marker = list(size = 12, color = ~Enroll, colorscale = "Viridis", reversescale = TRUE,
                      colorbar = list(title = "Enrollment (in thousands)"))) %>%
  layout(title = "College Rank versus Tuition (Year = 2017)",
         xaxis = list(title = "Tuition"), yaxis = list(title = "Rank"))
```
***
<h2>Observations</h2>
<img src="corr1.png" alt="Correlation plot" class="corr"/>
- College rank and tuition are strongly correlated. However, correlation does not imply causation.
- University of Kansas - Rank = 118, Tuition = $25,932
- Median Rank = 111, Median Tuition = $31,608
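
The figures above can be checked numerically. The sketch below is a minimal, hypothetical helper (`summarise_rank_tuition` is a name introduced here, not part of the original dashboard). It assumes the rankings data has numeric `Rank` and `Tuition` columns, as the plot suggests. Spearman's rank correlation is an assumption on my part, chosen because rank is ordinal; the dashboard does not say which coefficient `corr1.png` shows.

```r
# Hypothetical helper: summarise the Rank/Tuition relationship.
# Assumes `df` has numeric columns `Rank` and `Tuition`.
summarise_rank_tuition <- function(df) {
  stopifnot(is.numeric(df$Rank), is.numeric(df$Tuition))
  ok <- complete.cases(df$Rank, df$Tuition)  # drop rows with missing values
  list(
    spearman       = cor(df$Rank[ok], df$Tuition[ok], method = "spearman"),
    median_rank    = median(df$Rank[ok]),
    median_tuition = median(df$Tuition[ok])
  )
}

# Usage: only runs when the CSV used by the chart is available
if (file.exists("rankings_2017.csv")) {
  print(summarise_rank_tuition(read.csv("rankings_2017.csv")))
}
```

A negative coefficient would mean that better-ranked schools (lower rank numbers) tend to charge higher tuition.
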
<h2>Related Links</h2>
<b><a href="https://cran.r-project.org/web/packages/corrplot/vignettes/corrplot-intro.html" target="_blank">An Introduction to corrplot Package</a></b>
<b><a href="https://data.world/education/university-rankings-2017" target="_blank">University Rankings 2017 - Dataset</a></b>
<b><a href="https://plotly.com/ggplot2/" target="_blank">Plotly ggplot2 Library</a></b>
### Gender Distribution of Class Using pie3D
```{r}
library(plotrix)
# Class counts: 16 male, 3 female
data <- c(16, 3)
# Percentage label for each slice
lab <- paste0(round(data / sum(data) * 100, 2), "%")
pie3D(data, main = "Gender Distribution of Students, Blue = Male, Pink = Female",
      radius = 0.9, explode = 0.3, col = c("lightblue", "pink"), labels = lab)
```
***
<h2>Observations</h2>
- Total No. of Students = 19
- M = 16, F = 3
- Women are underrepresented in STEM majors for a variety of reasons, so kudos to all the women in our class.
<h2>Related R Library</h2>
<b><a href="https://r-charts.com/part-whole/pie3d/" target="_blank">pie3D function with plotrix</a></b>
### Current Location of Students using Plotly Graphing Library
```{r}
library(tidyverse)
library(plotly)
set.seed(1)
# Number of students per state, in the same order as state.abb (alphabetical by state name)
density <- c(0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,7,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,1,0,0,0,1,1,0,0,0,0,2,0,0,0,0,0,0,0)
g <- list(
scope = 'usa',
projection = list(type = 'albers usa'),
lakecolor = toRGB('white')
)
plot_ly() %>%
layout(title = 'Current Location of Students',geo = g) %>%
add_trace(type = "choropleth", locationmode = 'USA-states',
locations = state.abb,
z = ~density, text = state.name,
color = ~density, autocolorscale = TRUE) %>%
add_trace(type = "scattergeo", locationmode = 'USA-states',
locations = state.abb, text = paste0(state.abb, "\n", density),
mode = "text",
textfont = list(color = rgb(0,0,0), size = 12))
```
***
<h2>Observations</h2>
- Total No. of Students = 19
- Midwest = 10, West Coast = 1, East Coast = 3
- 5 students didn't share their current location in the "Introduction" videos.
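
The 50-element `density` vector used for the map has to line up with `state.abb` position by position, which is easy to get wrong by hand. A minimal sketch of an alternative builds it from a named vector instead. It assumes the same per-state counts, read off the non-zero positions of the original vector. `counts` is a name introduced here for illustration.

```r
# Students per state, keyed by two-letter abbreviation.
# These are the non-zero entries of the original `density` vector.
counts <- c(CT = 1, KS = 7, MN = 1, NC = 1, OR = 1, PA = 1, TX = 2)

# Start from zero for every state in state.abb, then fill in the known counts
density <- setNames(rep(0, length(state.abb)), state.abb)
density[names(counts)] <- counts

# Sanity checks: every abbreviation is valid and the total matches
# the 14 students whose location is known
stopifnot(all(names(counts) %in% state.abb), sum(density) == 14)
```

If the names get in the way when plotting, `unname(density)` gives back a plain numeric vector in `state.abb` order.
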
<h2>Related Library</h2>
<b><a href="https://plotly.com/r/maps/" target="_blank">Plotly R Graphing Library - Maps</a></b>
### Student Hobbies and Interests using Text Mining and Word Cloud
```{r}
library("tm")
library("SnowballC")
library("wordcloud")
library("RColorBrewer")
# NOTE: absolute path on the author's machine; point this at your own copy of hobbies.txt
hobbies <- "C:/Users/sidd0/OneDrive/Desktop/Stat-823/hobbies.txt"
text <- readLines(hobbies)
# Build a corpus and convert the text to lower case
docs <- Corpus(VectorSource(text))
docs <- tm_map(docs, content_transformer(tolower))
# Remove common English stopwords
docs <- tm_map(docs, removeWords, stopwords("english"))
# Remove punctuation
docs <- tm_map(docs, removePunctuation)
# Eliminate extra white space
docs <- tm_map(docs, stripWhitespace)
# Term frequencies, sorted from most to least frequent
dtm <- TermDocumentMatrix(docs)
m <- as.matrix(dtm)
v <- sort(rowSums(m), decreasing = TRUE)
d <- data.frame(word = names(v), freq = v)
# Fix the seed so the word layout is reproducible
set.seed(1234)
wordcloud(words = d$word, freq = d$freq, min.freq = 1,
          max.words = 200, random.order = FALSE, rot.per = 0.35,
          colors = brewer.pal(8, "Dark2"))
```
***
<h2>Observations</h2>
- Animals / Pets, Reading and Travel seemed to be the most common hobbies / interests among students.
- Other interesting hobbies / interests include Photography, Swimming, Tuning Pianos, Tracking Data in Reading, Bowling, Chemistry, Watching TV, Electronics, Outdoor Activities and Spending Time with Family.
<h2>Related Topic(s)</h2>
<b><a href="http://www.sthda.com/english/wiki/text-mining-and-word-cloud-fundamentals-in-r-5-simple-steps-you-should-know" target="_blank">Text mining and Word Cloud Fundamentals in R</a></b>
### Undergrad Degree and Reason to Pursue Master's Degree using interactive Data Tables from HTML Widgets
```{r}
library(DT)
library(readr)
reason <- read_csv("reason.csv")
datatable(reason, options = list(pageLength = 8))
```
***
<h2>Observations</h2>
- The majority of undergrad degrees were in Life Sciences.
- Students gave a wide variety of reasons for pursuing this degree, including transitioning careers, getting promoted, and gaining additional skills.
<h2>Related Topic(s)</h2>
<b><a href="https://www.htmlwidgets.org/showcase_leaflet.html" target="_blank">htmlwidgets for R</a></b>
### Current Occupation of Students using Collapsible Trees
```{r}
library(collapsibleTree)
library(readr)
library(dplyr)
Stat_805 <- read_csv("Stat-805.csv")
Stat_805 %>%
group_by(Gender,CurrentOccupation,Occupation) %>%
collapsibleTreeSummary(
hierarchy = c("Gender","CurrentOccupation","Occupation"),
root = "Stat_805",
width = 800
)
```
***
<h2>Observations</h2>
- Wide range in current Occupations
<h2>Related Topic(s)</h2>
<b><a href="https://adeelk93.github.io/collapsibleTree/" target="_blank">Creating Collapsible Trees - Geography Example</a></b>
### Discussion Board Activity by Each Student Using Stacked Bar Charts Plotly
```{r}
library(plotly)
Students <- c("Emily","Jack","Ummer","Michael","Anthony","Alice","Oliver","David","Ethan","Sophia","Ben","Henry","Mason","Rob","James","Dan","Noah","John","Tom"
)
Discussion1 <- c(7,4,5,2,4,4,3,3,3,2,2,2,2,2,1,1,1,1,1)
Discussion2 <- c(2,3,2,2,2,3,3,2,2,2,2,2,2,2,2,2,2,2,2)
Discussion3 <- c(3,3,0,3,4,3,2,2,3,2,2,2,2,2,2,2,2,2,2)
Discussion4 <- c(4,4,4,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2)
Discussion5 <- c(3,3,3,3,2,2,2,2,2,3,2,2,2,2,2,2,2,2,2)
Discussion6 <- c(3,3,4,2,3,2,2,2,2,2,2,2,2,2,2,2,2,2,2)
Discussion7 <- c(2,2,4,4,2,3,2,2,1,2,2,2,2,2,2,2,2,2,2)
Discussion8 <- c(2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2)
Discussion9 <- c(3,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2)
Discussion10 <- c(2,2,2,3,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2)
data <- data.frame(Students ,Discussion1,Discussion2,Discussion3,Discussion4,Discussion5,Discussion6,Discussion7,Discussion8,Discussion9,Discussion10)
fig <- plot_ly(data, x = ~Students, y = ~Discussion1, type = 'bar', name = 'Discussion 1')
fig <- fig %>% add_trace(y = ~Discussion2, name = 'Discussion 2')
fig <- fig %>% add_trace(y = ~Discussion3, name = 'Discussion 3')
fig <- fig %>% add_trace(y = ~Discussion4, name = 'Discussion 4')
fig <- fig %>% add_trace(y = ~Discussion5, name = 'Discussion 5')
fig <- fig %>% add_trace(y = ~Discussion6, name = 'Discussion 6')
fig <- fig %>% add_trace(y = ~Discussion7, name = 'Discussion 7')
fig <- fig %>% add_trace(y = ~Discussion8, name = 'Discussion 8')
fig <- fig %>% add_trace(y = ~Discussion9, name = 'Discussion 9')
fig <- fig %>% add_trace(y = ~Discussion10, name = 'Discussion 10')
fig <- fig %>% layout(title = "Discussion Board Activity",yaxis = list(title = 'Count'), barmode = 'stack')
fig
```
***
<h2>Observations</h2>
- The top 3 most active students were Emily, Jack and Ummer.
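
The ranking above can be checked directly from the chart data. The sketch below is a minimal base-R version that reuses the same per-discussion counts, stored as a matrix rather than ten separate vectors. `posts` and `totals` are names introduced here for illustration.

```r
students <- c("Emily", "Jack", "Ummer", "Michael", "Anthony", "Alice", "Oliver",
              "David", "Ethan", "Sophia", "Ben", "Henry", "Mason", "Rob",
              "James", "Dan", "Noah", "John", "Tom")

# One column per discussion, one row per student (same values as the chart)
posts <- cbind(
  Discussion1  = c(7,4,5,2,4,4,3,3,3,2,2,2,2,2,1,1,1,1,1),
  Discussion2  = c(2,3,2,2,2,3,3,2,2,2,2,2,2,2,2,2,2,2,2),
  Discussion3  = c(3,3,0,3,4,3,2,2,3,2,2,2,2,2,2,2,2,2,2),
  Discussion4  = c(4,4,4,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2),
  Discussion5  = c(3,3,3,3,2,2,2,2,2,3,2,2,2,2,2,2,2,2,2),
  Discussion6  = c(3,3,4,2,3,2,2,2,2,2,2,2,2,2,2,2,2,2,2),
  Discussion7  = c(2,2,4,4,2,3,2,2,1,2,2,2,2,2,2,2,2,2,2),
  Discussion8  = c(2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2),
  Discussion9  = c(3,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2),
  Discussion10 = c(2,2,2,3,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2)
)
rownames(posts) <- students

# Total posts per student, most active first
totals <- sort(rowSums(posts), decreasing = TRUE)
head(totals, 3)

# Total posts per discussion
colSums(posts)
```

The column totals can be compared with the `Total_Discussion` values used in the grouped bar chart later in the storyboard.
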
<h2>Related Topic(s)</h2>
<b><a href="https://plotly.com/r/bar-charts" target="_blank">Creating Stacked Bar Charts with Plotly in R</a></b>
### Discussion Board Engagement - No. of Replies to Each Post Using Stacked Bar Charts Plotly
```{r}
library(plotly)
Students <- c("Jack", "Michael","Sophia","Noah","Ummer","Anthony","Ben","Emily","Oliver","Henry","Alice","Mason","Rob","Ethan","Dan","David","Tom","James","John")
Discussion1 <- c(4,2,0,2,3,1,0,3,0,1,4,2,4,1,2,1,0,1,0)
Discussion2 <- c(2,0,6,1,2,2,1,2,2,0,1,0,1,0,0,1,1,0,0)
Discussion3 <- c(4,3,1,2,0,5,1,1,0,2,1,1,1,1,0,1,1,0,0)
Discussion4 <- c(4,3,4,1,2,2,2,1,1,1,1,1,1,1,0,0,0,0,0)
Discussion5 <- c(2,3,6,1,0,1,3,1,1,2,1,1,0,2,0,0,0,0,0)
Discussion6 <- c(3,1,2,1,4,1,1,3,1,2,0,1,1,0,1,1,0,1,0)
Discussion7 <- c(3,6,1,2,5,1,1,0,1,1,0,0,0,1,0,0,0,1,0)
Discussion8 <- c(2,0,1,5,0,1,1,1,2,0,1,0,1,0,2,0,1,1,0)
Discussion9 <- c(2,4,1,1,1,1,2,0,1,1,0,3,0,1,0,0,2,0,0)
Discussion10 <- c(1,2,2,4,3,0,2,0,1,0,0,0,0,0,1,2,1,0,0)
data <- data.frame(Students , Discussion1,Discussion2,Discussion3,Discussion4,Discussion5,Discussion6,Discussion7,Discussion8,Discussion9,Discussion10)
fig <- plot_ly(data, x = ~Students, y = ~Discussion1, type = 'bar', name = 'Discussion 1')
fig <- fig %>% add_trace(y = ~Discussion2, name = 'Discussion 2')
fig <- fig %>% add_trace(y = ~Discussion3, name = 'Discussion 3')
fig <- fig %>% add_trace(y = ~Discussion4, name = 'Discussion 4')
fig <- fig %>% add_trace(y = ~Discussion5, name = 'Discussion 5')
fig <- fig %>% add_trace(y = ~Discussion6, name = 'Discussion 6')
fig <- fig %>% add_trace(y = ~Discussion7, name = 'Discussion 7')
fig <- fig %>% add_trace(y = ~Discussion8, name = 'Discussion 8')
fig <- fig %>% add_trace(y = ~Discussion9, name = 'Discussion 9')
fig <- fig %>% add_trace(y = ~Discussion10, name = 'Discussion 10')
fig <- fig %>% layout(title = "Discussion Board Engagement",yaxis = list(title = 'Count'), barmode = 'stack')
fig
```
***
<h2>Observations</h2>
- The top 3 students with the most engaging posts were Jack, Sophia and Michael.
- Early posters had a higher chance of getting replies.
<h2>Related Topic(s)</h2>
<b><a href="https://plotly.com/r/bar-charts" target="_blank">Creating Stacked Bar Charts with Plotly in R</a></b>
### Total Number of Discussions and No. of Replies Using Grouped Bar Charts Plotly
```{r}
library(plotly)
# Use a factor so the x-axis follows discussion order; plain character labels
# are typically sorted alphabetically, which would put "Disc 10" before "Disc 2"
Discussion <- factor(paste("Disc", 1:10), levels = paste("Disc", 1:10))
Total_Discussion <- c(50,41,43,44,43,43,42,38,39,39)
Total_Engagement <- c(31,22,25,25,24,24,23,19,20,19)
data <- data.frame(Discussion, Total_Discussion, Total_Engagement)
fig <- plot_ly(data, x = ~Discussion, y = ~Total_Discussion, type = 'bar', name = 'Total Discussion Posts')
fig <- fig %>% add_trace(y = ~Total_Engagement, name = 'Total No. of Replies')
fig <- fig %>% layout(title = "Total Number of Discussions and No. of Replies (Engagement)", yaxis = list(title = 'Count'), barmode = 'group')
fig
```
***
<h2>Observations</h2>
- Activity and engagement fell as the semester progressed.
- Discussions 8, 9, and 10 had the least activity and engagement.
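
One rough way to put a number on the downward trend is a straight-line fit through the totals shown in the chart. This is a sketch under that assumption, not the method used to write the observations: with only ten points the slope is a coarse summary. The variable names are introduced here for illustration.

```r
discussion    <- 1:10
total_posts   <- c(50, 41, 43, 44, 43, 43, 42, 38, 39, 39)
total_replies <- c(31, 22, 25, 25, 24, 24, 23, 19, 20, 19)

# Least-squares slope: average change in count per additional discussion
post_slope  <- unname(coef(lm(total_posts ~ discussion))["discussion"])
reply_slope <- unname(coef(lm(total_replies ~ discussion))["discussion"])

# Discussions with the fewest posts and the fewest replies
fewest_posts   <- discussion[total_posts == min(total_posts)]
fewest_replies <- discussion[total_replies == min(total_replies)]

c(post_slope = post_slope, reply_slope = reply_slope)
```

Negative slopes for both series are consistent with activity and engagement falling over the semester.
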
<h2>Related Topic(s)</h2>
<b><a href="https://plotly.com/r/bar-charts" target="_blank">Creating Grouped Bar Charts with Plotly in R</a></b>
### One of my hobbies - Exploring Film Database with R-Shiny
```{r,out.width="100%"}
knitr::include_app("https://gallery.shinyapps.io/051-movie-explorer/", height="100%")
```
***
<h2>Resources</h2>
- <a href="https://www.shinyapps.io/" target="_blank">Deploying Shiny applications on the Web</a>
- <a href="https://datasciencegenie.com/how-to-embed-a-shiny-app-on-website/" target="_blank">How to Embed A Shiny App on Website</a>
- <a href="https://us7923.shinyapps.io/disease/" target="_blank">Example of R Shiny Dashboard I created</a>
</body>