Summary

Only clinical sponsors that have completed at least 10 clinical trials are included in the analysis. The Category column is derived by a wildcard search on the sponsor name to determine whether a sponsor is Academic or Hospital, so some sponsors may be assigned to the wrong category. See the Glossary tab for column descriptions.
Source data is publicly available at: https://aact.ctti-clinicaltrials.org
A detailed explanation of the analytics is available at https://businessintelligencedw.blogspot.com/
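
The wildcard-based Category assignment described above can be sketched roughly as follows. The name patterns, the "Other" fallback, and the function name `classify_sponsor` are assumptions made for illustration; the rules actually used to build the extract are not shown in this document.

```r
# Hypothetical sketch of a wildcard (pattern) search on sponsor names.
# The patterns and the "Other" fallback are illustrative assumptions only.
classify_sponsor <- function(name) {
  ifelse(grepl("univ|college|school", name, ignore.case = TRUE), "Academic",
    ifelse(grepl("hospital|clinic|medical center", name, ignore.case = TRUE),
           "Hospital", "Other"))
}

example_names <- c("Boston University", "Mayo Clinic",
                   "University Hospital Basel", "Pfizer")
data.frame(SponsorName = example_names,
           Category = classify_sponsor(example_names))
```

In this sketch a name such as "University Hospital Basel" matches the academic pattern first and is labelled Academic, which is the kind of incorrect assignment the note above warns about.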

Total Sponsors: 2234
Non-Industry Sponsors: 1587
Industry Sponsors: 647
Registered by Ind: 57845
Completed by Ind: 44137
Results by Industry: 17498

Summary charts: Sponsors by Industry Type (1=Industry, 0=Non-Industry); Sponsors by Category

KPI - 1 charts: Industry Vs NonIndustry (Completed Vs Results); Academic Vs Hospital (Completed Vs Results); Collaboration by Industry; Global Collab by Industry

KPI - 2 charts: Collab by Sponsor Category; Collaboration by Sponsor Category2; Facilities

Other pages: Plots3, Data

---
title: "Sponsor Dashboard"
output: 
  flexdashboard::flex_dashboard:
    orientation: rows
    vertical_layout: fill
    social: ["twitter", "facebook", "menu"]
    source_code: embed
---

```{r setup, include=FALSE}
library(flexdashboard)
library(dplyr)
library(sqldf)
library(ggplot2)
library(ggdendro)
library(plotly)
library(DT)

#set paths for data files
in_path_sponsors<-"C:/Users/ranamanohar/Documents/MSR_Docs/BigData/Data/ACCT/Extracts/AggSponsor1.txt"

#reads the data files into dataframes with filter criteria
#sponsors<-read.csv.sql(in_path_sponsors, "Select * from file where #Completed>=10", sep = "\t")

sponsors2<-read.csv(in_path_sponsors,sep = "\t", fileEncoding = "UTF-8-BOM")

sponsors<-sponsors2 %>%
  filter(Completed>=10)

sponsors$IndFlag<- as.factor(sponsors$IndFlag)
sponsors$total_collab<- sponsors$IndCol+sponsors$NonIndCol

sponsors_ind<-sponsors %>%
  filter(IndFlag==1)

sponsors_nonind<-sponsors %>%
  filter(IndFlag==0)

sponsors_acad<-sponsors %>%
  filter(Category=="Academic")

sponsors_hosp<-sponsors %>%
  filter(Category=="Hospital")

agg_cnt<-sponsors %>%
  filter(IndFlag==1) %>%
  summarise(cnt_registered_ind=sum(Reg),
            cnt_completed_ind=sum(Completed),
            cnt_results_ind=sum(Results),
            avg_countries_ind=mean(Countries)
            )

```

Summary
=================================

Row {data-height=170}
---------------------------------
Only clinical sponsors that have completed at least 10 clinical trials are included in the analysis. The Category column is derived by a wildcard search on the sponsor name to determine whether a sponsor is Academic or Hospital, so some sponsors may be assigned to the wrong category. See the Glossary tab for column descriptions.

Source data is publicly available at: https://aact.ctti-clinicaltrials.org
A detailed explanation of the analytics is available at https://businessintelligencedw.blogspot.com/

Row {data-height=170}
---------------------------------

### Total Sponsors

```{r}
valueBox(nrow(sponsors), icon = "fa-capsules")
```

### Non-Industry Sponsors

```{r}
valueBox(sum(sponsors$IndFlag==0), icon="fa-hospital-alt")
```

### Industry Sponsors

```{r}
valueBox(sum(sponsors$IndFlag==1), icon="fa-industry", color="#FF8731")
```

### Registered by Ind

```{r}
valueBox(sum(agg_cnt$cnt_registered_ind), icon="fa-industry", color="#FF8731")
```

### Completed by Ind

```{r}
valueBox(sum(agg_cnt$cnt_completed_ind), icon="fa-industry", color="#FF8731")
```

### Results by Industry

```{r}
valueBox(sum(agg_cnt$cnt_results_ind), icon="fa-industry", color="#FF8731")
```

Row
---------------------------

### Sponsors by Industry Type (1=Industry, 0=Non-Industry)

```{r message=FALSE, warning=FALSE}
#pie chart of sponsor counts by industry flag
plot_1_1<-sponsors %>%
  group_by(IndFlag) %>%
  summarise(cnt=n()) %>%
  plot_ly(values=~cnt, labels=~factor(IndFlag), type='pie',
          marker=list(colors=c('#1F77B4', '#FF8731')))
plot_1_1
```

### Sponsors by Category

```{r message=FALSE, warning=FALSE}
#bar chart of sponsor counts by category
plot_1_2<- sponsors %>%
  group_by(Category) %>%
  summarise(cnt=n()) %>%
  plot_ly(x=~Category, y=~cnt, type='bar', text=~cnt, textposition = 'auto')
plot_1_2
```

KPI - 1
==================================

Row
---------------------------------

### Industry Vs NonIndustry (Completed Vs Results)

```{r message=FALSE, warning=FALSE, paged.print=FALSE}
#linear fits of results posted against studies completed
fitlm_ind <- lm(Results ~ Completed, data = sponsors_ind)
fitlm_nonind <- lm(Results ~ Completed, data = sponsors_nonind)
#summary(fitlm_nonind)

ind_r_square<-round((summary(fitlm_ind)$r.squared),2)
ind_coeff<-round((summary(fitlm_ind)$coefficients[2,1]),2)
nonind_r_square<-round((summary(fitlm_nonind)$r.squared),2)
nonind_coeff<-round((summary(fitlm_nonind)$coefficients[2,1]),2)

plot31<-plot_ly() %>%
  add_trace(data=sponsors_ind, x=~Completed, y=~Results, mode="markers",
            hoverinfo = 'text',
            text = ~paste('Sponsor Name: ', SponsorName,
                          '<br>Studies Completed: ', Completed,
                          '<br>Results Posted: ', Results,
                          '<br>Ratio: ', Ratio),
            marker=list(color="#FF7549"), name="Ind") %>%
  add_trace(data=sponsors_nonind, x=~Completed, y=~Results, mode="markers",
            hoverinfo = 'text',
            text = ~paste('Sponsor Name: ', SponsorName,
                          '<br>Studies Completed: ', Completed,
                          '<br>Results Posted: ', Results,
                          '<br>Ratio: ', Ratio),
            marker=list(color="#36BE90"), name="Non Ind") %>%
  add_trace(data = sponsors_ind, x = ~Completed, y = fitted(fitlm_ind),
            mode = "lines", name="Ind",
            line=list(color="#FF7549", width=3),
            hoverinfo = 'text',
            text = ~paste('Slope Coefficient: ', ind_coeff,
                          '<br>R-Square: ', ind_r_square)) %>%
  add_trace(data = sponsors_nonind, x = ~Completed, y = fitted(fitlm_nonind),
            mode = "lines", name="Non Ind",
            line=list(color="#36BE90", width=3),
            hoverinfo = 'text',
            text = ~paste('Slope Coefficient: ', nonind_coeff,
                          '<br>R-Square: ', nonind_r_square)) %>%
  layout(
    xaxis= list(title="Completed Studies", showgrid=TRUE),
    yaxis= list(title="Results Posted", showgrid=TRUE)) %>%
  add_annotations(xref="paper", yref="paper", x=1.15, y=1.05,
                  text="Sponsor Industry Flag", showarrow=FALSE)
plot31
```

### Academic Vs Hospital (Completed Vs Results)

```{r message=FALSE, warning=FALSE}
fitlm_acad <- lm(Results ~ Completed, data = sponsors_acad)
fitlm_hosp <- lm(Results ~ Completed, data = sponsors_hosp)
#summary(fitlm_nonind)

#take the statistics from the academic and hospital fits
#(the original read them from fitlm_ind and fitlm_nonind)
acad_r_square<-round((summary(fitlm_acad)$r.squared),2)
acad_coeff<-round((summary(fitlm_acad)$coefficients[2,1]),2)
hosp_r_square<-round((summary(fitlm_hosp)$r.squared),2)
hosp_coeff<-round((summary(fitlm_hosp)$coefficients[2,1]),2)

plot4<-plot_ly() %>%
  add_trace(data=sponsors_acad, x=~Completed, y=~Results, mode="markers",
            hoverinfo = 'text',
            text = ~paste('Sponsor Name: ', SponsorName,
                          '<br>Studies Completed: ', Completed,
                          '<br>Results Posted: ', Results,
                          '<br>Ratio: ', Ratio),
            marker=list(color="#4F4FFF"), name="Academic") %>%
  add_trace(data=sponsors_hosp, x=~Completed, y=~Results, mode="markers",
            hoverinfo = 'text',
            text = ~paste('Sponsor Name: ', SponsorName,
                          '<br>Studies Completed: ', Completed,
                          '<br>Results Posted: ', Results,
                          '<br>Ratio: ', Ratio),
            marker=list(color="#B39614"), name="Hospital") %>%
  add_trace(data = sponsors_acad, x = ~Completed, y = fitted(fitlm_acad),
            mode = "lines", name="Academic",
            line=list(color="#4F4FFF", width=3),
            hoverinfo = 'text',
            text = ~paste('Slope Coefficient: ', acad_coeff,
                          '<br>R-Square: ', acad_r_square)) %>%
  add_trace(data = sponsors_hosp, x = ~Completed, y = fitted(fitlm_hosp),
            mode = "lines", name="Hospital",
            line=list(color="#B39614", width=3),
            hoverinfo = 'text',
            text = ~paste('Slope Coefficient: ', hosp_coeff,
                          '<br>R-Square: ', hosp_r_square)) %>%
  layout(
    xaxis= list(title="Completed Studies", showgrid=TRUE),
    yaxis= list(title="Results Posted", showgrid=TRUE)) %>%
  add_annotations(xref="paper", yref="paper", x=1.15, y=1.05,
                  text="Sponsor Category", showarrow=FALSE)
plot4
```

Row
-----------------------------------

### Collaboration by Industry

```{r message=FALSE, warning=FALSE}
plot1_3<-plot_ly() %>%
  add_trace(data=sponsors_ind, x=~IndCol, y=~NonIndCol, mode="markers",
            hoverinfo = 'text',
            text = ~paste('Sponsor Name: ', SponsorName,
                          '<br>Industry Collaborators: ', IndCol,
                          '<br>Non-Industry Collaborators: ', NonIndCol),
            marker=list(color="#FF7549"), name="Ind") %>%
  add_trace(data=sponsors_nonind, x=~IndCol, y=~NonIndCol, mode="markers",
            hoverinfo = 'text',
            text = ~paste('Sponsor Name: ', SponsorName,
                          '<br>Industry Collaborators: ', IndCol,
                          '<br>Non-Industry Collaborators: ', NonIndCol),
            marker=list(color="#36BE90"), name="Non Ind") %>%
  layout(
    xaxis= list(title="Industry Collab", showgrid=TRUE),
    yaxis= list(title="Non-Industry Collab", showgrid=TRUE)) %>%
  add_annotations(xref="paper", yref="paper", x=1.2, y=1.05,
                  text="Sponsor", showarrow=FALSE)
plot1_3
```

### Global Collab by Industry

```{r message=FALSE, warning=FALSE}
plot2<-plot_ly(data=sponsors, x=~total_collab, y=~Countries,
               type = 'scatter', color=~IndFlag, colors="Set2",
               text=~SponsorName) %>%
  layout(
    xaxis= list(title="Collaborators", showgrid=TRUE),
    yaxis= list(title="Countries", showgrid=TRUE)
  ) %>%
  add_annotations(
    xref="paper", yref="paper", x=1.15, y=1.05,
    text="Sponsor Industry Flag", showarrow=FALSE
  )
plot2
```

KPI - 2
===========================

Row
-----------------------------

### Collab by Sponsor Category

```{r message=FALSE, warning=FALSE}
plot3<-plot_ly(data=sponsors, x=~IndCol, y=~NonIndCol,
               type = 'scatter', color=~Category, colors = "Set1",
               text=~SponsorName) %>%
  layout(
    xaxis= list(title="Industry Collab", showgrid=TRUE),
    yaxis= list(title="Non-Industry Collab", showgrid=TRUE)
  ) %>%
  add_annotations(
    xref="paper", yref="paper", x=1.15, y=1.05,
    text="Sponsor Category", showarrow=FALSE
  )
plot3
```

### Collaboration by Sponsor Category2

```{r}
plot3<-plot_ly(data=sponsors, x=~total_collab, y=~Countries,
               type = 'scatter', color=~Category, colors = "Set1",
               text=~SponsorName)
plot3
```

### Facilities

```{r}
plot3<-plot_ly(data=sponsors, x=~Countries, y=~Facilities,
               type = 'scatter', color=~Category, colors = "Set1",
               text=~SponsorName)
plot3
```

Plots3
============================================

```{r}

```

Data
============================================

```{r}
#DataTables page-size option is spelled pageLength
datatable(sponsors, caption="Sponsor Performance Metrics", rownames = TRUE,
          filter="top", options = list(pageLength=25))
```

Glossary
============================================
List of column names and descriptions:
Table: Sponsors
1. SponsorName - The full name of the sponsor who sponsored the clinical trial.
2. IndFlag - The flag to indicate if the sponsor belongs to Industry (1) or Non-Industry (0).
3. Category - The category of the sponsor (Academic or Hospital), derived by a wildcard search on the sponsor name.
4. Reg - The number of studies registered by the sponsor.
5. Completed - The number of the sponsor's studies with status marked as 'completed'.
6. Results - The number of completed studies for which the study results were posted.
7. Ratio - The ratio of Results to Completed, expressed as a percentage.
8. Countries - The number of countries in which the sponsor conducted the studies.
9. IndCol - The number of industry collaborators who collaborated with the sponsor.
10. NonIndCol - The number of non-industry collaborators who collaborated with the sponsor.
11. MedianDur - The median study duration (from study start to completion) in years.
12. Conditions - The number of medical conditions for which the studies were conducted by the sponsor.
13. Facilities - The number of clinical study facilities (aka sites) at which the studies were conducted.
14. Recruiting - The number of clinical studies for which the sponsor is currently recruiting.
15. RegToEnrolDays - The median number of days from registering a study to the start of subject enrollment, per sponsor.
16. MedEnrollment - The median number of patients (subjects) enrolled by the sponsor across all its studies.
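
As a rough illustration of how the derived columns relate to the counts, the sketch below recomputes Ratio and the dashboard's total_collab on a small made-up table, then applies the Completed >= 10 inclusion filter. The three sponsors and their numbers are invented for the example, and the scaling by 100 with one-decimal rounding is an assumption; the extract's exact Ratio formula is not shown in this document.

```r
# Toy data, invented for illustration; not taken from the AACT extract.
toy <- data.frame(
  SponsorName = c("Sponsor A", "Sponsor B", "Sponsor C"),
  Completed   = c(40, 12, 8),
  Results     = c(10, 3, 4),
  IndCol      = c(5, 0, 2),
  NonIndCol   = c(7, 3, 1)
)

# Ratio as a percentage of completed studies with posted results
# (assumed scaling and rounding).
toy$Ratio <- round(100 * toy$Results / toy$Completed, 1)

# Total collaborators, as computed in the setup chunk.
toy$total_collab <- toy$IndCol + toy$NonIndCol

# Keep only sponsors with at least 10 completed studies.
kept <- toy[toy$Completed >= 10, ]
kept
```

Sponsor C has the highest Ratio in the toy table but is dropped by the filter, since it has fewer than 10 completed studies.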