library(knitr)
library(readr)
library(dplyr)
## 
## Attaching package: 'dplyr'
## The following objects are masked from 'package:stats':
## 
##     filter, lag
## The following objects are masked from 'package:base':
## 
##     intersect, setdiff, setequal, union
library(tidyr)
library(broom)
library(ggplot2)
library(DataExplorer)

R Markdown

This is an R Markdown document. Markdown is a simple formatting syntax for authoring HTML, PDF, and MS Word documents. For more details on using R Markdown see http://rmarkdown.rstudio.com.

When you click the Knit button, a document is generated that includes both the content and the output of any embedded R code chunks. You can embed an R code chunk like this:

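# Load the subcortical volume table
All.subcort.Vol <- read.csv("/projects/neda/FINAL.ANALYSIS.T1/QCd_All_data/subCortVol.All.csv")
# Treat education category and sex as categorical variables
All.subcort.Vol$EduCateg <- as.factor(All.subcort.Vol$EduCateg)
All.subcort.Vol$Sex <- as.factor(All.subcort.Vol$Sex)
# Make sure intracranial volume (ICV) is stored as a numeric column
All.subcort.Vol$ICV <- as.numeric(All.subcort.Vol$ICV)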

Missing values

##   rows columns discrete_columns continuous_columns all_missing_columns
## 1  331      24                6                 18                   0
##   total_missing_values complete_rows total_observations memory_usage
## 1                   94           237               7944        92624
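The summary above has the column layout that DataExplorer's introduce() returns (331 rows x 24 columns = 7944 observations, 94 of them missing). The chunk that produced it is hidden, so the following is only a minimal sketch of how such a summary could be obtained. It uses the built-in airquality data as a stand-in; for this report the data frame would be All.subcort.Vol.

```r
library(DataExplorer)

# Stand-in data (assumption): built-in airquality, which contains NAs.
dat <- airquality

# One-row overview: rows, columns, missing values, complete rows, ...
intro <- introduce(dat)
print(intro)

# Missing counts and percentages per column, followed by the matching plot.
miss <- profile_missing(dat)
print(miss)
plot_missing(dat)
```

Comparing complete_rows with rows shows how many subjects would be lost by a later na.omit() call.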

To visualize frequency distributions for all discrete features

## 1 columns ignored with more than 50 categories.
## ID: 331 categories
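The message above is what plot_bar() prints when a discrete column has more categories than its maxcat limit (50 by default). ID is unique per row, so it is skipped. A small sketch of that behaviour, assuming the hidden chunk called plot_bar(); the data and the ID column built here are stand-ins, not the study data.

```r
library(DataExplorer)

# Stand-in data (assumption): iris plus a hypothetical per-row identifier.
dat <- iris
dat$ID <- paste0("subj_", seq_len(nrow(dat)))

# Bar charts for all discrete columns. Columns with more than `maxcat`
# categories are ignored, so ID is dropped and only Species is plotted.
plot_bar(dat, maxcat = 50L)
```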

To visualize distributions for all continuous features
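The figure for this step is not preserved here. In DataExplorer, distributions of continuous columns are usually drawn with plot_histogram() or plot_density(). Which of the two the report used is an assumption, as is the stand-in data below.

```r
library(DataExplorer)

# Stand-in data (assumption): built-in airquality, all columns numeric.
dat <- airquality

# One histogram panel per continuous column.
plot_histogram(dat, ncol = 3L)

# Kernel density estimates of the same columns, as an alternative view.
plot_density(dat, ncol = 3L)
```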

QQ plot
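A quantile-quantile plot compares each continuous column against a normal distribution; points that bend away from the reference line indicate skew or heavy tails. A minimal sketch with plot_qq(), assuming that is the function behind the hidden chunk, on stand-in data:

```r
library(DataExplorer)

# Stand-in data (assumption): the four numeric columns of iris.
dat <- iris[, c("Sepal.Length", "Sepal.Width", "Petal.Length", "Petal.Width")]

# One QQ panel per continuous column, arranged two per row.
plot_qq(dat, ncol = 2L)
```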

QQ plot by Dx group
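Splitting the QQ plots by group shows whether departures from normality are specific to one diagnostic group. plot_qq() accepts a by argument naming a discrete column. The sketch below uses iris and Species as stand-ins; for this report the analogous call would presumably be plot_qq(All.subcort.Vol, by = "Dx").

```r
library(DataExplorer)

# Stand-in data (assumption): iris, with Species playing the role of Dx.
dat <- iris

# QQ panels for every continuous column, one colour per group.
plot_qq(dat, by = "Species", ncol = 2L)
```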

To visualize correlation heatmap for all non-missing features

## 3 features with more than 5 categories ignored!
## ID: 331 categories
## Dx: 7 categories
## EduCateg: 7 categories
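The message above matches plot_correlation() called with maxcat = 5L: discrete columns are one-hot encoded before correlations are computed, and any column with more than five categories (here ID, Dx and EduCateg) is left out. A sketch of the same behaviour on stand-in data, with rows containing missing values removed first:

```r
library(DataExplorer)

# Stand-in data (assumption): airquality without incomplete rows.
dat <- na.omit(airquality)
dat$Month <- factor(dat$Month)  # 5 levels: kept when maxcat = 5
dat$Day <- factor(dat$Day)      # more than 5 levels: ignored

# Heatmap over continuous columns plus dummy-coded Month levels.
plot_correlation(dat, maxcat = 5L)
```

Raising maxcat would bring the seven-level columns back in, at the cost of a much larger heatmap.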

To visualize correlation heatmap for only continuous features
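Restricting the heatmap to continuous columns avoids the dummy-coded categories altogether. With plot_correlation() this is done through type = "c". The sketch below assumes that call and uses stand-in data; the base R cor() line shows the matrix that the heatmap visualizes.

```r
library(DataExplorer)

# Stand-in data (assumption): airquality without incomplete rows.
dat <- na.omit(airquality)

# Heatmap restricted to continuous columns.
plot_correlation(dat, type = "c")

# The underlying Pearson correlation matrix, computed with base R.
cor_mat <- cor(dat)
print(round(cor_mat, 2))
```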

To perform and visualize PCA on some selected features - variance explained by PC
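The report does not show which features were selected for the PCA, so the columns below are placeholders. The sketch assumes plot_prcomp() was used; variance_cap controls how much cumulative variance the plotted components must cover. The base R lines compute the same "variance explained" quantities directly.

```r
library(DataExplorer)

# Stand-in feature selection (assumption): the numeric columns of iris.
pca_df <- na.omit(iris[, 1:4])

# Variance explained per component, plus loadings for the components
# needed to reach 90% cumulative variance.
plot_prcomp(pca_df, variance_cap = 0.9, nrow = 2L, ncol = 2L)

# Same quantities with base R: PCA on scaled columns.
pca <- prcomp(pca_df, scale. = TRUE)
var_explained <- pca$sdev^2 / sum(pca$sdev^2)
print(round(cumsum(var_explained), 3))
```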

Boxplots data visualization
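Boxplots of every continuous column against one grouping variable can be drawn with plot_boxplot(). The grouping variable used in the report is not shown; Dx would be a natural candidate, but that is an assumption. Sketch on stand-in data:

```r
library(DataExplorer)

# Stand-in data (assumption): iris, grouped by Species.
dat <- iris

# One boxplot panel per continuous column, split by the grouping column.
plot_boxplot(dat, by = "Species", ncol = 2L)
```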

Scatterplots data visualization
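plot_scatterplot() plots every other column against a single reference column named in by. Which reference column the report used is not shown (ICV is one possibility, but that is a guess), so the choice below is purely illustrative.

```r
library(DataExplorer)

# Stand-in data (assumption): iris, with Petal.Length as the reference axis.
dat <- iris

# One scatterplot panel per remaining column against the `by` column.
plot_scatterplot(dat, by = "Petal.Length", ncol = 2L)
```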

#create_report(All.subcort.Vol)

Note that the echo = FALSE parameter was added to the plotting code chunks to prevent printing of the R code that generated the plots.