Harold Nelson
March 29, 2016
There are some things you should always do, but they won’t find all problems.
The more questions you ask, the more problems you’ll find.
For the data frame as a whole, run head(), tail(), str(), and summary().
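A minimal sketch of those four calls. The slides do not name a dataset, so the built-in mtcars data frame is used here as a stand-in (an assumption for illustration only).

```r
# mtcars ships with R; it stands in for your own data frame (assumption).
df <- mtcars

head(df)     # first six rows: do the values look plausible?
tail(df)     # last six rows: any trailing junk or totals rows?
str(df)      # column types: is anything numeric stored as character?
summary(df)  # per-column summaries; NA counts are shown when present
```

Reading the four outputs side by side is usually enough to spot mistyped columns and obviously impossible values.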
For numeric variables:

- hist(x)
- summary(x)
- boxplot(x)
- plot(density(x))
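The same checks, run on one concrete variable. Using mtcars$mpg as x is an assumption for illustration; substitute any numeric column from your own data.

```r
# x is a stand-in numeric variable (assumption): miles per gallon from mtcars.
x <- mtcars$mpg

hist(x)           # overall shape, gaps, and isolated bars
summary(x)        # min, quartiles, mean, max
boxplot(x)        # points beyond the whiskers are candidate outliers
plot(density(x))  # smoothed view of the same distribution
```

Each view can expose a different problem: a spike at a placeholder value such as 0 or 999 tends to show up in the histogram, while a single extreme value is easier to see in the boxplot.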
For qualitative variables.
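The slide does not list specific functions for this case. One common approach, offered here as a sketch and not as the author's list, is to tabulate the levels and look for misspellings, unexpected categories, and missing values. Treating mtcars$cyl as a qualitative variable is an assumption for illustration.

```r
# cyl is stored as a number in mtcars; converting it to a factor
# is an assumption made only to have a qualitative variable to inspect.
g <- factor(mtcars$cyl)

levels(g)                            # are these the categories you expect?
counts <- table(g, useNA = "ifany")  # frequency of each level, plus NA if any
counts
prop.table(counts)                   # the same counts as proportions
barplot(counts)                      # quick visual check for rare levels
```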
Relationships
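The slide gives only the heading, so the calls below are one possible set of checks and not the author's own. The variable pairs from mtcars are assumptions chosen for illustration.

```r
df <- mtcars  # stand-in data frame (assumption)

# Numeric vs numeric: points far from the main cloud deserve a second look.
plot(df$wt, df$mpg)
r <- cor(df$wt, df$mpg)
r

# Numeric vs qualitative: compare the distribution across groups.
boxplot(mpg ~ factor(cyl), data = df)

# Qualitative vs qualitative: empty or near-empty cells can signal coding errors.
tab <- table(cyl = df$cyl, gear = df$gear)
tab
```

A value that looks fine on its own can still be impossible in combination with another variable, which is why these pairwise views are worth the extra step.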
You need to examine the data you have left after taking care of the bad data.
Do you still have a random sample of your population?
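One way to probe that question is to compare the rows you kept with the rows you dropped. The sketch below uses the built-in airquality data and treats any row with a missing value as "bad"; both choices are assumptions for illustration, since the slides do not define bad data.

```r
aq <- airquality              # built-in data with missing values (assumption)
kept <- complete.cases(aq)    # TRUE for rows with no NA in any column

table(kept)                   # how many rows survive the cleaning?

# Were rows dropped evenly across months, or mostly from a few?
by_month <- table(Month = aq$Month, kept = kept)
by_month
prop.table(by_month, margin = 1)  # share kept within each month

# Does a fully observed variable look different after cleaning?
summary(aq$Temp)
summary(aq$Temp[kept])
```

If the dropped rows cluster in particular groups, the remaining data may no longer represent the population the sample was drawn from.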