Last updated on November 19, 2021
Practice, practice, practice! and… be veeeery patient…
It will be frustrating at first, as it is for everyone, and Google is your best friend during this process.
You will need R (https://www.r-project.org). I would also recommend installing RStudio (https://www.rstudio.com). It is a very nice user interface for working with R and will make your life much easier. Both are free.
Introduction to R by Venables and Smith. An older version of this document was titled “Introduction to R in 90 minutes”. It will take you longer than that to go through it, but once you do, it will give you a good foundation and understanding of how to program in R, even if you have no prior programming experience. I highly recommend it!
R style guide by Hadley Wickham - a short section that will help you write cleaner code. It is a part of Hadley Wickham’s book Advanced R. Don’t be intimidated by the title: once you get through Introduction to R, you are ready to start learning about programming from Hadley. I would recommend starting with the Foundations sections of the Advanced R book, from Data structures to Functions.
Common pitfalls and mistakes, an excellent chapter from the Advanced R course by Florian Privé. It will save you vast amounts of time you would otherwise spend on debugging.
Anything written by Hadley Wickham is great, as is this book: R for data science by Grolemund and Wickham. By now you’ve probably gathered that I am a huge fan of Hadley Wickham. Here is Hadley’s Wikipedia page, and here is his personal page with more great resources on learning R and a link to the page of his sister, Charlotte Wickham. She also teaches R and data analysis in an accessible way. They both make everything they produce freely available online.
The next step would be to familiarize yourself with the tidyverse. It represents a paradigm shift in R programming and data analysis that is taking the biostatistics/bioinformatics/machine learning/data science world by storm. Read more about it here. My favourite part of the tidyverse is ggplot2 for plots and visualizations. Examples and reference material are here. Note that your data will need to be in a tidy format (one row per observation, one column per variable) for use by ggplot2.
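To make the tidy data requirement a little more concrete, here is a minimal sketch. It assumes the tidyr and ggplot2 packages are installed, and the data frame, its column names (patient, visit1, visit2), and the values are all made up for illustration. It reshapes a "wide" table into tidy format and then plots it; treat it as one possible way to do this, not the only one.

```r
# Assumes tidyr and ggplot2 are installed, e.g. via install.packages("tidyverse")
library(tidyr)
library(ggplot2)

# Hypothetical "wide" data: one row per patient, one column per visit
wide <- data.frame(
  patient = c("A", "B", "C"),
  visit1  = c(5.1, 6.3, 4.8),
  visit2  = c(5.6, 6.0, 5.2)
)

# Tidy ("long") format: one row per patient-visit observation,
# with the visit label and the measured score in their own columns
long <- pivot_longer(
  wide,
  cols      = c(visit1, visit2),
  names_to  = "visit",
  values_to = "score"
)

# Each plot aesthetic now maps to exactly one column of the tidy data
p <- ggplot(long, aes(x = visit, y = score, group = patient, colour = patient)) +
  geom_line() +
  geom_point()

print(p)
```

In the wide table there is no single column to hand to the y axis, which is why ggplot2 works much more naturally once the scores are gathered into one column.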
Statistical Inference via Data Science: a moderndive into R and the tidyverse by Ismay and Kim deals with performing statistical analyses in R, with lots of examples. It is also a great reference for learning R markdown and bookdown, as the source code for the book is freely available in the “about this book” section. Another great reference for learning R markdown is Writing documents with R Markdown and, for more details, see R Markdown: The Definitive Guide by Xie, Allaire, and Grolemund.
There are several organizations providing online R programming courses, and their number is growing. Most pair a short video with an interactive exercise. The introductory courses are usually free, but most others are not, and prices vary. The best ones I have seen so far are DataCamp and Coursera.
Good luck and drop in during my Wednesday drop-in hours if you have questions!
Before performing analyses, a dataset needs to be cleaned and formatted in a certain way. Here are some excellent references that will help you with this process: