For this course, you will need to install some free and safe software. This document will walk you through installing the tools we will be using in this course.
First, we will install a programming language called R. This is the tool we will be using throughout the course to analyze data and make plots.
R is a programming language that will help you do useful things with data. R code looks like the example below. It may look like nonsense right now, but it will make sense very soon! Try reading through the code to get a sense of what it might be doing (I’ll put the answer at the very end of this document).
population %>%
  group_by(country) %>%
  summarise(avg_population = mean(population))
To install R, go to this link: https://cran.r-project.org/ (you may need to copy and paste the link into the URL bar of your browser).
Even if you already have R installed on your computer, you should install the newest version (as of today, that is 4.2.1) if possible, since that is what we will use for this course.
However, please make sure the version you install is compatible with your operating system. We recommend updating your operating system to one that supports R 4.2.1 if possible. For example, on a Mac, R 4.2.1 requires macOS High Sierra (10.13) or higher. If you would prefer not to update your operating system, you will need to install a “previous release” of R by following the links on the pages below.
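If you already have R installed and are not sure which version it is, one quick way to check is to open R and type the lines below. This is only a small sketch: the 4.2.1 in the comparison is simply the version named above, so change it if a newer release is current when you read this.

```r
# Show the installed version as text (a line starting with "R version")
R.version.string

# TRUE if the installed version is at least the one used in this course
is_current <- getRversion() >= "4.2.1"
is_current
```

If the second result is FALSE, installing the newest version as described in this document is the safest choice.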
On this website, go to the section in the red box below and click on the type of your computer (probably either Windows or Mac).
Choose Mac or PC depending on what kind of computer you have.
If you have a Mac, click on the first .pkg link (in the red box below). This will download a file. Open it and follow the instructions to install it like you would any other program.
If you have a Mac, this screen will look like this.
If you have a Windows computer, first click on base and then on the download link at the top of the page, which names the current version, such as Download R 4.2.1 for Windows (in the red boxes below).
If you have a Windows PC, this screen will look like this.
Once that is done, we need to install a program called RStudio. This is an application that will allow you to interact with the programming language R. RStudio is useful because it lets you see your code, data, plots, and everything else all in one place.
RStudio will help put your data, code, and plots all in one place.
To install RStudio, go to this link: https://rstudio.com/products/rstudio/ (you may need to copy and paste the link into the URL bar of your browser).
Once there, follow the picture below to:
Click on the downloaded file and follow the instructions to install RStudio.
Follow this process to download RStudio.
Congratulations! You’ve installed all of the necessary materials for this course.
When you open the RStudio program for the first time, it will look something like this:
RStudio: Your new best friend for the next several weeks!
There are different sections of RStudio, as described below and shown in the picture:
Let’s try to evaluate some code. In the > part of the console, try typing in 2 + 7 like this and press Enter (or return depending on your computer):
Awesome! If that code evaluated to 9, your installation is working.
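For reference, here is roughly what that first test looks like written out as code. The lines starting with # are comments, which R ignores; they are only there to explain what is happening.

```r
# Type this after the > prompt in the console and press Enter
2 + 7
# The console prints: [1] 9
```

The [1] in front of the 9 is R's way of labelling the first (and here, only) value in the result.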
Often, you will want to save your work for later. Code typed in the console is not saved, so it will be gone the next time you open RStudio. If you want to keep your work, you need to store it in a file. There are files that will save only code, but it is convenient to save your writing, code, and outputs like plots and figures all in one place.
This is what RMarkdown is for. An RMarkdown file is a document, much like one you would make in Microsoft Word or Google Docs. To save your work, you make a document and save it.
Let’s make an RMarkdown file by going to File --> New File --> R Markdown like in the picture below:
A window like this should appear - please give it any title you like, make sure HTML is selected, and press OK.
Now go to File --> Save (or press CMD + S on Mac / Ctrl + S on PC), give your file a name (something like day1.Rmd is fine!), and save it. It would be a good idea to create a new folder on your computer for this course and save it there.
There is some setup code there that you don’t need. Go ahead and delete everything after the ---. Then, click on the green Insert button, select R, and you will see a chunk. You can add code here in the dark gray area.
Write something simple like 2 + 2:
To run your code, click on the Knit button with the blue yarn. You should see a new window appear showing your finished document.
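As an aside, the Knit button is a shortcut: behind the scenes RStudio calls the render() function from the rmarkdown package. The sketch below shows one way to do the same thing from the console. It assumes your file is called day1.Rmd (the example name suggested above) and sits in R's current working folder; the checks around the call are just a precaution added for this example.

```r
# File name suggested earlier in this document; change it to match your file.
rmd_file <- "day1.Rmd"

# Only try to knit if the rmarkdown package is installed and the file exists.
if (requireNamespace("rmarkdown", quietly = TRUE) && file.exists(rmd_file)) {
  # Produces an HTML file next to the .Rmd file
  rmarkdown::render(rmd_file)
} else {
  message("Could not find rmarkdown or ", rmd_file, "; use the Knit button instead.")
}
```

You do not need this for the course. The Knit button is the easier route, and this is only here to show that buttons in RStudio are running R code for you.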
Congratulations! If you’ve gotten to this point, you are all set with R and RStudio. We are going to learn a lot together!
Secret answer: let’s read through this code together. population is the name of some dataset. It looks like it also has information on country and population. So what could this be doing?
population %>%
  group_by(country) %>%
  summarise(avg_population = mean(population))
Well, it looks like we are “grouping” each country in some way (group_by(country)). What are we doing with it? Well, we’re summarising it in some way. avg_population and mean(population) look like we’re taking an average of population. But don’t forget the groups! Put it all together, and what is this code doing? Taking the average population within each country in the dataset.
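If you would like to see this in action, here is a small sketch you can run once the dplyr package is installed (for example with install.packages("dplyr")). The dataset is made up for illustration: the country names and numbers are invented and are not real population figures.

```r
library(dplyr)

# Made-up data for illustration only: two countries, two years each
population <- data.frame(
  country    = c("A", "A", "B", "B"),
  population = c(10, 20, 100, 300)
)

# Average population within each country
result <- population %>%
  group_by(country) %>%
  summarise(avg_population = mean(population))

result
```

The result has one row per country: country A averages 15 (the mean of 10 and 20) and country B averages 200 (the mean of 100 and 300).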