1 Background

Using RStudio makes learning and running R for statistical analysis much easier than using R's own native interface. RStudio adds a lot of extra support and help files, which make using R a little less painful. You will still have to accept that R is a programming language and not a menu-driven package like Excel or SPSS. Data and results exist as objects “inside R”, and you need scripts to bring data into R and to analyse it. You cannot use a menu-based tool.

The advantage of this approach is that R code is reproducible: your analysis can easily be shared and repeated by anyone else. They only need access to the original data and the script that you used for the analysis.

R also has a massive library of functions covering almost any analysis that you could want to carry out. As it is open source, people are always adding their own specialised functions to the code library, so if you need access to the latest methods it is likely that they are available in R.

2 A First Look at RStudio

When you first open RStudio you see a screen divided into four separate panels, as shown in figure 1.

Figure 1: The RStudio Window

The first time that you run RStudio the panels will be blank. Once you have used it before, or if you start it by opening an R script or R Markdown file, it will open with some content.

In this case the R script file that was opened is called basic_plots.R. Nothing has been loaded into R yet, so the environment is empty. The console would normally be empty as well, but here it shows a warning message about an installed package called Pathview, which you can ignore.

Finally, the Files panel shows the contents of the home folder, which is where RStudio starts and saves its files. I will explain how to change this later.

The four panels are:

  1. The Script panel
  2. The Environment panel (can also be the History panel)
  3. The Console
  4. The Files panel (which can also be the Plots, Packages and Help panels)

2.1 The Script Panel

This is where you will create and edit new scripts. You can also use this panel to execute existing scripts or parts of scripts, and it is where you do your editing and debugging. RStudio allows you to create a wide variety of different script types, which are listed in the File > New File menu (figure 2).

Figure 2: The New File menu

In the example there is a simple R script called basic_plots.R (figure 3). Your real code should look nothing like this. It should have a better structure, with a header describing what it does, when it was created and who wrote it, and it should be well commented. This script has none of that, but it was a short file that I had to hand and it gives some output.

Figure 3: The Script Panel
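
As a rough illustration of the structure described above, a script might start with a header and commented sections like the ones below. This is only a sketch: the file name, author, date and the numbers are placeholders, not the contents of basic_plots.R.

```r
# ------------------------------------------------------------
# Script:  bone_summary.R   (hypothetical file name)
# Purpose: Summarise a set of bone length measurements
# Author:  Your Name        (placeholder)
# Created: YYYY-MM-DD       (placeholder)
# ------------------------------------------------------------

# ---- Data ----
# Made-up values standing in for a real data file
bone_length <- c(41.2, 39.8, 43.5, 40.1, 42.7)

# ---- Analysis ----
# Mean and standard deviation of the measurements
bone_mean <- mean(bone_length)
bone_sd <- sd(bone_length)

# ---- Output ----
print(bone_mean)
print(bone_sd)
```

Splitting the script into data, analysis and output sections makes it easier for someone else to find the part they need.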

2.2 The Environment Panel

This panel tells you about the data and objects that have been loaded into the R environment. These are all of the things that can have functions applied to them within R, as well as the things generated by that analysis. This includes variables and vectors that are assigned in the code as well as data loaded from files.

Figure 4: The Environment Panel

In this case there are two objects called x and y, each containing a single variable. They were created by the following lines of code in the example file:

x <- read_sav("bones.sav")

This reads the data in the SPSS format file bones.sav and creates an R object called x containing that data. The read_sav() function comes from the haven package, so that package has to be loaded with library(haven) before this line will run. In this case the data is a single variable, a column of 40 numbers, each of which is a bone length.

y <- read_sav("anaesthetic.sav")

This reads the data in the SPSS format file anaesthetic.sav and creates an R object called y inside R containing that data. In this case there are 35 numbers recording the effectiveness of the anaesthetic.
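
What the Environment panel displays can also be checked from code. The sketch below uses small made-up data frames in place of the real .sav files, so the values and column names are assumptions for illustration only.

```r
# Made-up stand-ins for the objects loaded from the .sav files
x <- data.frame(length = c(41.2, 39.8, 43.5, 40.1))
y <- data.frame(effect = c(3, 5, 4))

# List the objects currently in the environment
print(ls())

# Show the structure of one object: its type, size and first values
str(x)

# Number of rows (observations) and columns (variables)
print(dim(x))
```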

If you click the History tab of this panel you can view the History instead of the objects inside the R environment. This allows you to look at all of the commands that have been executed in R.

Figure 5: The History Panel

In this case I have not done much except set the working directory to the same directory as the script that I was using (more about this later) and then run a few lines of the script.

2.3 The Console

This is where R commands and scripts are actually executed, so that you can see the text output that they produce. It is also where you can type R commands interactively. This is a good way of testing a new command before you include it as a line of code in your script: you are likely to make some errors in formatting the command, and you need to check that it works the way that you expected.

Figure 6a: The Console

In this case I was missing the files from the working directory when I first tried to run the code and so I had to find the files and move them to where R was looking for them.

Figure 6b: The Console

Warnings and errors are written in red. The commands that were executed are in blue, and any output is in black.
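
The missing-file problem described above can be checked from the console before running a script. This is a minimal sketch using base R functions and the two file names from the example script; it only reports what is missing and does not move any files.

```r
# Where is R currently looking for files?
print(getwd())

# The data files that the example script tries to read
needed <- c("bones.sav", "anaesthetic.sav")

# TRUE for each file that is present in the working directory
found <- file.exists(needed)
print(data.frame(file = needed, found = found))

# Report any file that still has to be moved into place
if (!all(found)) {
  message("Missing: ", paste(needed[!found], collapse = ", "))
}
```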

2.4 The File/Plots/Packages/Help Panel

This shows you the files in the directory that R started in. If you change the working directory, the Files list does not follow automatically: you need to use the More functions (figure 7) to move the Files list to the current working directory, and you will then see the list of files that are there.

Figure 7: The More Functions

Figure 8: The Files Panel

Most of the time the Files panel is only useful for checking the correct name of the data files that you want to open. This panel is more often used on the Plots tab, which shows you the output of any lines of code that produce graphical output. In the following example a boxplot has been created using the simple example code and the anaesthetic data stored in y.

Figure 9: The Plots Panel

Another example is the histogram of bone lengths.

Figure 10: The Histogram of Bone Lengths
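
Plots like these can be produced with the base R functions boxplot() and hist(). The exact plotting lines of the example script are not shown here, so the sketch below is an assumption about how such plots might be drawn, and it uses simulated numbers in place of the real bone and anaesthetic data.

```r
# Made-up numbers standing in for the bone and anaesthetic data
set.seed(1)
bone_length <- rnorm(40, mean = 40, sd = 3)
anaesthetic <- rnorm(35, mean = 5, sd = 1)

# Boxplot of the anaesthetic values
boxplot(anaesthetic, main = "Anaesthetic effectiveness")

# Histogram of the bone lengths
hist(bone_length, main = "Bone lengths", xlab = "Length")
```

When these lines are run in RStudio, each plot appears on the Plots tab.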

If you select Export from the menu of the Plots tab you can save the plot either as a PDF file or as an image. Images can be saved in formats including png, jpg and tif.
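
A plot can also be saved from code instead of through the Export menu. The sketch below writes a PDF with the base R pdf() device; the file name and the numbers are placeholders. The functions png(), jpeg() and tiff() open image devices in the same way.

```r
# Hypothetical output file, written to R's temporary folder
out_file <- file.path(tempdir(), "bone_hist.pdf")

# Open a PDF graphics device, draw the plot, then close the device
pdf(out_file, width = 7, height = 5)
hist(c(38, 40, 41, 41, 43, 45), main = "Bone lengths", xlab = "Length")
invisible(dev.off())

# Confirm that the file was written
print(file.exists(out_file))
```

Saving plots from code keeps the whole analysis reproducible, since the figure is regenerated every time the script runs.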

Plots is the main function of this panel, but there are two other tabs that you can use. These are Packages, which shows the additional packages and libraries installed in R, and Help, which gives you a manual of all the commands available in R.

Figure 11: The Packages Panel

Packages are libraries that contain additional functions that you can add to the basic functions of R. There are a lot of them and they make coding easier as they extend the basic functions to make many complex analyses simple. They also extend the graphical functions and allow you to use bioinformatics tools.

You install packages from the Tools menu.

Figure 12: The Tools Menu
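
Packages can also be installed from code with install.packages() and loaded with library(). The sketch below is one possible way to check what is missing first; missing_packages() is a hypothetical helper written for this example, not part of R, and the install and load lines are commented out because they need an internet connection and an installed package.

```r
# Return the names of packages that are not yet installed
missing_packages <- function(pkgs) {
  installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
  pkgs[!installed]
}

# "haven" provides read_sav(); "stats" ships with R itself
to_install <- missing_packages(c("haven", "stats"))
print(to_install)

# Install once per machine (needs an internet connection):
# install.packages(to_install)

# Load a package at the top of every script that uses it:
# library(haven)
```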

If you do not know how to use a command, or which command to use, then you can use the help function. The manual pages are written for programmers and can be hard to follow, so the best section to look at first is the set of code examples at the end of the manual page.

Figure 13: The Help Panel
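
The help pages can also be reached from the console. The commands below are standard R functions; boxplot, hist and mean are used only as example topics.

```r
# Open the manual page for a function (two equivalent forms)
?boxplot
help("hist")

# Run the code examples from the end of a manual page
example("mean")
```

In RStudio the first two commands show the manual page on the Help tab, while example() runs the examples in the console so that you can see their output.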

3 Useful Tips

3.1 Setting the Working Directory

In the Session menu you can set the working directory. This is the directory where R will look for the files that you ask it to read in and where it will save the output of the analysis.

Figure 14: Setting the Working Directory
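
The working directory can also be set from code with setwd() and checked with getwd(). In this sketch tempdir() stands in for a real project folder so that the lines run on any machine; in your own script you would put the path to your project there instead.

```r
# Remember where R is currently working
old_dir <- getwd()

# Hypothetical project folder; tempdir() is used so the sketch runs anywhere
project_dir <- tempdir()

# Change the working directory and confirm the change
setwd(project_dir)
print(getwd())

# Files are now read from and written to this folder
writeLines("example output", "results.txt")
print(file.exists(file.path(project_dir, "results.txt")))

# Go back to the original directory
setwd(old_dir)
```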

3.2 Opening a File Which Requires Libraries to be Installed

Sometimes you will be using someone else’s code, or you will be using R on a new machine. When you open an R script in these cases you may get a warning about libraries that need to be installed on that machine for the code to work.

(Usually you only need to install the libraries once on each machine.)

It will look like the warning that you can see in figure 15.

Figure 15: Automatic Library Installation

Just click on the Install link in the yellow warning bar and the libraries that you need will be installed automatically. This is another one of RStudio’s useful features.