Starting with the Very Basics

R is a program that was built for statisticians, by statisticians. But it is so flexible that people have been able to use it for many, many other uses. In fact, R is one of the most powerful analytic and visualization tools available to you, and it is free. The only catch is that the learning curve can be a little steep at first.

So, our task here is to give you a chance to play around in R a little and get used to the language, syntax, and other quirks of using this program.

The biggest takeaway from this should be that R is a scripting language. If you have used R Commander, or a similar graphical user interface (GUI) in R, then you were using something that essentially ran R commands for you as you selected the various options that you wanted to perform. The scripts were running, either in the background, or as a part of the GUI - as in the case of R Commander, which uses the top window to display all the code that is being run in R. Once you step away from using the point-and-click GUIs, you will begin to realize the potential of scripting your analysis.

The caveat is, of course, that you will also realize the annoyance and frustration of dealing with scripts. Scripts can be a little finicky. So let’s start with a few things that you should know upfront:

  1. Case matters!
  • For example, R does not read what is on the screen - or in its memory - the way a human would. So, R understands x and X to be two entirely different objects, rather than seeing them as different cases of the same letter.
  2. There are packages to help you.
  • R is very flexible and can be made to do many things. But few people would want to tell it what to do from scratch. That is why a lot of what R does has been automated in pre-written scripts that you can use to carry out common tasks. Those pre-written scripts are called “packages”, and some of those packages are loaded into R every time you open R on your computer. Other packages are available either in the CRAN (Comprehensive R Archive Network) repository or elsewhere on the web.
  • The network analysis package igraph is one such “package”. It was written to make it easier for people to use common network analysis routines without first having to spend hours programming a computer. Although igraph is powerful, and has many functions and features, it does not do everything that we need. We will therefore be using other packages, such as sna, network, foreign, and statnet in later modules. Those packages, however, were written by other people and will therefore have their own idiosyncrasies and styles.
  • For now, we will focus on the way igraph does things.
  3. You are not alone!
  • You do not need to memorize everything about R in order to get a lot out of it. Getting help is fairly straightforward - both in R and on the web. There is a massive international community of R users who have likely seen whatever problem you are having and have proposed multiple solutions to it. For most problems, a simple web search - either by describing your problem, or by cutting and pasting whatever warning you are getting - should give you multiple results.
  • Though, if your problem is directly related to how to run a particular command, then R’s built-in help files should be able to give you what you need. To use the help function in R, you can use either help() or ?. For example, try typing the following commands into the R console.
help(plot)
# or
?plot
# You should get the same result either way.
  • Both commands do the same thing: they open the R help section. If you are working in the basic R console, then the help section should have opened in a separate window. Those who are using an IDE like RStudio will see the help section open in the help window - which defaults to the lower-right corner, unless you customize it to be somewhere else.
  4. Comments
  • There are two comments in the example above. They are identified by the #, which tells R to ignore everything after the hash on that particular line.
  • One of the best things about working with codes and scripts is that you never have to remember what you clicked to get a particular result. But that doesn’t mean that you will always remember what you were thinking as you worked. This is why I always implore people who are starting out in R to please put comments in their work.
  • Comments allow you to keep track of what you are thinking as you work. As you are writing a script, simply add in a hash (#) followed by whatever note you would like to leave for yourself or others. That way, you will not have to guess what a particular chunk of code was meant to do, or even where you found it.
  • The added bonus to being able to use # is the ability to “turn off” chunks of your code by simply putting a hash in front of it.
  5. Spacing
  • Spacing within a code chunk is mostly unimportant. R will ignore extra spaces within your code, as long as the space does not split something that must stay together, such as the characters of an object's name, a number, or a two-character operator like <-.
  • Glance through the code presented below. You will note that R will treat 2+2 the same way that it treats 2 + 2 or 2   +   2. Try it out and see what you think.
  6. Naming Conventions in R
  • When you name objects in R, as we do below, it is generally a good idea - though not necessary - to keep the name brief. Keep in mind, though, that names that you create for objects and functions should be in the form of a single word. Your options are to combine words using periods, underscores, camel case (capitalized words), or some shortened word that represents what you want to call the object. Hyphens will not work, because R reads a hyphen as a minus sign. For example, if you are creating a new network, you can name it “new.network”, “new_network”, “NewNetwork”, “newnet”, or any other single word or letter.
  • Also keep in mind that there are a lot of packages and functions that are already loaded into R. Be careful not to overwrite any of them with your new name. In other words, be careful not to use a word that is already in use.
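
Several of the points in the list above can be tried out in a few lines. This is only a minimal sketch; the object names (x, X, new.network, new_network) are made up for illustration.

```r
# Case matters: x and X are two separate objects.
x <- 1
X <- 100
x == X               # FALSE, because 1 is not equal to 100

# Extra spaces around operators are ignored...
2+2 == 2   +   2     # TRUE

# ...but a space inside the assignment operator changes the meaning.
x < - 5              # Reads as "is x less than -5?" and returns FALSE
x                    # x is still 1; nothing was assigned

# Periods and underscores both work as separators in names.
new.network <- "a name with a period"
new_network <- "a name with an underscore"

# A hash turns a line "off" without deleting it.
# x <- 999
x                    # Still 1
```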




That should be enough background to get you started for now. Let’s move on to some simple uses of R.




Calculation

As you have, no doubt, worked out by now, R can perform both simple and complex calculations. Just to belabor the obvious, try a few mathematical functions using +, -, *, and /.

2+2
## [1] 4
2*2
## [1] 4
2/2
## [1] 1
2-2
## [1] 0
# You can also use a range of numbers.
2:5  # This will create a sequence of numbers from 2 to 5.
## [1] 2 3 4 5
  # Multiply the range by some number.
2:5 * 5
## [1] 10 15 20 25
# Or put it all together...
(2+2/(2*2))-2
## [1] 0.5
# Surely, you get the point. Try this out with any values you like and get as complicated as you like.

Unlike Excel, you do not need to use the equal sign (=) to tell R that you are writing a formula. R assumes that you are. You can also ask R whether a statement is true, and it will answer with TRUE or FALSE.

2==2
## [1] TRUE
2==3
## [1] FALSE

Notice that the equal sign is doubled in the examples above. That is because a single equal sign has a very special function in R and other coding languages: assignment. From here on, any time that you are writing a logical test in R, you will use a double equal sign. Alternatively, you can ask R whether two items are different (not equal) by using !=. This is necessary since few keyboards have a “not equal” sign. Imagine the exclamation point as a slash through the equal sign.

2!=3
## [1] TRUE
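
To make the difference between one and two equal signs concrete, here is a small sketch. The object name a is made up for the example.

```r
a = 5        # One equal sign: store the value 5 under the name "a"
a == 5       # Two equal signs: is a equal to 5? TRUE
a != 5       # Is a different from 5? FALSE
a == 2 + 3   # The arithmetic is done first, then the test. TRUE
```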




Assignment

Single Values

Simple calculations are all well and good. But we want R to be able to do things for us if we fill in a few values. So, for that reason, we will frequently want to use placeholders that we’ll call “objects”. Such objects can represent a particular value, a series of values, or even an entire function. All you are essentially doing is “naming” a number, word, phrase, or procedure.
Note: You can use either a single equal sign (=) or <- for assignment. I strongly prefer <-, since I feel that it makes my code easier to read. But you are welcome to decide for yourself.

  # You can create a data object that represents the value "26" with either: 
x <- 26
  # Then type:
x
## [1] 26
  # or, 
x = 26
x
## [1] 26
# Then try using the object:
2 + x
## [1] 28

You can also do the same with words (enclosed within quotation marks).

y <- "Try this now."
y
## [1] "Try this now."



I will no longer include the output in the examples below. Try them out for yourself in R to get a feel for what they can do.

Concatenation

You can also put several values into one object. To do this, you will use the c() function. The “c” stands for concatenate, meaning to link stuff together in a series. (And yes, I am using “stuff” as a technical term.)
Try some of the following.

A <- c(11, 22, 33, 44, 55) # Separate values with commas, or R will return an error.
  # Then call it up:
A
  # Or use non-numeric values.
B <- c("Try", "this", "list", "of", "words.") # Note that non-numeric values are enclosed in quotation marks.

I am assigning these values to letters here out of laziness. You can also use entire words to name objects.

numbers <- c(11, 22, 33, 44, 55)
words <- c("Try", "this", "list", "of", "words.")


Object Types
  # Make some vectors
numbers <- c(11, 22, 33, 44, 55)                  # Vector of numeric values
words <- c("Try", "this", "list", "of", "words.") # Vector of characters
combo <- c(1, "word", 2, "more words", 3)         # Vector of characters and numbers
rnj   <- c(3:7)                                   # Vector of numbers from 3 to 7.
logic <- c(TRUE, FALSE, TRUE, FALSE, FALSE)       # Logical vector

Then, use typeof() to assess whether the object consists of numeric values, character values, or other values.
Note: When you have a mix of numeric and non-numeric values in a vector, R will see everything in that vector as a character string. That means that you will sometimes have an extra, non-numeric entry in your vector, and it will make R think it is looking at a character string.

  # Try:
typeof(combo)
## [1] "character"
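
To see how typeof() responds to each kind of vector, the sketch below runs it on all five vectors defined above. The last two lines are an added illustration of the note about mixed vectors: as.numeric() is a base R function that converts text back to a number where it can.

```r
numbers <- c(11, 22, 33, 44, 55)
words   <- c("Try", "this", "list", "of", "words.")
combo   <- c(1, "word", 2, "more words", 3)
rnj     <- c(3:7)
logic   <- c(TRUE, FALSE, TRUE, FALSE, FALSE)

typeof(numbers)  # "double": numbers typed by hand are stored with decimals
typeof(words)    # "character"
typeof(combo)    # "character": the numbers were converted to text
typeof(rnj)      # "integer": the colon operator produces whole numbers
typeof(logic)    # "logical"

# Numbers stored as text cannot be used in arithmetic until converted.
combo[1]                   # "1", a character string
as.numeric(combo[1]) + 1   # 2
```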

Now test your skills a little. See if you can tell what is wrong with the following:

other <- c(o, 1, 2, 3, 4)
# You will see similar SNAFUs as you continue to use R.



Cool Stuff to do with Objects

There are a number of things that you can do with the objects that you create. You can apply a mathematical function to all values in the object, and create a new object based on those values.

  # Multiply a vector of numbers (use the vector you named "numbers") by some value
numbers * 4
A * 4

  # Divide by some number
numbers / 4

  # Create a new object using the modified values.
New.Numbers <- numbers * 5

  # Compare the two:
numbers
New.Numbers


Combining Vectors and Strings

Each of the above objects may be combined to form a matrix, data frame, or other object.

To do this, you will need to use the cbind() command to bind the vectors into columns, or the rbind() command to bind them into rows. Try each to get a feel for what they can do.

You will also see a command for transposing (rotating) a matrix (t()) and for finding out what class an object is in R (class()).

  # Again, start with some vectors.
numbers <- c(11, 22, 33, 44, 55)                  # Vector of numeric values
words <- c("Try", "this", "list", "of", "words.") # Vector of characters
combo <- c(1, "word", 2, "more words", 3)         # Vector of characters and numbers
rnj <- c(3:7)                                   # Vector of numbers from 3 to 7.
logic <- c(TRUE, FALSE, TRUE, FALSE, FALSE)       # Logical vector

# Then, combine them.
Numeric.Object <- cbind(numbers, rnj)
  # Take a look at the object.
Numeric.Object
  # Ask R what type of object it is.
class(Numeric.Object)  # Another way of testing

# Try binding them into rows using rbind(). 
Numeric.Object2 <- rbind(numbers, rnj)

# Put them all together.
Mixed.Object <- rbind(numbers, words, combo, rnj, logic)
  # Then look at it.
Mixed.Object

# If you put the object into rows by mistake, you can transpose the matrix using t().
New.Mixed.Object <- t(Mixed.Object)
  # Now take a look at it.
New.Mixed.Object
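
One thing to watch for when binding vectors of different types: a matrix can hold only one type of value, so numbers bound together with words are stored as text. A quick check, assuming the vectors defined above (Small.Mixed is a made-up name for this illustration):

```r
numbers <- c(11, 22, 33, 44, 55)
words   <- c("Try", "this", "list", "of", "words.")
rnj     <- c(3:7)

Numeric.Object <- cbind(numbers, rnj)    # Numbers only
Small.Mixed    <- rbind(numbers, words)  # Numbers and words

typeof(Numeric.Object)  # "double"
typeof(Small.Mixed)     # "character"

dim(Numeric.Object)     # 5 2: five rows, two columns
dim(Small.Mixed)        # 2 5: two rows, five columns
dim(t(Small.Mixed))     # 5 2: t() swaps rows and columns
```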



Calling up one or More Elements of an Object

Suppose you are interested in using just one of the elements in a particular data object. All you have to do is tell R which one you are interested in calling up. You can do so using the name of the data object followed by square brackets ([ ]).
If you are dealing with a vector (a string of numbers or characters/words), then you just use the number that corresponds with the element you are interested in calling. For example, consider the vector we just created called “combo”. If we are interested in calling up just the fourth element of that vector (in this case, “more words”), then we may type: combo[4].

This works the same way with matrices. But matrices have more coordinates: rows and columns. So inside the square brackets, you need to specify which row and which column, separated by a comma. The square brackets will always list the row number first, and then the column number. For example, if you would like to call up or use the fifth row of the second column of what we called “Mixed.Object” above, then you would type: Mixed.Object[5,2]

You can do the same thing with arrays (stacked matrices), but that is something you can google. For now, let’s play around with using elements of vectors and matrices.

# Using the objects we created above, look at an element of the vectors.
words[1]  # The first element of the vector "words"
words[2]  # The second element of the vector "words"
words[3:5]  # The third, fourth, and fifth elements of the vector "words"

# Then look at the elements in a matrix.
Numeric.Object[4, 1]    # This calls the element in row 4, column 1.
Numeric.Object[1:5, 2]  # This calls rows one through five in column 2.
Numeric.Object[ , 2]    # If you would like to call up all the elements of a particular column, leave the rows blank.
Numeric.Object[4 , ]    # Same goes for rows. Here is everything in row four of "Numeric.Object".

# Now to use these new skills:
Numeric.Object[1,1] * Numeric.Object[ , 2]  # Multiply the second column in the object by the first entry in column one.
New.Stuff <- Numeric.Object[1,1] * Numeric.Object[ , 2]  # Do it again, but save it as a vector named "New.Stuff".
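
Because cbind() keeps the vector names as column names, columns can also be called by name rather than by position. A short sketch, assuming Numeric.Object was built as above:

```r
numbers <- c(11, 22, 33, 44, 55)
rnj     <- c(3:7)
Numeric.Object <- cbind(numbers, rnj)

colnames(Numeric.Object)        # "numbers" "rnj"
Numeric.Object[ , "rnj"]        # Same as Numeric.Object[ , 2]
Numeric.Object[4, "numbers"]    # Same as Numeric.Object[4, 1]; the value is 44

# A vector of positions calls several elements at once.
Numeric.Object[c(1, 5), 2]      # Rows one and five of column 2: 3 and 7
```

Calling columns by name tends to be safer than calling them by number, since the code keeps working if the columns are later reordered.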

Loading and Saving Data

There are many, many ways to get data into R. For the sake of simplicity, we will just focus on a couple here.

The first, and most obvious, is to simply type it in. We already did this above when we created some vectors. Provided all the vectors are the same length, you can make a data object from the vectors you created in the example above by binding them together by column (cbind()).

Be careful when you combine vectors. If they are not the same length, then R will try to make them the same length by repeating the pattern in the shorter vectors to fill in the blanks.
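
Recycling is easy to miss because R only warns about it in some cases. The sketch below uses made-up vectors (long, short, odd) to show both situations.

```r
long  <- 1:6
short <- c(10, 20)           # Length 2 divides evenly into 6
odd   <- c(10, 20, 30, 40)   # Length 4 does not divide evenly into 6

# No warning: "short" is silently repeated three times.
cbind(long, short)

# Warning: "odd" is repeated and then cut off to fit.
cbind(long, odd)
```

The vectors used in this tutorial all have five elements, so they can be bound together without any recycling.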

dataObject <- cbind(numbers, words, combo, rnj, logic)

At this point, you have a matrix. To look at the matrix you have created, enter its name into R. To verify, ask R what class of object you have. You can also get an idea of the dimensions (dim) of the object to find out how many rows and columns you have.

class(dataObject)  # For object type
## [1] "matrix"
  # Note: R 4.0.0 and later print "matrix" "array" here.
dim(dataObject)    # For object dimensions
## [1] 5 5

As you can see, above, you have a matrix with 5 rows and 5 columns. (Rows are always given first and columns are given second.) To make this into a data frame, just tell R that is what you want. To do so, it is a good practice to create it under a new name so that you don’t overwrite your earlier work. Of course, this will only be necessary if you make a mistake in your coding. So, if you don’t make mistakes, then feel free to keep the same name.

Below, we are converting dataObject to a data frame and renaming the resulting data object “dataFrame”.

dataFrame <- as.data.frame(dataObject)
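
One side effect of converting a matrix is worth knowing about: because the matrix stored everything as text, every column of the resulting data frame is text as well (or a factor, in versions of R before 4.0). If you want each column to keep its own type, one option is to build the data frame directly from the vectors with data.frame(). The name dataFrame2 below is made up for this illustration, and only three of the vectors are used to keep it short.

```r
numbers <- c(11, 22, 33, 44, 55)
words   <- c("Try", "this", "list", "of", "words.")
logic   <- c(TRUE, FALSE, TRUE, FALSE, FALSE)

# Route 1: matrix first, then data frame. The numbers are no longer numeric.
dataObject <- cbind(numbers, words, logic)
dataFrame  <- as.data.frame(dataObject)
str(dataFrame)

# Route 2: straight to a data frame. Numeric and logical columns keep their type.
dataFrame2 <- data.frame(numbers, words, logic)
str(dataFrame2)

# Columns of a data frame can be called by name with $.
dataFrame2$numbers * 2   # 22 44 66 88 110
```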

To save the data frame as R data that you can load again later, the command is save. You will have to specify what you are saving and what you want to call the file when you save it. The usual suffix for R data is .rda.

save(dataFrame, file="dataFrame.rda")

You may also choose to save the data as a CSV file so that you can open it later in a text editor, Excel, or similar. “CSV” stands for comma separated values. So, each value in the data set is separated by a comma. The command is write.csv, and like the save command, you will have to specify what you are saving and what you wish to call it once it is saved. Also, make sure to use the .csv suffix.

write.csv(dataFrame, file="dataFrame.csv")

Once you have started a new session, you can use the data you created by reading it back into R. The command you use to read data into R will depend on the type of data you are reading. For the two examples above, you will use either load, for R data, or read.csv or read.table for text files like a csv.

Note that the R data will remember whatever you had named it last time it was saved. The other two commands, on the other hand, will require you to name the data you are importing. Here, we are naming the data “d1” and “d2”.

load("dataFrame.rda")

d1 <- read.csv("dataFrame.csv", header=TRUE) # retain column headers

d2 <- read.table("dataFrame.csv", sep=",", header=TRUE) # point out that values are separated by commas
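
To check that saving and loading worked, the sketch below writes a small data frame to a temporary folder and reads it back. The folder comes from tempdir() so that nothing is left in your working directory; the names d, d.back, rda.file, and csv.file are made up for the example. Note that write.csv() adds a column of row names by default, which is why row.names=FALSE is used here.

```r
d <- data.frame(numbers = c(11, 22, 33), words = c("a", "b", "c"))

# Build file paths inside R's temporary folder.
rda.file <- file.path(tempdir(), "d.rda")
csv.file <- file.path(tempdir(), "d.csv")

save(d, file = rda.file)
write.csv(d, file = csv.file, row.names = FALSE)

rm(d)            # Remove the object, as if starting a new session
load(rda.file)   # The object comes back under its old name, "d"
d

d.back <- read.csv(csv.file, header = TRUE)
d.back$numbers   # 11 22 33
```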

That is it for now. I will add to this as I can later. For more information on how this applies to social network analysis in R, check out Sean Everton’s examples. https://www.seaneverton.com/a-brief-introduction-to-sna