R and RStudio Tips and Tricks (Guide) - Part 2

Mark Bounthavong

30 August 2025; updated: 12 September 2025

Introduction

Previously, I created a “Tips and Tricks (Guide) for R,” which introduced users to the R environment, particularly RStudio. In this article, I discuss some other features of R that are helpful for new users, specifically vectors, matrices, and dataframes.

I encourage users to review the “Tips and Tricks (Guide) for R” before diving into this one.

Note: During the preparation of this guide, I used R version 4.4.2 and RStudio version 2024.12.1+563.

Here is a list of topics this article will cover:

R objects - vectors
R objects - matrices
R objects - dataframes

R objects - vectors

In the previous article, I introduced R objects and how they are used in the R environment. I want to expand on this because there are many types of objects you can create in R, and each of these has unique features and properties.

Recall that an object can represent anything that is created in R. An object can be a value, a vector, a matrix, a dataframe, or the result of a function. We create an object by assigning something to a name. For example, we can assign a value of 5 to an object called x. Once we do this, we can use the object in a variety of ways. In this example, I print the value of the object using the print() function. The output is the value 5.

x <- 5

print(x)

## [1] 5
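As a small illustration of using an object in other ways, the sketch below reuses x in arithmetic and stores the results in new objects. The names x_plus_two and x_squared are hypothetical and were chosen only for this example.

```r
x <- 5

x_plus_two <- x + 2  # Add 2 to the stored value
x_squared <- x^2     # Square the stored value

print(x_plus_two)    # [1] 7
print(x_squared)     # [1] 25
```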

But what is a vector or a matrix?

A vector is a sequence of items that are all of the same type. For example, I can create a vector of numbers or a vector of characters and text.

The c() function is short for combine. We use the c() function to combine data into a vector.

I use the str() function to display the structure of the vector. This will tell me the data type of each vector.

x <- c(1, 2, 3, 4, 5) # Vector of numbers from 1 to 5
y <- c("A", "B", "C", "D") # Vector of characters from A to D
z <- c("yellow", "red", "blue") # Vector of words or character strings

str(x)

##  num [1:5] 1 2 3 4 5

str(y)

##  chr [1:4] "A" "B" "C" "D"

str(z)

##  chr [1:3] "yellow" "red" "blue"

The vector x contains numeric data. The vectors y and z contain character data (also known as strings or texts).

Notice that each vector contains only one type of data. In the first vector x, the values are whole numbers, which R stores as numeric data by default (shown as num in the str() output). In the second vector y, the data are single characters. The last vector z also contains character data, but in the form of longer strings or text.
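If you want to check the storage type of a vector directly, the typeof() and class() functions can help. The sketch below is only an illustration. The vector x_int is a hypothetical addition that uses the L suffix to ask R for integer storage.

```r
x <- c(1, 2, 3, 4, 5)       # Numeric vector (stored as double by default)
x_int <- c(1L, 2L, 3L)      # The L suffix requests integer storage
y <- c("A", "B", "C", "D")  # Character vector

typeof(x)      # "double"
typeof(x_int)  # "integer"
typeof(y)      # "character"
class(x)       # "numeric"
```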

What happens if you mix these data types in a vector? Let’s find out.

x <- c(1, "A", 3, "candy", 5) # Vector mixing numbers and characters

str(x)

##  chr [1:5] "1" "A" "3" "candy" "5"

Notice that the number 1 is now a character and not a numeric value. Why isn’t it numeric? A vector can hold only one data type, so R converted (coerced) every element to character because some of the data, such as "A" and "candy", are characters.

Keep this in mind when you are creating a vector of numeric and/or character data types. It can impact how you can use these vectors in your programming.
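To make the impact concrete, here is a minimal sketch of what can happen when you try to do arithmetic with a mixed vector. The names x_mixed, result, and x_num are hypothetical. The tryCatch() and suppressWarnings() calls are used only so that the example runs from start to finish without stopping.

```r
x_mixed <- c(1, "A", 3, "candy", 5)  # Coerced to character

# Arithmetic on a character vector fails, so we capture the error
result <- tryCatch(sum(x_mixed), error = function(e) "error")
print(result)

# as.numeric() converts what it can and returns NA (with a warning) otherwise
x_num <- suppressWarnings(as.numeric(x_mixed))
print(x_num)

# Mean of the values that could be converted (1, 3, and 5)
mean(x_num, na.rm = TRUE)
```

The elements "A" and "candy" cannot be converted to numbers, so they become missing values (NA) and have to be handled explicitly.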

R objects - matrices

A matrix is an array of values arranged into rows and columns. Since a matrix is arranged into rows and columns, it is 2-dimensional.

Here is an example of a simple 2 x 2 matrix.

Matrix example

We can use R to create a matrix using the matrix() function. Because a matrix is an object, we can give it a name. In this example, we denote the matrix as m1.

m1 <- matrix(c(1, 2, 3, 4), # Values in the matrix
             nrow = 2, # Number of rows
             ncol = 2, # Number of columns
             byrow = FALSE) # Order by columns first

print(m1)

##      [,1] [,2]
## [1,]    1    3
## [2,]    2    4

Notice that the numbers fill the matrix by column: the first column is filled from top to bottom, followed by the second column.

Matrix example 1

We can change the byrow = FALSE argument to byrow = TRUE to arrange the numbers by prioritizing the rows over the columns. This will change the arrangement of the matrix (see Matrix example 2). We will denote this new matrix as m2.

m2 <- matrix(c(1, 2, 3, 4), # Values in the matrix
             nrow = 2, # Number of rows
             ncol = 2, # Number of columns
             byrow = TRUE) # Order by rows first

print(m2)

##      [,1] [,2]
## [1,]    1    2
## [2,]    3    4

Matrix example 2
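To see how the two arrangements relate, the sketch below checks the dimensions of a matrix and pulls out individual elements using [row, column] indexing. This is only an illustration that rebuilds m1 and m2 so it can run on its own.

```r
m1 <- matrix(c(1, 2, 3, 4), nrow = 2, ncol = 2, byrow = FALSE)
m2 <- matrix(c(1, 2, 3, 4), nrow = 2, ncol = 2, byrow = TRUE)

dim(m1)      # Number of rows and columns: 2 2
m1[1, 2]     # Row 1, column 2 of m1: 3
m2[1, 2]     # Row 1, column 2 of m2: 2
m1[, 1]      # Entire first column of m1: 1 2

# Filling by row gives the transpose of filling by column
identical(t(m1), m2)  # TRUE
```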

Matrices are important for computational operations. In some cases, we will need to use matrices to perform operations such as additions, subtractions, multiplications, and divisions.
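As a rough sketch of these operations, the code below adds and multiplies two small matrices. The names m_sum, m_elem, and m_prod are hypothetical. Note that in R the * operator multiplies element by element, while %*% performs matrix multiplication.

```r
m1 <- matrix(c(1, 2, 3, 4), nrow = 2, ncol = 2, byrow = FALSE)
m2 <- matrix(c(1, 2, 3, 4), nrow = 2, ncol = 2, byrow = TRUE)

m_sum <- m1 + m2       # Element-wise addition
m_elem <- m1 * m2      # Element-wise multiplication (not a matrix product)
m_prod <- m1 %*% m2    # Matrix multiplication

print(m_sum)
print(m_elem)
print(m_prod)
```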

In biostatistics, matrices and vectors help in performing regression analysis with multiple variables.

Here is an example of a simple linear regression model in matrix and vector forms:

Linear regression model in matrix form
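A linear regression model is commonly written in matrix form as y = Xβ + ε, and the least squares estimate of β solves (X'X)β = X'y. The sketch below illustrates this idea with hypothetical data. The objects x_var, y_var, X, beta_hat, and fit are made up for this example and are not part of the original guide.

```r
# Hypothetical data for illustration only
x_var <- c(1, 2, 3, 4, 5)
y_var <- c(2.1, 3.9, 6.2, 7.8, 10.1)

# Design matrix: a column of 1s for the intercept plus the predictor
X <- cbind(intercept = 1, x_var)

# Least squares estimate: solve (X'X) beta = X'y
beta_hat <- solve(t(X) %*% X, t(X) %*% y_var)
print(beta_hat)

# Compare with R's built-in linear model function
fit <- lm(y_var ~ x_var)
print(coef(fit))
```

The two sets of estimates should agree, which shows that lm() is doing matrix and vector computations behind the scenes.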

R objects - dataframes

A dataframe in R is a 2-dimensional data structure that holds vectors of equal length, which may have different data types, in a single object. In almost all cases, the dataframe is a tabular structure. More importantly, since it’s an object, we can use it in a variety of ways, such as computational and applied statistics.

We can convert a matrix into a dataframe using the data.frame() function.

Let’s suppose we had 4 individuals with a unique identifier id. Each individual has their age measured in units of years, which is denoted by the variable age. They are also assigned to a group denoted by the grouping variable group. Individuals can be in Group == 0 or Group == 1.

Step 1 - Create vectors

First we’ll create vectors for each variable. The data types will be numeric (not character). There are three vectors (id, age, and group). For this exercise, we will input the values for each vector.

Step 2 - Convert vectors into a matrix

Then, we will create a matrix using the vectors. We will denote this matrix as m3.

Step 3 - Convert a matrix into a dataframe

Afterwards, we will convert the matrix into a dataframe. We will denote this as df1.

Step 4 - Add labels to the dataframe

Lastly, we will add labels to the variables: id, age, and group.

Here is the R code:

## Step 1: Create vectors of the variables
id <- c(1, 2, 3, 4)
age <- c(45, 65, 37, 29)
group <- c(0, 0, 1, 1)

## Step 2: Create a matrix with the vectors of variables
m3 <- matrix(c(id, age, group), 
             nrow = 4,
             ncol = 3,
             byrow = FALSE)

## Step 3: Transform matrix into a dataframe
df1 <- data.frame(m3)

## Step 4: Label the variables in the dataframe
names(df1) <- c("ID", "Age", "Group")

## Print the dataframe and inspect
print(df1)

##   ID Age Group
## 1  1  45     0
## 2  2  65     0
## 3  3  37     1
## 4  4  29     1
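Going through a matrix works here because all three variables are numeric. A matrix can hold only one data type, so if one of the variables were a character, every value would be coerced to character. A possible alternative, sketched below, is to build the dataframe directly from the vectors. The variable site and the objects df2 and m4 are hypothetical additions for this illustration.

```r
id <- c(1, 2, 3, 4)
age <- c(45, 65, 37, 29)
group <- c(0, 0, 1, 1)
site <- c("A", "A", "B", "B")  # Hypothetical character variable

# Build the dataframe directly; each column keeps its own type
df2 <- data.frame(ID = id, Age = age, Group = group, Site = site)
str(df2)

# By contrast, a matrix holds one type, so everything becomes character
m4 <- cbind(id, age, group, site)
typeof(m4)  # "character"
```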

Once the matrix has been converted to a dataframe, we can perform operations on the data inside. For example, we can estimate the mean (average) age of the individuals in the dataframe using the mean() function. Inside the mean() function, we specify the variable we want to average (here, Age). We write this as df1$Age, where the $ operator tells R that the Age variable is part of the dataframe df1.

mean_age <- mean(df1$Age) # Use a name other than mean to avoid confusion with the mean() function

print(mean_age)

## [1] 44

The mean age of the sample in the dataframe is 44 years.
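The same idea extends to other summaries. As a minimal sketch, the code below estimates the mean age within each group using the tapply() function and prints a summary of Age using the summary() function. The dataframe is rebuilt here so the example runs on its own, and the name mean_by_group is hypothetical.

```r
df1 <- data.frame(ID = c(1, 2, 3, 4),
                  Age = c(45, 65, 37, 29),
                  Group = c(0, 0, 1, 1))

# Mean age within each group (Group 0: 55, Group 1: 33)
mean_by_group <- tapply(df1$Age, df1$Group, mean)
print(mean_by_group)

# Minimum, quartiles, mean, and maximum for Age
summary(df1$Age)
```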

Final thoughts

In this exercise, we were able to create vectors and matrices with the intention of using these to create a dataframe. A dataframe allows us to perform statistical analysis such as descriptive analysis or inferential analysis. Knowing how data are structured in R will help users to better understand the nuances associated with data, particularly when performing statistical analysis.

Now that we have gone through some examples of vectors, matrices, and dataframes, you will be better prepared to navigate the R environment and perform statistical computation and analyses in the future.

Disclaimers

This is a work in progress and subject to updates in the future.

This is for educational purposes only.