In the previous note, I provided a few examples of using R in calculations. In that note, I also explained that we can assign names to the results of our calculations and refer to those names in subsequent operations. All of the examples dealt with a single value. This isn’t very useful on its own, as a typical analysis uses data on hundreds, if not thousands, of individuals.

In R, single values can be combined into larger data structures. The simplest such structure is called a “vector”. There are several types of vectors: numeric, logical, character, and also a special type called “factor”. In the present note, I take a closer look at numeric vectors, that is, vectors composed of real numbers.

Numeric vectors

Individual values are combined into a vector using the function c(). For instance, suppose we ask six employees of an organisation (Anne, Bart, Carl, Dawn, Edna, and Fred) how many months they have been employed in that organisation. Their responses are as follows: 42, 40, 34, 53, 49, 36. That is, we learn that Anne has worked for 42 months in the organisation, Bart has worked for 40 months, and so forth. We can combine these numbers into a vector as follows:

empl <- c(42, 40, 34, 53, 49, 36)

This creates a numeric vector empl.
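To check what c() has produced, one might inspect the new vector with a couple of base R functions. The snippet below is only an illustrative sketch; length() and is.numeric() are not used elsewhere in this note.

```r
# Months of employment for Anne, Bart, Carl, Dawn, Edna, and Fred
empl <- c(42, 40, 34, 53, 49, 36)

# Print the vector
empl
## [1] 42 40 34 53 49 36

# Number of elements in the vector
length(empl)
## [1] 6

# Check that the vector is numeric
is.numeric(empl)
## [1] TRUE
```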

Working with vectors

We can apply standard mathematical transformations to numeric vectors. For instance, suppose that we want to change the information about employment in months into years. There are 12 months in a year, so we must divide each element of the vector by 12. This is done as follows:

empl/12

## [1] 3.500 3.333 2.833 4.417 4.083 3.000

Changing months to weeks is also possible. An average month has 30 days, of which 8 are Saturdays and Sundays, leaving a total of 22 working days in an average month. There are 5 working days in a working week, so the total number of working weeks in an average month is \(22/5 = 4.4\). Thus, to change the employment in months into weeks we have to multiply each element in empl by 4.4:

empl * 4.4

## [1] 184.8 176.0 149.6 233.2 215.6 158.4
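The conversion factor itself can also be computed in R rather than typed in by hand. This is a minimal sketch that relies on the same assumptions as the text (22 working days in an average month, 5 working days in a week); the names working_days_per_month, working_days_per_week, weeks_per_month, and empl_weeks are made up for illustration.

```r
empl <- c(42, 40, 34, 53, 49, 36)

# Assumed figures from the text: 22 working days in an average month,
# 5 working days in a working week
working_days_per_month <- 22
working_days_per_week <- 5

# Working weeks in an average month
weeks_per_month <- working_days_per_month / working_days_per_week
weeks_per_month
## [1] 4.4

# Employment expressed in working weeks
empl_weeks <- empl * weeks_per_month
empl_weeks
## [1] 184.8 176.0 149.6 233.2 215.6 158.4
```

Naming the intermediate quantities makes it easier to change the assumptions later, for instance if we preferred a different number of working days per month.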

Adding a constant to a vector and subtracting a constant from a vector works similarly: the constant is added to or subtracted from each element of the vector, respectively.
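As a quick illustration, with two made-up scenarios that are not part of the survey described above:

```r
empl <- c(42, 40, 34, 53, 49, 36)

# Hypothetical: the same people are asked again six months later
empl + 6
## [1] 48 46 40 59 55 42

# Hypothetical: we leave out a 3-month trial period for each person
empl - 3
## [1] 39 37 31 50 46 33
```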

We are often interested, however, in changing each element of a vector by a different amount. For instance, suppose we asked our interviewees if there were any breaks in their employment in the organisation due to sick leave, parental leave, training, sabbatical, or some other reason. That is, a break refers here to an extended period during which they were still employees of the organisation even though they were not actually working there for reasons related to their health, parental status, and the like. If there were such breaks, then we want to know how long they lasted. Suppose that the self-reported durations (in months) of these interruptions are as follows: 12, 3, 0, 15, 0, 6. Let us combine these numbers into a vector that we’ll name breaks:

breaks <- c(12, 3, 0, 15, 0, 6)

We now want to calculate how long our respondents have worked in the organisation excluding the interruptions. That is, from each number representing the total duration of employment in the organisation we want to subtract the corresponding number representing the total duration of the interruptions. In other words, from each element of the vector empl we want to subtract the corresponding element of the vector breaks. This is done simply by subtracting the latter vector from the former:

empl - breaks

## [1] 30 37 34 38 49 30

Instead of the difference, we could also take the ratio of one vector to the other. Suppose we divide the vector breaks by empl:

breaks/empl

## [1] 0.2857 0.0750 0.0000 0.2830 0.0000 0.1667

The result tells us what proportion of the time in the organisation the respondents were not actually working.

In general, for two (or more) vectors of the same length (i.e., with the same number of elements), arithmetic transformations are performed element-wise, so that an element in one vector is combined with a corresponding element (or elements) of the other vector (or vectors).
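Element-wise operations can also be chained. The sketch below combines the subtraction and the division from the examples above to obtain the share of the employment period that each respondent actually worked; the names worked and share_worked are made up for illustration, and round() is used only to shorten the printed output.

```r
empl <- c(42, 40, 34, 53, 49, 36)
breaks <- c(12, 3, 0, 15, 0, 6)

# Months actually worked, element by element
worked <- empl - breaks

# Share of the employment period actually worked
share_worked <- worked / empl
round(share_worked, 4)
## [1] 0.7143 0.9250 1.0000 0.7170 1.0000 0.8333

# For every respondent, the two shares add up to 1
all.equal(share_worked + breaks / empl, rep(1, 6))
## [1] TRUE
```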

Suppose, however, that we recorded the interruptions data somewhat carelessly and, as a result, we forgot to include the information about Fred’s interruptions. That is, suppose that the breaks vector has five, instead of six elements, and looks like this:

breaks

## [1] 12  3  0 15  0

Suppose also that we somehow failed to notice that the resulting vector is one element too short and went on to subtract breaks from empl even though the two vectors differ in length. The output below shows what happens:

empl - breaks

## Warning in empl - breaks: longer object length is not a multiple of shorter
## object length

## [1] 30 37 34 38 49 24

Well, R warns us that the length of the longer vector is not a multiple of the length of the shorter one, but otherwise everything seems to be alright, right? So, what happened here? R performed the subtraction element-wise, as in the previous examples: the first element in breaks was subtracted from the first element in empl, then the second element in breaks was subtracted from the second element in empl, and so on up to the fifth element of each vector. Then we run into trouble, since the sixth element in empl doesn’t have a “counterpart” in breaks. We can see from the result, however, that some value was subtracted from that sixth element in empl. Which value? Well, since there is no sixth element in breaks, R substituted its first element for it. That is, we would have obtained the same result as above if breaks had been as follows:

## [1] 12  3  0 15  0 12

The general rule is that arithmetic operations can be performed on vectors that differ in length; the elements of the shorter vector will be “recycled”, that is, reused starting from the first one, to continue with the calculations.
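Note that R issues the warning shown above only when the length of the longer vector is not a multiple of the length of the shorter one; otherwise the shorter vector is recycled silently. The sketch below illustrates this with a made-up three-element vector and then shows one possible way of checking the lengths before subtracting; the names breaks_short and same_length are hypothetical.

```r
empl <- c(42, 40, 34, 53, 49, 36)

# The shorter vector has 3 elements and 3 divides 6, so it is
# recycled as 1, 2, 3, 1, 2, 3 and R gives no warning
empl - c(1, 2, 3)
## [1] 41 38 31 52 47 33

# A simple check before subtracting: do the lengths match?
breaks_short <- c(12, 3, 0, 15, 0)
same_length <- length(empl) == length(breaks_short)
same_length
## [1] FALSE
```

Because silent recycling can hide data-entry mistakes such as the missing value for Fred, it may be worth comparing the lengths explicitly, for example with stopifnot(length(empl) == length(breaks)), which stops with an error when the lengths differ.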

Numeric vectors

Zbigniew Karpiński

November 09, 2017
