Lab 4

John Yi

Multiple Linear Regression

Introduction

Previously, you learned how to run a simple linear regression:


Call:
lm(formula = Feature_1 ~ Agreeableness, data = df)

Residuals:
     Min       1Q   Median       3Q      Max 
-0.52482 -0.24147 -0.00173  0.24892  0.51633 

Coefficients:
              Estimate Std. Error t value Pr(>|t|)    
(Intercept)    0.47397    0.01254  37.802  < 2e-16 ***
Agreeableness  0.06119    0.02196   2.787  0.00538 ** 
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 0.2868 on 1998 degrees of freedom
Multiple R-squared:  0.003871,  Adjusted R-squared:  0.003373 
F-statistic: 7.765 on 1 and 1998 DF,  p-value: 0.005378

What if you wanted to see how your outcome variable relates to more than one predictor?

  • That is what we will be answering in today’s lab section.

Data

Go to the Canvas page and download the handwriting.csv file. This dataset was obtained from Kaggle (link here) and measures five personality traits and fifteen handwriting features.

df <- read.csv("handwriting.csv")
head(df)
  Handwriting_Sample Writing_Speed_wpm  Openness Conscientiousness Extraversion
1       sample_1.jpg                60 0.3572032        0.40744230    0.7249470
2       sample_2.jpg                32 0.7302505        0.05195014    0.3516147
3       sample_3.jpg                10 0.8369869        0.16222749    0.1646810
4       sample_4.jpg                12 0.4134186        0.36305914    0.1315642
5       sample_5.jpg                11 0.6160463        0.24789881    0.9097398
6       sample_6.jpg                44 0.9110092        0.88341165    0.2171105
  Agreeableness Neuroticism Gender Age Feature_1 Feature_2 Feature_3 Feature_4
1     0.4515167 0.255107361   Male  45 0.1461394 0.2898953 0.2838149 0.6174065
2     0.5284133 0.664159164   Male  36 0.8028333 0.5489345 0.4593187 0.7934317
3     0.8160083 0.681869972  Other  34 0.4525125 0.4417056 0.8449200 0.6019600
4     0.9383496 0.236701620   Male  26 0.8326617 0.2792830 0.7489737 0.7981547
5     0.6989657 0.463774230  Other  57 0.9277849 0.8527978 0.9447890 0.7726200
6     0.4119159 0.008873347   Male  52 0.6381071 0.1447402 0.5283650 0.2107954
  Feature_5 Feature_6 Feature_7 Feature_8  Feature_9 Feature_10 Feature_11
1 0.9962502 0.9859272 0.7456262 0.9239223 0.03915508  0.2773610  0.8320975
2 0.5634192 0.8939789 0.1143796 0.4841666 0.02239722  0.4363222  0.9086275
3 0.4807482 0.9412735 0.9505712 0.4856614 0.27738829  0.8735601  0.5669726
4 0.9521904 0.8310414 0.2172441 0.3517005 0.46485856  0.8677578  0.2984069
5 0.5203046 0.7975051 0.1863779 0.1136225 0.29951386  0.2955552  0.3650655
6 0.4109358 0.7391724 0.9867577 0.2644145 0.20727245  0.1459196  0.5125761
  Feature_12 Feature_13 Feature_14 Feature_15
1  0.3191278  0.1992129 0.24108111 0.37597802
2  0.2207438  0.6509474 0.56846426 0.66006170
3  0.2054020  0.5377997 0.32351934 0.37335821
4  0.3502103  0.5911223 0.80204923 0.13132399
5  0.2106333  0.7231948 0.04408516 0.01435112
6  0.5280036  0.4886643 0.33725102 0.15998334
  • This dataset (n = 2000) is quite a bit larger than what we used previously. However, it should still not take too long to run analyses on it.

  • For this class, I will be exploring the relationship between the personality traits and Feature_1. However, feel free to choose whatever traits you want!

In addition, copy over body_image_data.csv from either a previous assignment or the Canvas page. We will also be using the iris dataset, but there is no need to download anything for it because it ships with R.

Formula

In R, adding predictors to a model is pretty easy. In a simple regression model, the formula looks like this:

  • y ~ x1

To add another predictor (let’s say x2), simply use the + symbol:

  • y ~ x1 + x2

This works with as many predictors as you would like:

  • y ~ x1 + x2 + x3 + x4 ...
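
When the list of predictors gets long, typing every name by hand is error-prone. As a minimal sketch (using R’s built-in mtcars data rather than the lab dataset, with predictors, f, and fit as illustrative names), the base R function reformulate() can build the same kind of formula from a character vector:

```r
# Illustrative only: mtcars stands in for the lab data.
predictors <- c("wt", "hp", "qsec")

# Build the formula mpg ~ wt + hp + qsec from the predictor names
f <- reformulate(predictors, response = "mpg")
print(f)

# The formula object can be passed to lm() just like a hand-typed one
fit <- lm(f, data = mtcars)
print(coef(fit))
```

The result is an ordinary formula, so everything else in the lab (lm(), summary()) works the same way.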

Let’s see this in action in the next slide.

Model and Output (two predictors)

Let’s say that I’m interested in seeing the effect of agreeableness and neuroticism on feature 1.

model1 <- lm(Feature_1 ~ Agreeableness + Neuroticism, df)
summary(model1)

Call:
lm(formula = Feature_1 ~ Agreeableness + Neuroticism, data = df)

Residuals:
     Min       1Q   Median       3Q      Max 
-0.52160 -0.24105 -0.00221  0.24585  0.51462 

Coefficients:
              Estimate Std. Error t value Pr(>|t|)    
(Intercept)    0.48342    0.01655  29.216  < 2e-16 ***
Agreeableness  0.06162    0.02196   2.805  0.00507 ** 
Neuroticism   -0.01917    0.02191  -0.875  0.38169    
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 0.2868 on 1997 degrees of freedom
Multiple R-squared:  0.004253,  Adjusted R-squared:  0.003256 
F-statistic: 4.265 on 2 and 1997 DF,  p-value: 0.01418
  • There is now an additional row for a slope estimate (one for agreeableness and one for neuroticism).

  • The F-statistic has more than one degree of freedom in its numerator (one per predictor).

  • The overall p-value (from the F-statistic) is different from the p-values of the individual slope estimates.
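
To see where the overall p-value comes from, the pieces of the F-statistic can be pulled out of the summary object. This is a rough sketch on the built-in mtcars data (not the handwriting data); the variable names are illustrative:

```r
# Illustrative only: mtcars stands in for the lab data.
fit <- lm(mpg ~ wt + hp, data = mtcars)

# fstatistic holds the F value, numerator df, and denominator df
f <- summary(fit)$fstatistic
print(f)

# Overall p-value: upper tail of the F distribution
overall_p <- pf(f["value"], f["numdf"], f["dendf"], lower.tail = FALSE)
print(unname(overall_p))

# Per-slope p-values live in the coefficient table instead
slope_p <- summary(fit)$coefficients[, "Pr(>|t|)"]
print(slope_p)
```

The overall p-value tests all slopes jointly, while each row of the coefficient table tests a single slope with the other predictors held in the model.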

Model and Output (five predictors)

As stated previously, we can add in as many predictors as we would like:

model2 <- lm(Feature_1 ~ Openness + Conscientiousness + Extraversion + Agreeableness + Neuroticism, df)
summary(model2)

Call:
lm(formula = Feature_1 ~ Openness + Conscientiousness + Extraversion + 
    Agreeableness + Neuroticism, data = df)

Residuals:
     Min       1Q   Median       3Q      Max 
-0.52692 -0.23912 -0.00349  0.24649  0.52444 

Coefficients:
                  Estimate Std. Error t value Pr(>|t|)    
(Intercept)        0.52111    0.02547  20.463   <2e-16 ***
Openness          -0.02980    0.02231  -1.336   0.1818    
Conscientiousness -0.02057    0.02198  -0.936   0.3493    
Extraversion      -0.02339    0.02189  -1.069   0.2853    
Agreeableness      0.06010    0.02198   2.735   0.0063 ** 
Neuroticism       -0.01933    0.02192  -0.882   0.3780    
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 0.2868 on 1994 degrees of freedom
Multiple R-squared:  0.006178,  Adjusted R-squared:  0.003686 
F-statistic: 2.479 on 5 and 1994 DF,  p-value: 0.03009

Compared to the original model:

  • Did the overall p-value go up or down?

  • What about the Multiple R-squared?

  • Which one should you use?

summary(model)

Call:
lm(formula = Feature_1 ~ Agreeableness, data = df)

Residuals:
     Min       1Q   Median       3Q      Max 
-0.52482 -0.24147 -0.00173  0.24892  0.51633 

Coefficients:
              Estimate Std. Error t value Pr(>|t|)    
(Intercept)    0.47397    0.01254  37.802  < 2e-16 ***
Agreeableness  0.06119    0.02196   2.787  0.00538 ** 
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 0.2868 on 1998 degrees of freedom
Multiple R-squared:  0.003871,  Adjusted R-squared:  0.003373 
F-statistic: 7.765 on 1 and 1998 DF,  p-value: 0.005378
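
One possible way to line up two nested models side by side is sketched below. It uses the built-in mtcars data rather than the lab models, and it is only one approach to the comparison, not necessarily the one intended for this lab:

```r
# Illustrative only: mtcars stands in for the lab data.
small <- lm(mpg ~ wt, data = mtcars)
big   <- lm(mpg ~ wt + hp + qsec, data = mtcars)

# Multiple R-squared cannot decrease when predictors are added
r2 <- c(small = summary(small)$r.squared, big = summary(big)$r.squared)
print(r2)

# Adjusted R-squared penalizes the extra predictors
adj_r2 <- c(small = summary(small)$adj.r.squared,
            big   = summary(big)$adj.r.squared)
print(adj_r2)

# F-test for nested models: do the added predictors improve the fit?
cmp <- anova(small, big)
print(cmp)
```

Because Multiple R-squared never goes down as predictors are added, the adjusted value and the nested-model F-test are usually more informative when the models differ in size.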

Correlation Matrix

Review

When doing MLR it’s especially helpful to analyze the correlations between variables to make sure that our predictors are actually measuring different constructs. Previously you used cor() to accomplish this.

bi_data <- read.csv("body_image_data.csv")
bi_data1 <- bi_data[, 4:18] # this extracts only the 'aa' columns
cor(bi_data1)
             aa1         aa2          aa3        aa4        aa5         aa6
aa1   1.00000000  0.39168552 -0.347617535 -0.3026124  0.3918437  0.39494891
aa2   0.39168552  1.00000000 -0.323297555 -0.3158219  0.3485875  0.48478960
aa3  -0.34761754 -0.32329756  1.000000000  0.3514916 -0.3456423 -0.25276613
aa4  -0.30261240 -0.31582190  0.351491587  1.0000000 -0.1582926 -0.30834360
aa5   0.39184368  0.34858754 -0.345642338 -0.1582926  1.0000000  0.41431233
aa6   0.39494891  0.48478960 -0.252766125 -0.3083436  0.4143123  1.00000000
aa7  -0.34986628 -0.36211627  0.483222698  0.2930931 -0.5025754 -0.45689961
aa8   0.42821876  0.49784292 -0.271105401 -0.2376849  0.3193003  0.44681057
aa9   0.20295878  0.07761449 -0.034479822 -0.1754912  0.1179471  0.14428393
aa10  0.05308971  0.07953526  0.007862589  0.1600465 -0.1249544 -0.01308786
aa11  0.29753779  0.41839083 -0.530114408 -0.3820109  0.2483536  0.23984235
aa12 -0.17879347 -0.07144637  0.284934199  0.1733338 -0.2739464 -0.08054656
aa13  0.43772486  0.29752133 -0.194504566 -0.1916925  0.1371269  0.24407909
aa14  0.36892108  0.33252539 -0.275426922 -0.1800727  0.2302008  0.25463508
aa15 -0.35947042 -0.41461268  0.472991876  0.3475430 -0.3277955 -0.30926415
            aa7        aa8          aa9         aa10        aa11         aa12
aa1  -0.3498663  0.4282188  0.202958783  0.053089706  0.29753779 -0.178793473
aa2  -0.3621163  0.4978429  0.077614491  0.079535265  0.41839083 -0.071446373
aa3   0.4832227 -0.2711054 -0.034479822  0.007862589 -0.53011441  0.284934199
aa4   0.2930931 -0.2376849 -0.175491241  0.160046526 -0.38201090  0.173333768
aa5  -0.5025754  0.3193003  0.117947136 -0.124954407  0.24835361 -0.273946404
aa6  -0.4568996  0.4468106  0.144283931 -0.013087856  0.23984235 -0.080546556
aa7   1.0000000 -0.3211781 -0.056022402  0.103884996 -0.36523563  0.296506534
aa8  -0.3211781  1.0000000  0.178912951  0.157195106  0.36156663  0.042957499
aa9  -0.0560224  0.1789130  1.000000000 -0.009090200  0.04242672  0.008901962
aa10  0.1038850  0.1571951 -0.009090200  1.000000000  0.08900573  0.343062284
aa11 -0.3652356  0.3615666  0.042426723  0.089005725  1.00000000 -0.144284890
aa12  0.2965065  0.0429575  0.008901962  0.343062284 -0.14428489  1.000000000
aa13 -0.1820533  0.5278414  0.112251812  0.130287908  0.30157214  0.123417914
aa14 -0.3051896  0.4243281  0.076362274  0.067651410  0.48607518 -0.012271093
aa15  0.5017693 -0.2688685 -0.084466265  0.007526115 -0.44557165  0.221915232
           aa13        aa14         aa15
aa1   0.4377249  0.36892108 -0.359470419
aa2   0.2975213  0.33252539 -0.414612679
aa3  -0.1945046 -0.27542692  0.472991876
aa4  -0.1916925 -0.18007267  0.347543031
aa5   0.1371269  0.23020084 -0.327795509
aa6   0.2440791  0.25463508 -0.309264145
aa7  -0.1820533 -0.30518964  0.501769294
aa8   0.5278414  0.42432809 -0.268868510
aa9   0.1122518  0.07636227 -0.084466265
aa10  0.1302879  0.06765141  0.007526115
aa11  0.3015721  0.48607518 -0.445571651
aa12  0.1234179 -0.01227109  0.221915232
aa13  1.0000000  0.49127751 -0.310046676
aa14  0.4912775  1.00000000 -0.337258978
aa15 -0.3100467 -0.33725898  1.000000000

As you can see, this output is hard to read, especially when looking at many variables. We can visualize it better by using the corrplot library.
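
Alongside a plot, another way to condense a large correlation matrix is to list only the strongly correlated pairs. The sketch below assumes the numeric iris columns as a stand-in for bi_data1, and the cutoff of 0.8 is an arbitrary choice for illustration:

```r
# Illustrative only: the numeric iris columns stand in for bi_data1.
cm <- cor(iris[, 1:4])

# Keep each pair once (upper triangle) and only if |r| exceeds the cutoff
cutoff <- 0.8  # arbitrary threshold chosen for this sketch
idx <- which(abs(cm) > cutoff & upper.tri(cm), arr.ind = TRUE)

strong_pairs <- data.frame(
  var1 = rownames(cm)[idx[, "row"]],
  var2 = colnames(cm)[idx[, "col"]],
  r    = round(cm[idx], 2)
)
print(strong_pairs)
```

Predictors that show up in this list are candidates for measuring overlapping constructs.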

corrplot()

library(corrplot)
correlation <- cor(bi_data1)
corrplot(correlation)

What do you notice about this plot?

Customization

Like many other functions, we can customize the output.

corrplot(correlation, method = "number", type = "upper", number.cex = 0.6)

Data Selection

Selecting by Columns

Let’s go over how you select by columns again:

data(iris) # loads in a pre-installed dataset
head(iris)
  Sepal.Length Sepal.Width Petal.Length Petal.Width Species
1          5.1         3.5          1.4         0.2  setosa
2          4.9         3.0          1.4         0.2  setosa
3          4.7         3.2          1.3         0.2  setosa
4          4.6         3.1          1.5         0.2  setosa
5          5.0         3.6          1.4         0.2  setosa
6          5.4         3.9          1.7         0.4  setosa

Select by column names:

iris[c("Sepal.Length", "Petal.Length")]
    Sepal.Length Petal.Length
1            5.1          1.4
2            4.9          1.4
3            4.7          1.3
4            4.6          1.5
5            5.0          1.4
6            5.4          1.7
7            4.6          1.4
8            5.0          1.5
9            4.4          1.4
10           4.9          1.5
11           5.4          1.5
12           4.8          1.6
13           4.8          1.4
14           4.3          1.1
15           5.8          1.2
16           5.7          1.5
17           5.4          1.3
18           5.1          1.4
19           5.7          1.7
20           5.1          1.5
21           5.4          1.7
22           5.1          1.5
23           4.6          1.0
24           5.1          1.7
25           4.8          1.9
26           5.0          1.6
27           5.0          1.6
28           5.2          1.5
29           5.2          1.4
30           4.7          1.6
31           4.8          1.6
32           5.4          1.5
33           5.2          1.5
34           5.5          1.4
35           4.9          1.5
36           5.0          1.2
37           5.5          1.3
38           4.9          1.4
39           4.4          1.3
40           5.1          1.5
41           5.0          1.3
42           4.5          1.3
43           4.4          1.3
44           5.0          1.6
45           5.1          1.9
46           4.8          1.4
47           5.1          1.6
48           4.6          1.4
49           5.3          1.5
50           5.0          1.4
51           7.0          4.7
52           6.4          4.5
53           6.9          4.9
54           5.5          4.0
55           6.5          4.6
56           5.7          4.5
57           6.3          4.7
58           4.9          3.3
59           6.6          4.6
60           5.2          3.9
61           5.0          3.5
62           5.9          4.2
63           6.0          4.0
64           6.1          4.7
65           5.6          3.6
66           6.7          4.4
67           5.6          4.5
68           5.8          4.1
69           6.2          4.5
70           5.6          3.9
71           5.9          4.8
72           6.1          4.0
73           6.3          4.9
74           6.1          4.7
75           6.4          4.3
76           6.6          4.4
77           6.8          4.8
78           6.7          5.0
79           6.0          4.5
80           5.7          3.5
81           5.5          3.8
82           5.5          3.7
83           5.8          3.9
84           6.0          5.1
85           5.4          4.5
86           6.0          4.5
87           6.7          4.7
88           6.3          4.4
89           5.6          4.1
90           5.5          4.0
91           5.5          4.4
92           6.1          4.6
93           5.8          4.0
94           5.0          3.3
95           5.6          4.2
96           5.7          4.2
97           5.7          4.2
98           6.2          4.3
99           5.1          3.0
100          5.7          4.1
101          6.3          6.0
102          5.8          5.1
103          7.1          5.9
104          6.3          5.6
105          6.5          5.8
106          7.6          6.6
107          4.9          4.5
108          7.3          6.3
109          6.7          5.8
110          7.2          6.1
111          6.5          5.1
112          6.4          5.3
113          6.8          5.5
114          5.7          5.0
115          5.8          5.1
116          6.4          5.3
117          6.5          5.5
118          7.7          6.7
119          7.7          6.9
120          6.0          5.0
121          6.9          5.7
122          5.6          4.9
123          7.7          6.7
124          6.3          4.9
125          6.7          5.7
126          7.2          6.0
127          6.2          4.8
128          6.1          4.9
129          6.4          5.6
130          7.2          5.8
131          7.4          6.1
132          7.9          6.4
133          6.4          5.6
134          6.3          5.1
135          6.1          5.6
136          7.7          6.1
137          6.3          5.6
138          6.4          5.5
139          6.0          4.8
140          6.9          5.4
141          6.7          5.6
142          6.9          5.1
143          5.8          5.1
144          6.8          5.9
145          6.7          5.7
146          6.7          5.2
147          6.3          5.0
148          6.5          5.2
149          6.2          5.4
150          5.9          5.1

Select by column index:

iris[, c(1, 3)]
    Sepal.Length Petal.Length
1            5.1          1.4
2            4.9          1.4
3            4.7          1.3
4            4.6          1.5
5            5.0          1.4
6            5.4          1.7
7            4.6          1.4
8            5.0          1.5
9            4.4          1.4
10           4.9          1.5
11           5.4          1.5
12           4.8          1.6
13           4.8          1.4
14           4.3          1.1
15           5.8          1.2
16           5.7          1.5
17           5.4          1.3
18           5.1          1.4
19           5.7          1.7
20           5.1          1.5
21           5.4          1.7
22           5.1          1.5
23           4.6          1.0
24           5.1          1.7
25           4.8          1.9
26           5.0          1.6
27           5.0          1.6
28           5.2          1.5
29           5.2          1.4
30           4.7          1.6
31           4.8          1.6
32           5.4          1.5
33           5.2          1.5
34           5.5          1.4
35           4.9          1.5
36           5.0          1.2
37           5.5          1.3
38           4.9          1.4
39           4.4          1.3
40           5.1          1.5
41           5.0          1.3
42           4.5          1.3
43           4.4          1.3
44           5.0          1.6
45           5.1          1.9
46           4.8          1.4
47           5.1          1.6
48           4.6          1.4
49           5.3          1.5
50           5.0          1.4
51           7.0          4.7
52           6.4          4.5
53           6.9          4.9
54           5.5          4.0
55           6.5          4.6
56           5.7          4.5
57           6.3          4.7
58           4.9          3.3
59           6.6          4.6
60           5.2          3.9
61           5.0          3.5
62           5.9          4.2
63           6.0          4.0
64           6.1          4.7
65           5.6          3.6
66           6.7          4.4
67           5.6          4.5
68           5.8          4.1
69           6.2          4.5
70           5.6          3.9
71           5.9          4.8
72           6.1          4.0
73           6.3          4.9
74           6.1          4.7
75           6.4          4.3
76           6.6          4.4
77           6.8          4.8
78           6.7          5.0
79           6.0          4.5
80           5.7          3.5
81           5.5          3.8
82           5.5          3.7
83           5.8          3.9
84           6.0          5.1
85           5.4          4.5
86           6.0          4.5
87           6.7          4.7
88           6.3          4.4
89           5.6          4.1
90           5.5          4.0
91           5.5          4.4
92           6.1          4.6
93           5.8          4.0
94           5.0          3.3
95           5.6          4.2
96           5.7          4.2
97           5.7          4.2
98           6.2          4.3
99           5.1          3.0
100          5.7          4.1
101          6.3          6.0
102          5.8          5.1
103          7.1          5.9
104          6.3          5.6
105          6.5          5.8
106          7.6          6.6
107          4.9          4.5
108          7.3          6.3
109          6.7          5.8
110          7.2          6.1
111          6.5          5.1
112          6.4          5.3
113          6.8          5.5
114          5.7          5.0
115          5.8          5.1
116          6.4          5.3
117          6.5          5.5
118          7.7          6.7
119          7.7          6.9
120          6.0          5.0
121          6.9          5.7
122          5.6          4.9
123          7.7          6.7
124          6.3          4.9
125          6.7          5.7
126          7.2          6.0
127          6.2          4.8
128          6.1          4.9
129          6.4          5.6
130          7.2          5.8
131          7.4          6.1
132          7.9          6.4
133          6.4          5.6
134          6.3          5.1
135          6.1          5.6
136          7.7          6.1
137          6.3          5.6
138          6.4          5.5
139          6.0          4.8
140          6.9          5.4
141          6.7          5.6
142          6.9          5.1
143          5.8          5.1
144          6.8          5.9
145          6.7          5.7
146          6.7          5.2
147          6.3          5.0
148          6.5          5.2
149          6.2          5.4
150          5.9          5.1

Select a range of columns by index:

iris[, 1:3]
    Sepal.Length Sepal.Width Petal.Length
1            5.1         3.5          1.4
2            4.9         3.0          1.4
3            4.7         3.2          1.3
4            4.6         3.1          1.5
5            5.0         3.6          1.4
6            5.4         3.9          1.7
7            4.6         3.4          1.4
8            5.0         3.4          1.5
9            4.4         2.9          1.4
10           4.9         3.1          1.5
11           5.4         3.7          1.5
12           4.8         3.4          1.6
13           4.8         3.0          1.4
14           4.3         3.0          1.1
15           5.8         4.0          1.2
16           5.7         4.4          1.5
17           5.4         3.9          1.3
18           5.1         3.5          1.4
19           5.7         3.8          1.7
20           5.1         3.8          1.5
21           5.4         3.4          1.7
22           5.1         3.7          1.5
23           4.6         3.6          1.0
24           5.1         3.3          1.7
25           4.8         3.4          1.9
26           5.0         3.0          1.6
27           5.0         3.4          1.6
28           5.2         3.5          1.5
29           5.2         3.4          1.4
30           4.7         3.2          1.6
31           4.8         3.1          1.6
32           5.4         3.4          1.5
33           5.2         4.1          1.5
34           5.5         4.2          1.4
35           4.9         3.1          1.5
36           5.0         3.2          1.2
37           5.5         3.5          1.3
38           4.9         3.6          1.4
39           4.4         3.0          1.3
40           5.1         3.4          1.5
41           5.0         3.5          1.3
42           4.5         2.3          1.3
43           4.4         3.2          1.3
44           5.0         3.5          1.6
45           5.1         3.8          1.9
46           4.8         3.0          1.4
47           5.1         3.8          1.6
48           4.6         3.2          1.4
49           5.3         3.7          1.5
50           5.0         3.3          1.4
51           7.0         3.2          4.7
52           6.4         3.2          4.5
53           6.9         3.1          4.9
54           5.5         2.3          4.0
55           6.5         2.8          4.6
56           5.7         2.8          4.5
57           6.3         3.3          4.7
58           4.9         2.4          3.3
59           6.6         2.9          4.6
60           5.2         2.7          3.9
61           5.0         2.0          3.5
62           5.9         3.0          4.2
63           6.0         2.2          4.0
64           6.1         2.9          4.7
65           5.6         2.9          3.6
66           6.7         3.1          4.4
67           5.6         3.0          4.5
68           5.8         2.7          4.1
69           6.2         2.2          4.5
70           5.6         2.5          3.9
71           5.9         3.2          4.8
72           6.1         2.8          4.0
73           6.3         2.5          4.9
74           6.1         2.8          4.7
75           6.4         2.9          4.3
76           6.6         3.0          4.4
77           6.8         2.8          4.8
78           6.7         3.0          5.0
79           6.0         2.9          4.5
80           5.7         2.6          3.5
81           5.5         2.4          3.8
82           5.5         2.4          3.7
83           5.8         2.7          3.9
84           6.0         2.7          5.1
85           5.4         3.0          4.5
86           6.0         3.4          4.5
87           6.7         3.1          4.7
88           6.3         2.3          4.4
89           5.6         3.0          4.1
90           5.5         2.5          4.0
91           5.5         2.6          4.4
92           6.1         3.0          4.6
93           5.8         2.6          4.0
94           5.0         2.3          3.3
95           5.6         2.7          4.2
96           5.7         3.0          4.2
97           5.7         2.9          4.2
98           6.2         2.9          4.3
99           5.1         2.5          3.0
100          5.7         2.8          4.1
101          6.3         3.3          6.0
102          5.8         2.7          5.1
103          7.1         3.0          5.9
104          6.3         2.9          5.6
105          6.5         3.0          5.8
106          7.6         3.0          6.6
107          4.9         2.5          4.5
108          7.3         2.9          6.3
109          6.7         2.5          5.8
110          7.2         3.6          6.1
111          6.5         3.2          5.1
112          6.4         2.7          5.3
113          6.8         3.0          5.5
114          5.7         2.5          5.0
115          5.8         2.8          5.1
116          6.4         3.2          5.3
117          6.5         3.0          5.5
118          7.7         3.8          6.7
119          7.7         2.6          6.9
120          6.0         2.2          5.0
121          6.9         3.2          5.7
122          5.6         2.8          4.9
123          7.7         2.8          6.7
124          6.3         2.7          4.9
125          6.7         3.3          5.7
126          7.2         3.2          6.0
127          6.2         2.8          4.8
128          6.1         3.0          4.9
129          6.4         2.8          5.6
130          7.2         3.0          5.8
131          7.4         2.8          6.1
132          7.9         3.8          6.4
133          6.4         2.8          5.6
134          6.3         2.8          5.1
135          6.1         2.6          5.6
136          7.7         3.0          6.1
137          6.3         3.4          5.6
138          6.4         3.1          5.5
139          6.0         3.0          4.8
140          6.9         3.1          5.4
141          6.7         3.1          5.6
142          6.9         3.1          5.1
143          5.8         2.7          5.1
144          6.8         3.2          5.9
145          6.7         3.3          5.7
146          6.7         3.0          5.2
147          6.3         2.5          5.0
148          6.5         3.0          5.2
149          6.2         3.4          5.4
150          5.9         3.0          5.1
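
Printing all 150 rows is rarely needed just to check a selection. As a small sketch (the object names are illustrative), head() previews the first few rows, and drop = FALSE controls what comes back when only one column is selected:

```r
# Preview the first rows instead of printing all 150
print(head(iris[, c("Sepal.Length", "Petal.Length")], 3))

# Selecting a single column with [, j] returns a plain vector...
one_col_vec <- iris[, 1]
print(class(one_col_vec))

# ...unless drop = FALSE is set, which keeps the data frame structure
one_col_df <- iris[, 1, drop = FALSE]
print(class(one_col_df))
```

This matters for functions such as cor(), which accept either form but label the output differently depending on whether they receive a vector or a data frame.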

Selecting by Rows

Selecting by rows is in some ways similar to selecting by columns, except that the index goes before the comma:

iris[1:50, ]
   Sepal.Length Sepal.Width Petal.Length Petal.Width Species
1           5.1         3.5          1.4         0.2  setosa
2           4.9         3.0          1.4         0.2  setosa
3           4.7         3.2          1.3         0.2  setosa
4           4.6         3.1          1.5         0.2  setosa
5           5.0         3.6          1.4         0.2  setosa
6           5.4         3.9          1.7         0.4  setosa
7           4.6         3.4          1.4         0.3  setosa
8           5.0         3.4          1.5         0.2  setosa
9           4.4         2.9          1.4         0.2  setosa
10          4.9         3.1          1.5         0.1  setosa
11          5.4         3.7          1.5         0.2  setosa
12          4.8         3.4          1.6         0.2  setosa
13          4.8         3.0          1.4         0.1  setosa
14          4.3         3.0          1.1         0.1  setosa
15          5.8         4.0          1.2         0.2  setosa
16          5.7         4.4          1.5         0.4  setosa
17          5.4         3.9          1.3         0.4  setosa
18          5.1         3.5          1.4         0.3  setosa
19          5.7         3.8          1.7         0.3  setosa
20          5.1         3.8          1.5         0.3  setosa
21          5.4         3.4          1.7         0.2  setosa
22          5.1         3.7          1.5         0.4  setosa
23          4.6         3.6          1.0         0.2  setosa
24          5.1         3.3          1.7         0.5  setosa
25          4.8         3.4          1.9         0.2  setosa
26          5.0         3.0          1.6         0.2  setosa
27          5.0         3.4          1.6         0.4  setosa
28          5.2         3.5          1.5         0.2  setosa
29          5.2         3.4          1.4         0.2  setosa
30          4.7         3.2          1.6         0.2  setosa
31          4.8         3.1          1.6         0.2  setosa
32          5.4         3.4          1.5         0.4  setosa
33          5.2         4.1          1.5         0.1  setosa
34          5.5         4.2          1.4         0.2  setosa
35          4.9         3.1          1.5         0.2  setosa
36          5.0         3.2          1.2         0.2  setosa
37          5.5         3.5          1.3         0.2  setosa
38          4.9         3.6          1.4         0.1  setosa
39          4.4         3.0          1.3         0.2  setosa
40          5.1         3.4          1.5         0.2  setosa
41          5.0         3.5          1.3         0.3  setosa
42          4.5         2.3          1.3         0.3  setosa
43          4.4         3.2          1.3         0.2  setosa
44          5.0         3.5          1.6         0.6  setosa
45          5.1         3.8          1.9         0.4  setosa
46          4.8         3.0          1.4         0.3  setosa
47          5.1         3.8          1.6         0.2  setosa
48          4.6         3.2          1.4         0.2  setosa
49          5.3         3.7          1.5         0.2  setosa
50          5.0         3.3          1.4         0.2  setosa
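
R counts rows starting from 1, not 0. As a quick sketch (object names are illustrative), a range that starts at 0 selects the same rows as one that starts at 1, because an index of 0 is silently dropped:

```r
# R counts rows from 1; an index of 0 selects nothing and is dropped
with_zero <- iris[0:50, ]
from_one  <- iris[1:50, ]
print(nrow(with_zero))
print(isTRUE(all.equal(with_zero, from_one)))

# head() is a common shorthand for "the first n rows"
first_50 <- head(iris, 50)
print(nrow(first_50))
```

Writing the range as 1:50 makes the intent clearer, especially for readers coming from languages that index from 0.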

However, rather than selecting by index it is often more helpful to select based on certain conditions. What could these conditions look like?

  • The species is setosa

  • The sepals are less than 6 units long

Selecting by Row based on condition

iris[iris$Species == "setosa", ]
   Sepal.Length Sepal.Width Petal.Length Petal.Width Species
1           5.1         3.5          1.4         0.2  setosa
2           4.9         3.0          1.4         0.2  setosa
3           4.7         3.2          1.3         0.2  setosa
4           4.6         3.1          1.5         0.2  setosa
5           5.0         3.6          1.4         0.2  setosa
6           5.4         3.9          1.7         0.4  setosa
7           4.6         3.4          1.4         0.3  setosa
8           5.0         3.4          1.5         0.2  setosa
9           4.4         2.9          1.4         0.2  setosa
10          4.9         3.1          1.5         0.1  setosa
11          5.4         3.7          1.5         0.2  setosa
12          4.8         3.4          1.6         0.2  setosa
13          4.8         3.0          1.4         0.1  setosa
14          4.3         3.0          1.1         0.1  setosa
15          5.8         4.0          1.2         0.2  setosa
16          5.7         4.4          1.5         0.4  setosa
17          5.4         3.9          1.3         0.4  setosa
18          5.1         3.5          1.4         0.3  setosa
19          5.7         3.8          1.7         0.3  setosa
20          5.1         3.8          1.5         0.3  setosa
21          5.4         3.4          1.7         0.2  setosa
22          5.1         3.7          1.5         0.4  setosa
23          4.6         3.6          1.0         0.2  setosa
24          5.1         3.3          1.7         0.5  setosa
25          4.8         3.4          1.9         0.2  setosa
26          5.0         3.0          1.6         0.2  setosa
27          5.0         3.4          1.6         0.4  setosa
28          5.2         3.5          1.5         0.2  setosa
29          5.2         3.4          1.4         0.2  setosa
30          4.7         3.2          1.6         0.2  setosa
31          4.8         3.1          1.6         0.2  setosa
32          5.4         3.4          1.5         0.4  setosa
33          5.2         4.1          1.5         0.1  setosa
34          5.5         4.2          1.4         0.2  setosa
35          4.9         3.1          1.5         0.2  setosa
36          5.0         3.2          1.2         0.2  setosa
37          5.5         3.5          1.3         0.2  setosa
38          4.9         3.6          1.4         0.1  setosa
39          4.4         3.0          1.3         0.2  setosa
40          5.1         3.4          1.5         0.2  setosa
41          5.0         3.5          1.3         0.3  setosa
42          4.5         2.3          1.3         0.3  setosa
43          4.4         3.2          1.3         0.2  setosa
44          5.0         3.5          1.6         0.6  setosa
45          5.1         3.8          1.9         0.4  setosa
46          4.8         3.0          1.4         0.3  setosa
47          5.1         3.8          1.6         0.2  setosa
48          4.6         3.2          1.4         0.2  setosa
49          5.3         3.7          1.5         0.2  setosa
50          5.0         3.3          1.4         0.2  setosa
  • == compares the values on either side to see if they are equal. This allows us to check every row and keep only those rows in which iris$Species is equal to "setosa".

  • We can do similar operations with < or >.

iris[iris$Sepal.Length < 6, ]
    Sepal.Length Sepal.Width Petal.Length Petal.Width    Species
1            5.1         3.5          1.4         0.2     setosa
2            4.9         3.0          1.4         0.2     setosa
3            4.7         3.2          1.3         0.2     setosa
4            4.6         3.1          1.5         0.2     setosa
5            5.0         3.6          1.4         0.2     setosa
6            5.4         3.9          1.7         0.4     setosa
7            4.6         3.4          1.4         0.3     setosa
8            5.0         3.4          1.5         0.2     setosa
9            4.4         2.9          1.4         0.2     setosa
10           4.9         3.1          1.5         0.1     setosa
11           5.4         3.7          1.5         0.2     setosa
12           4.8         3.4          1.6         0.2     setosa
13           4.8         3.0          1.4         0.1     setosa
14           4.3         3.0          1.1         0.1     setosa
15           5.8         4.0          1.2         0.2     setosa
16           5.7         4.4          1.5         0.4     setosa
17           5.4         3.9          1.3         0.4     setosa
18           5.1         3.5          1.4         0.3     setosa
19           5.7         3.8          1.7         0.3     setosa
20           5.1         3.8          1.5         0.3     setosa
21           5.4         3.4          1.7         0.2     setosa
22           5.1         3.7          1.5         0.4     setosa
23           4.6         3.6          1.0         0.2     setosa
24           5.1         3.3          1.7         0.5     setosa
25           4.8         3.4          1.9         0.2     setosa
26           5.0         3.0          1.6         0.2     setosa
27           5.0         3.4          1.6         0.4     setosa
28           5.2         3.5          1.5         0.2     setosa
29           5.2         3.4          1.4         0.2     setosa
30           4.7         3.2          1.6         0.2     setosa
31           4.8         3.1          1.6         0.2     setosa
32           5.4         3.4          1.5         0.4     setosa
33           5.2         4.1          1.5         0.1     setosa
34           5.5         4.2          1.4         0.2     setosa
35           4.9         3.1          1.5         0.2     setosa
36           5.0         3.2          1.2         0.2     setosa
37           5.5         3.5          1.3         0.2     setosa
38           4.9         3.6          1.4         0.1     setosa
39           4.4         3.0          1.3         0.2     setosa
40           5.1         3.4          1.5         0.2     setosa
41           5.0         3.5          1.3         0.3     setosa
42           4.5         2.3          1.3         0.3     setosa
43           4.4         3.2          1.3         0.2     setosa
44           5.0         3.5          1.6         0.6     setosa
45           5.1         3.8          1.9         0.4     setosa
46           4.8         3.0          1.4         0.3     setosa
47           5.1         3.8          1.6         0.2     setosa
48           4.6         3.2          1.4         0.2     setosa
49           5.3         3.7          1.5         0.2     setosa
50           5.0         3.3          1.4         0.2     setosa
54           5.5         2.3          4.0         1.3 versicolor
56           5.7         2.8          4.5         1.3 versicolor
58           4.9         2.4          3.3         1.0 versicolor
60           5.2         2.7          3.9         1.4 versicolor
61           5.0         2.0          3.5         1.0 versicolor
62           5.9         3.0          4.2         1.5 versicolor
65           5.6         2.9          3.6         1.3 versicolor
67           5.6         3.0          4.5         1.5 versicolor
68           5.8         2.7          4.1         1.0 versicolor
70           5.6         2.5          3.9         1.1 versicolor
71           5.9         3.2          4.8         1.8 versicolor
80           5.7         2.6          3.5         1.0 versicolor
81           5.5         2.4          3.8         1.1 versicolor
82           5.5         2.4          3.7         1.0 versicolor
83           5.8         2.7          3.9         1.2 versicolor
85           5.4         3.0          4.5         1.5 versicolor
89           5.6         3.0          4.1         1.3 versicolor
90           5.5         2.5          4.0         1.3 versicolor
91           5.5         2.6          4.4         1.2 versicolor
93           5.8         2.6          4.0         1.2 versicolor
94           5.0         2.3          3.3         1.0 versicolor
95           5.6         2.7          4.2         1.3 versicolor
96           5.7         3.0          4.2         1.2 versicolor
97           5.7         2.9          4.2         1.3 versicolor
99           5.1         2.5          3.0         1.1 versicolor
100          5.7         2.8          4.1         1.3 versicolor
102          5.8         2.7          5.1         1.9  virginica
107          4.9         2.5          4.5         1.7  virginica
114          5.7         2.5          5.0         2.0  virginica
115          5.8         2.8          5.1         2.4  virginica
122          5.6         2.8          4.9         2.0  virginica
143          5.8         2.7          5.1         1.9  virginica
150          5.9         3.0          5.1         1.8  virginica
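
To see why these selections work, it can help to look at a condition on its own, outside the square brackets. The sketch below is only an illustration: the variable names is_setosa, is_short and setosa_rows are made up for this example and do not appear elsewhere in the lab. It uses the built-in iris data, and shows that a condition produces one TRUE or FALSE per row, which R then uses to decide which rows to keep.

```r
# A condition returns a logical vector: one TRUE/FALSE for every row of iris
is_setosa <- iris$Species == "setosa"
head(is_setosa)      # peek at the first few values
length(is_setosa)    # same as the number of rows in iris

# TRUE counts as 1 and FALSE as 0, so sum() counts the matching rows
sum(is_setosa)

# The same idea works with < (names here are illustrative only)
is_short <- iris$Sepal.Length < 6
sum(is_short)

# Putting the stored vector inside [ , ] keeps only the rows marked TRUE,
# which is the same as writing the condition directly inside the brackets
setosa_rows <- iris[is_setosa, ]
nrow(setosa_rows)
```

Storing the condition in a variable first is optional, but it makes it easy to count how many rows match before printing all of them.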

Combining Conditions

If we want to find the irises that are setosas and also have sepal lengths less than 5, we can do so using the & operator.

iris[iris$Species == "setosa" & iris$Sepal.Length < 5, ]
   Sepal.Length Sepal.Width Petal.Length Petal.Width Species
2           4.9         3.0          1.4         0.2  setosa
3           4.7         3.2          1.3         0.2  setosa
4           4.6         3.1          1.5         0.2  setosa
7           4.6         3.4          1.4         0.3  setosa
9           4.4         2.9          1.4         0.2  setosa
10          4.9         3.1          1.5         0.1  setosa
12          4.8         3.4          1.6         0.2  setosa
13          4.8         3.0          1.4         0.1  setosa
14          4.3         3.0          1.1         0.1  setosa
23          4.6         3.6          1.0         0.2  setosa
25          4.8         3.4          1.9         0.2  setosa
30          4.7         3.2          1.6         0.2  setosa
31          4.8         3.1          1.6         0.2  setosa
35          4.9         3.1          1.5         0.2  setosa
38          4.9         3.6          1.4         0.1  setosa
39          4.4         3.0          1.3         0.2  setosa
42          4.5         2.3          1.3         0.3  setosa
43          4.4         3.2          1.3         0.2  setosa
46          4.8         3.0          1.4         0.3  setosa
48          4.6         3.2          1.4         0.2  setosa
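
As a rough sketch of what & is doing here, we can build each condition separately and then combine them. The names is_setosa, is_under_5 and both are assumptions made for this example only. The & operator compares the two logical vectors position by position, so a row is kept only when both conditions are TRUE for that row.

```r
# Build each condition separately (variable names are illustrative only)
is_setosa  <- iris$Species == "setosa"
is_under_5 <- iris$Sepal.Length < 5

# & works row by row: TRUE only where BOTH conditions are TRUE
both <- is_setosa & is_under_5

# Count how many rows satisfy each condition, and how many satisfy both
sum(is_setosa)
sum(is_under_5)
sum(both)

# Using the combined vector selects the same rows as the one-line version
short_setosas <- iris[both, ]
nrow(short_setosas)
```

Because a row has to pass both checks, the combined count can never be larger than either individual count.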

If we want to find irises that are setosas or have sepal lengths greater than 6, we can do so using the | operator.

iris[iris$Species == "setosa" | iris$Sepal.Length > 6, ]
    Sepal.Length Sepal.Width Petal.Length Petal.Width    Species
1            5.1         3.5          1.4         0.2     setosa
2            4.9         3.0          1.4         0.2     setosa
3            4.7         3.2          1.3         0.2     setosa
4            4.6         3.1          1.5         0.2     setosa
5            5.0         3.6          1.4         0.2     setosa
6            5.4         3.9          1.7         0.4     setosa
7            4.6         3.4          1.4         0.3     setosa
8            5.0         3.4          1.5         0.2     setosa
9            4.4         2.9          1.4         0.2     setosa
10           4.9         3.1          1.5         0.1     setosa
11           5.4         3.7          1.5         0.2     setosa
12           4.8         3.4          1.6         0.2     setosa
13           4.8         3.0          1.4         0.1     setosa
14           4.3         3.0          1.1         0.1     setosa
15           5.8         4.0          1.2         0.2     setosa
16           5.7         4.4          1.5         0.4     setosa
17           5.4         3.9          1.3         0.4     setosa
18           5.1         3.5          1.4         0.3     setosa
19           5.7         3.8          1.7         0.3     setosa
20           5.1         3.8          1.5         0.3     setosa
21           5.4         3.4          1.7         0.2     setosa
22           5.1         3.7          1.5         0.4     setosa
23           4.6         3.6          1.0         0.2     setosa
24           5.1         3.3          1.7         0.5     setosa
25           4.8         3.4          1.9         0.2     setosa
26           5.0         3.0          1.6         0.2     setosa
27           5.0         3.4          1.6         0.4     setosa
28           5.2         3.5          1.5         0.2     setosa
29           5.2         3.4          1.4         0.2     setosa
30           4.7         3.2          1.6         0.2     setosa
31           4.8         3.1          1.6         0.2     setosa
32           5.4         3.4          1.5         0.4     setosa
33           5.2         4.1          1.5         0.1     setosa
34           5.5         4.2          1.4         0.2     setosa
35           4.9         3.1          1.5         0.2     setosa
36           5.0         3.2          1.2         0.2     setosa
37           5.5         3.5          1.3         0.2     setosa
38           4.9         3.6          1.4         0.1     setosa
39           4.4         3.0          1.3         0.2     setosa
40           5.1         3.4          1.5         0.2     setosa
41           5.0         3.5          1.3         0.3     setosa
42           4.5         2.3          1.3         0.3     setosa
43           4.4         3.2          1.3         0.2     setosa
44           5.0         3.5          1.6         0.6     setosa
45           5.1         3.8          1.9         0.4     setosa
46           4.8         3.0          1.4         0.3     setosa
47           5.1         3.8          1.6         0.2     setosa
48           4.6         3.2          1.4         0.2     setosa
49           5.3         3.7          1.5         0.2     setosa
50           5.0         3.3          1.4         0.2     setosa
51           7.0         3.2          4.7         1.4 versicolor
52           6.4         3.2          4.5         1.5 versicolor
53           6.9         3.1          4.9         1.5 versicolor
55           6.5         2.8          4.6         1.5 versicolor
57           6.3         3.3          4.7         1.6 versicolor
59           6.6         2.9          4.6         1.3 versicolor
64           6.1         2.9          4.7         1.4 versicolor
66           6.7         3.1          4.4         1.4 versicolor
69           6.2         2.2          4.5         1.5 versicolor
72           6.1         2.8          4.0         1.3 versicolor
73           6.3         2.5          4.9         1.5 versicolor
74           6.1         2.8          4.7         1.2 versicolor
75           6.4         2.9          4.3         1.3 versicolor
76           6.6         3.0          4.4         1.4 versicolor
77           6.8         2.8          4.8         1.4 versicolor
78           6.7         3.0          5.0         1.7 versicolor
87           6.7         3.1          4.7         1.5 versicolor
88           6.3         2.3          4.4         1.3 versicolor
92           6.1         3.0          4.6         1.4 versicolor
98           6.2         2.9          4.3         1.3 versicolor
101          6.3         3.3          6.0         2.5  virginica
103          7.1         3.0          5.9         2.1  virginica
104          6.3         2.9          5.6         1.8  virginica
105          6.5         3.0          5.8         2.2  virginica
106          7.6         3.0          6.6         2.1  virginica
108          7.3         2.9          6.3         1.8  virginica
109          6.7         2.5          5.8         1.8  virginica
110          7.2         3.6          6.1         2.5  virginica
111          6.5         3.2          5.1         2.0  virginica
112          6.4         2.7          5.3         1.9  virginica
113          6.8         3.0          5.5         2.1  virginica
116          6.4         3.2          5.3         2.3  virginica
117          6.5         3.0          5.5         1.8  virginica
118          7.7         3.8          6.7         2.2  virginica
119          7.7         2.6          6.9         2.3  virginica
121          6.9         3.2          5.7         2.3  virginica
123          7.7         2.8          6.7         2.0  virginica
124          6.3         2.7          4.9         1.8  virginica
125          6.7         3.3          5.7         2.1  virginica
126          7.2         3.2          6.0         1.8  virginica
127          6.2         2.8          4.8         1.8  virginica
128          6.1         3.0          4.9         1.8  virginica
129          6.4         2.8          5.6         2.1  virginica
130          7.2         3.0          5.8         1.6  virginica
131          7.4         2.8          6.1         1.9  virginica
132          7.9         3.8          6.4         2.0  virginica
133          6.4         2.8          5.6         2.2  virginica
134          6.3         2.8          5.1         1.5  virginica
135          6.1         2.6          5.6         1.4  virginica
136          7.7         3.0          6.1         2.3  virginica
137          6.3         3.4          5.6         2.4  virginica
138          6.4         3.1          5.5         1.8  virginica
140          6.9         3.1          5.4         2.1  virginica
141          6.7         3.1          5.6         2.4  virginica
142          6.9         3.1          5.1         2.3  virginica
144          6.8         3.2          5.9         2.3  virginica
145          6.7         3.3          5.7         2.5  virginica
146          6.7         3.0          5.2         2.3  virginica
147          6.3         2.5          5.0         1.9  virginica
148          6.5         3.0          5.2         2.0  virginica
149          6.2         3.4          5.4         2.3  virginica
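
The | operator can be checked in the same way as &. In the sketch below, the names is_setosa, is_over_6 and either are made up for illustration. A row is kept when at least one of the two conditions is TRUE, so the number of rows selected equals the rows matching the first condition, plus the rows matching the second, minus any rows that match both (so they are not counted twice).

```r
# Build each condition separately (variable names are illustrative only)
is_setosa <- iris$Species == "setosa"
is_over_6 <- iris$Sepal.Length > 6

# | works row by row: TRUE where AT LEAST ONE condition is TRUE
either <- is_setosa | is_over_6

# Rows matching each condition, both conditions, and either condition
n_setosa <- sum(is_setosa)
n_over_6 <- sum(is_over_6)
n_both   <- sum(is_setosa & is_over_6)
n_either <- sum(either)

# The "or" count is the two separate counts minus the overlap
n_either == n_setosa + n_over_6 - n_both

# Selecting with the combined vector matches the one-line version
nrow(iris[either, ])
```

In this dataset no setosa has a sepal length above 6, so the overlap is zero and the two groups simply add together.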