Nathan Byers
May 30, 2014
airquality
data framedata(airquality)
?airquality
First, the best way to take a quick look at the top of a data frame is to use the head()
function
head(airquality)
Ozone Solar.R Wind Temp Month Day
1 41 190 7.4 67 5 1
2 36 118 8.0 72 5 2
3 12 149 12.6 74 5 3
4 18 313 11.5 62 5 4
5 NA NA 14.3 56 5 5
6 28 NA 14.9 66 5 6
You can specify the number of lines to display by using the n =
parameter
head(airquality, n = 3)
Ozone Solar.R Wind Temp Month Day
1 41 190 7.4 67 5 1
2 36 118 8.0 72 5 2
3 12 149 12.6 74 5 3
You can also look at the bottom of the data frame by using tail()
tail(airquality)
Ozone Solar.R Wind Temp Month Day
148 14 20 16.6 63 9 25
149 30 193 6.9 70 9 26
150 NA 145 13.2 77 9 27
151 14 191 14.3 75 9 28
152 18 131 8.0 76 9 29
153 20 223 11.5 68 9 30
[
function, i.e. bracketsairquality[ , ]
[rows, columns]
To get one row of the data frame, specify the row number you would like in the brackets, on the left side of the comma
airquality[1, ]
Ozone Solar.R Wind Temp Month Day
1 41 190 7.4 67 5 1
If you want more than one row, you can supply a vector of row numbers
airquality[c(1, 2, 3), ]
Ozone Solar.R Wind Temp Month Day
1 41 190 7.4 67 5 1
2 36 118 8.0 72 5 2
3 12 149 12.6 74 5 3
To get a column from the data frame, specify the column number in the brackets, to the right of the comma (it's returned as a vector)
airquality[, 1]
[1] 41 36 12 18 NA 28 23 19 8 NA 7 16 11 14 18 14 34
[18] 6 30 11 1 11 4 32 NA NA NA 23 45 115 37 NA NA NA
[35] NA NA NA 29 NA 71 39 NA NA 23 NA NA 21 37 20 12 13
[52] NA NA NA NA NA NA NA NA NA NA 135 49 32 NA 64 40 77
[69] 97 97 85 NA 10 27 NA 7 48 35 61 79 63 16 NA NA 80
[86] 108 20 52 82 50 64 59 39 9 16 78 35 66 122 89 110 NA
[103] NA 44 28 65 NA 22 59 23 31 44 21 9 NA 45 168 73 NA
[120] 76 118 84 85 96 78 73 91 47 32 20 23 21 24 44 21 28
[137] 9 13 46 18 13 24 16 13 23 36 7 14 30 NA 14 18 20
You can obtain more than one column by supplying a vector of column numbers
airquality[, c(3, 4, 5)]
Wind Temp Month
1 7.4 67 5
2 8.0 72 5
3 12.6 74 5
4 11.5 62 5
5 14.3 56 5
6 14.9 66 5
7 8.6 65 5
8 13.8 59 5
9 20.1 61 5
10 8.6 69 5
11 6.9 74 5
12 9.7 69 5
13 9.2 66 5
14 10.9 68 5
15 13.2 58 5
16 11.5 64 5
17 12.0 66 5
18 18.4 57 5
19 11.5 68 5
20 9.7 62 5
21 9.7 59 5
22 16.6 73 5
23 9.7 61 5
24 12.0 61 5
25 16.6 57 5
26 14.9 58 5
27 8.0 57 5
28 12.0 67 5
29 14.9 81 5
30 5.7 79 5
31 7.4 76 5
32 8.6 78 6
33 9.7 74 6
34 16.1 67 6
35 9.2 84 6
36 8.6 85 6
37 14.3 79 6
38 9.7 82 6
39 6.9 87 6
40 13.8 90 6
41 11.5 87 6
42 10.9 93 6
43 9.2 92 6
44 8.0 82 6
45 13.8 80 6
46 11.5 79 6
47 14.9 77 6
48 20.7 72 6
49 9.2 65 6
50 11.5 73 6
51 10.3 76 6
52 6.3 77 6
53 1.7 76 6
54 4.6 76 6
55 6.3 76 6
56 8.0 75 6
57 8.0 78 6
58 10.3 73 6
59 11.5 80 6
60 14.9 77 6
61 8.0 83 6
62 4.1 84 7
63 9.2 85 7
64 9.2 81 7
65 10.9 84 7
66 4.6 83 7
67 10.9 83 7
68 5.1 88 7
69 6.3 92 7
70 5.7 92 7
71 7.4 89 7
72 8.6 82 7
73 14.3 73 7
74 14.9 81 7
75 14.9 91 7
76 14.3 80 7
77 6.9 81 7
78 10.3 82 7
79 6.3 84 7
80 5.1 87 7
81 11.5 85 7
82 6.9 74 7
83 9.7 81 7
84 11.5 82 7
85 8.6 86 7
86 8.0 85 7
87 8.6 82 7
88 12.0 86 7
89 7.4 88 7
90 7.4 86 7
91 7.4 83 7
92 9.2 81 7
93 6.9 81 8
94 13.8 81 8
95 7.4 82 8
96 6.9 86 8
97 7.4 85 8
98 4.6 87 8
99 4.0 89 8
100 10.3 90 8
101 8.0 90 8
102 8.6 92 8
103 11.5 86 8
104 11.5 86 8
105 11.5 82 8
106 9.7 80 8
107 11.5 79 8
108 10.3 77 8
109 6.3 79 8
110 7.4 76 8
111 10.9 78 8
112 10.3 78 8
113 15.5 77 8
114 14.3 72 8
115 12.6 75 8
116 9.7 79 8
117 3.4 81 8
118 8.0 86 8
119 5.7 88 8
120 9.7 97 8
121 2.3 94 8
122 6.3 96 8
123 6.3 94 8
124 6.9 91 9
125 5.1 92 9
126 2.8 93 9
127 4.6 93 9
128 7.4 87 9
129 15.5 84 9
130 10.9 80 9
131 10.3 78 9
132 10.9 75 9
133 9.7 73 9
134 14.9 81 9
135 15.5 76 9
136 6.3 77 9
137 10.9 71 9
138 11.5 71 9
139 6.9 78 9
140 13.8 67 9
141 10.3 76 9
142 10.3 68 9
143 8.0 82 9
144 12.6 64 9
145 9.2 71 9
146 10.3 81 9
147 10.3 69 9
148 16.6 63 9
149 6.9 70 9
150 13.2 77 9
151 14.3 75 9
152 8.0 76 9
153 11.5 68 9
Column names can also be used
airquality[, "Solar.R"]
[1] 190 118 149 313 NA NA 299 99 19 194 NA 256 290 274 65 334 307
[18] 78 322 44 8 320 25 92 66 266 NA 13 252 223 279 286 287 242
[35] 186 220 264 127 273 291 323 259 250 148 332 322 191 284 37 120 137
[52] 150 59 91 250 135 127 47 98 31 138 269 248 236 101 175 314 276
[69] 267 272 175 139 264 175 291 48 260 274 285 187 220 7 258 295 294
[86] 223 81 82 213 275 253 254 83 24 77 NA NA NA 255 229 207 222
[103] 137 192 273 157 64 71 51 115 244 190 259 36 255 212 238 215 153
[120] 203 225 237 188 167 197 183 189 95 92 252 220 230 259 236 259 238
[137] 24 112 237 224 27 238 201 238 14 139 49 20 193 145 191 131 223
Or a vector of column names
airquality[, c("Ozone", "Temp", "Day")]
Ozone Temp Day
1 41 67 1
2 36 72 2
3 12 74 3
4 18 62 4
5 NA 56 5
6 28 66 6
7 23 65 7
8 19 59 8
9 8 61 9
10 NA 69 10
11 7 74 11
12 16 69 12
13 11 66 13
14 14 68 14
15 18 58 15
16 14 64 16
17 34 66 17
18 6 57 18
19 30 68 19
20 11 62 20
21 1 59 21
22 11 73 22
23 4 61 23
24 32 61 24
25 NA 57 25
26 NA 58 26
27 NA 57 27
28 23 67 28
29 45 81 29
30 115 79 30
31 37 76 31
32 NA 78 1
33 NA 74 2
34 NA 67 3
35 NA 84 4
36 NA 85 5
37 NA 79 6
38 29 82 7
39 NA 87 8
40 71 90 9
41 39 87 10
42 NA 93 11
43 NA 92 12
44 23 82 13
45 NA 80 14
46 NA 79 15
47 21 77 16
48 37 72 17
49 20 65 18
50 12 73 19
51 13 76 20
52 NA 77 21
53 NA 76 22
54 NA 76 23
55 NA 76 24
56 NA 75 25
57 NA 78 26
58 NA 73 27
59 NA 80 28
60 NA 77 29
61 NA 83 30
62 135 84 1
63 49 85 2
64 32 81 3
65 NA 84 4
66 64 83 5
67 40 83 6
68 77 88 7
69 97 92 8
70 97 92 9
71 85 89 10
72 NA 82 11
73 10 73 12
74 27 81 13
75 NA 91 14
76 7 80 15
77 48 81 16
78 35 82 17
79 61 84 18
80 79 87 19
81 63 85 20
82 16 74 21
83 NA 81 22
84 NA 82 23
85 80 86 24
86 108 85 25
87 20 82 26
88 52 86 27
89 82 88 28
90 50 86 29
91 64 83 30
92 59 81 31
93 39 81 1
94 9 81 2
95 16 82 3
96 78 86 4
97 35 85 5
98 66 87 6
99 122 89 7
100 89 90 8
101 110 90 9
102 NA 92 10
103 NA 86 11
104 44 86 12
105 28 82 13
106 65 80 14
107 NA 79 15
108 22 77 16
109 59 79 17
110 23 76 18
111 31 78 19
112 44 78 20
113 21 77 21
114 9 72 22
115 NA 75 23
116 45 79 24
117 168 81 25
118 73 86 26
119 NA 88 27
120 76 97 28
121 118 94 29
122 84 96 30
123 85 94 31
124 96 91 1
125 78 92 2
126 73 93 3
127 91 93 4
128 47 87 5
129 32 84 6
130 20 80 7
131 23 78 8
132 21 75 9
133 24 73 10
134 44 81 11
135 21 76 12
136 28 77 13
137 9 71 14
138 13 71 15
139 46 78 16
140 18 67 17
141 13 76 18
142 24 68 19
143 16 82 20
144 13 64 21
145 23 71 22
146 36 81 23
147 7 69 24
148 14 63 25
149 30 70 26
150 NA 77 27
151 14 75 28
152 18 76 29
153 20 68 30
Both rows and columns can be specified
airquality[1:5, 3:5]
Wind Temp Month
1 7.4 67 5
2 8.0 72 5
3 12.6 74 5
4 11.5 62 5
5 14.3 56 5
airquality[(logical expression), ]
aq <- airquality
aq[(aq$Ozone > 15), ]
Ozone Solar.R Wind Temp Month Day
1 41 190 7.4 67 5 1
2 36 118 8.0 72 5 2
4 18 313 11.5 62 5 4
NA NA NA NA NA NA NA
6 28 NA 14.9 66 5 6
7 23 299 8.6 65 5 7
8 19 99 13.8 59 5 8
NA.1 NA NA NA NA NA NA
12 16 256 9.7 69 5 12
15 18 65 13.2 58 5 15
17 34 307 12.0 66 5 17
19 30 322 11.5 68 5 19
24 32 92 12.0 61 5 24
NA.2 NA NA NA NA NA NA
NA.3 NA NA NA NA NA NA
NA.4 NA NA NA NA NA NA
28 23 13 12.0 67 5 28
29 45 252 14.9 81 5 29
30 115 223 5.7 79 5 30
31 37 279 7.4 76 5 31
NA.5 NA NA NA NA NA NA
NA.6 NA NA NA NA NA NA
NA.7 NA NA NA NA NA NA
NA.8 NA NA NA NA NA NA
NA.9 NA NA NA NA NA NA
NA.10 NA NA NA NA NA NA
38 29 127 9.7 82 6 7
NA.11 NA NA NA NA NA NA
40 71 291 13.8 90 6 9
41 39 323 11.5 87 6 10
NA.12 NA NA NA NA NA NA
NA.13 NA NA NA NA NA NA
44 23 148 8.0 82 6 13
NA.14 NA NA NA NA NA NA
NA.15 NA NA NA NA NA NA
47 21 191 14.9 77 6 16
48 37 284 20.7 72 6 17
49 20 37 9.2 65 6 18
NA.16 NA NA NA NA NA NA
NA.17 NA NA NA NA NA NA
NA.18 NA NA NA NA NA NA
NA.19 NA NA NA NA NA NA
NA.20 NA NA NA NA NA NA
NA.21 NA NA NA NA NA NA
NA.22 NA NA NA NA NA NA
NA.23 NA NA NA NA NA NA
NA.24 NA NA NA NA NA NA
NA.25 NA NA NA NA NA NA
62 135 269 4.1 84 7 1
63 49 248 9.2 85 7 2
64 32 236 9.2 81 7 3
NA.26 NA NA NA NA NA NA
66 64 175 4.6 83 7 5
67 40 314 10.9 83 7 6
68 77 276 5.1 88 7 7
69 97 267 6.3 92 7 8
70 97 272 5.7 92 7 9
71 85 175 7.4 89 7 10
NA.27 NA NA NA NA NA NA
74 27 175 14.9 81 7 13
NA.28 NA NA NA NA NA NA
77 48 260 6.9 81 7 16
78 35 274 10.3 82 7 17
79 61 285 6.3 84 7 18
80 79 187 5.1 87 7 19
81 63 220 11.5 85 7 20
82 16 7 6.9 74 7 21
NA.29 NA NA NA NA NA NA
NA.30 NA NA NA NA NA NA
85 80 294 8.6 86 7 24
86 108 223 8.0 85 7 25
87 20 81 8.6 82 7 26
88 52 82 12.0 86 7 27
89 82 213 7.4 88 7 28
90 50 275 7.4 86 7 29
91 64 253 7.4 83 7 30
92 59 254 9.2 81 7 31
93 39 83 6.9 81 8 1
95 16 77 7.4 82 8 3
96 78 NA 6.9 86 8 4
97 35 NA 7.4 85 8 5
98 66 NA 4.6 87 8 6
99 122 255 4.0 89 8 7
100 89 229 10.3 90 8 8
101 110 207 8.0 90 8 9
NA.31 NA NA NA NA NA NA
NA.32 NA NA NA NA NA NA
104 44 192 11.5 86 8 12
105 28 273 11.5 82 8 13
106 65 157 9.7 80 8 14
NA.33 NA NA NA NA NA NA
108 22 71 10.3 77 8 16
109 59 51 6.3 79 8 17
110 23 115 7.4 76 8 18
111 31 244 10.9 78 8 19
112 44 190 10.3 78 8 20
113 21 259 15.5 77 8 21
NA.34 NA NA NA NA NA NA
116 45 212 9.7 79 8 24
117 168 238 3.4 81 8 25
118 73 215 8.0 86 8 26
NA.35 NA NA NA NA NA NA
120 76 203 9.7 97 8 28
121 118 225 2.3 94 8 29
122 84 237 6.3 96 8 30
123 85 188 6.3 94 8 31
124 96 167 6.9 91 9 1
125 78 197 5.1 92 9 2
126 73 183 2.8 93 9 3
127 91 189 4.6 93 9 4
128 47 95 7.4 87 9 5
129 32 92 15.5 84 9 6
130 20 252 10.9 80 9 7
131 23 220 10.3 78 9 8
132 21 230 10.9 75 9 9
133 24 259 9.7 73 9 10
134 44 236 14.9 81 9 11
135 21 259 15.5 76 9 12
136 28 238 6.3 77 9 13
139 46 237 6.9 78 9 16
140 18 224 13.8 67 9 17
142 24 238 10.3 68 9 19
143 16 201 8.0 82 9 20
145 23 14 9.2 71 9 22
146 36 139 10.3 81 9 23
149 30 193 6.9 70 9 26
NA.36 NA NA NA NA NA NA
152 18 131 8.0 76 9 29
153 20 223 11.5 68 9 30
logical.vector <- aq$Ozone > 15
class(logical.vector)
[1] "logical"
logical.vector[1:5] # return the first 5
[1] TRUE TRUE FALSE TRUE NA
logical.vector
to do the same thingaq[logical.vector, ]
Ozone Solar.R Wind Temp Month Day
1 41 190 7.4 67 5 1
2 36 118 8.0 72 5 2
4 18 313 11.5 62 5 4
NA NA NA NA NA NA NA
6 28 NA 14.9 66 5 6
7 23 299 8.6 65 5 7
8 19 99 13.8 59 5 8
NA.1 NA NA NA NA NA NA
12 16 256 9.7 69 5 12
15 18 65 13.2 58 5 15
17 34 307 12.0 66 5 17
19 30 322 11.5 68 5 19
24 32 92 12.0 61 5 24
NA.2 NA NA NA NA NA NA
NA.3 NA NA NA NA NA NA
NA.4 NA NA NA NA NA NA
28 23 13 12.0 67 5 28
29 45 252 14.9 81 5 29
30 115 223 5.7 79 5 30
31 37 279 7.4 76 5 31
NA.5 NA NA NA NA NA NA
NA.6 NA NA NA NA NA NA
NA.7 NA NA NA NA NA NA
NA.8 NA NA NA NA NA NA
NA.9 NA NA NA NA NA NA
NA.10 NA NA NA NA NA NA
38 29 127 9.7 82 6 7
NA.11 NA NA NA NA NA NA
40 71 291 13.8 90 6 9
41 39 323 11.5 87 6 10
NA.12 NA NA NA NA NA NA
NA.13 NA NA NA NA NA NA
44 23 148 8.0 82 6 13
NA.14 NA NA NA NA NA NA
NA.15 NA NA NA NA NA NA
47 21 191 14.9 77 6 16
48 37 284 20.7 72 6 17
49 20 37 9.2 65 6 18
NA.16 NA NA NA NA NA NA
NA.17 NA NA NA NA NA NA
NA.18 NA NA NA NA NA NA
NA.19 NA NA NA NA NA NA
NA.20 NA NA NA NA NA NA
NA.21 NA NA NA NA NA NA
NA.22 NA NA NA NA NA NA
NA.23 NA NA NA NA NA NA
NA.24 NA NA NA NA NA NA
NA.25 NA NA NA NA NA NA
62 135 269 4.1 84 7 1
63 49 248 9.2 85 7 2
64 32 236 9.2 81 7 3
NA.26 NA NA NA NA NA NA
66 64 175 4.6 83 7 5
67 40 314 10.9 83 7 6
68 77 276 5.1 88 7 7
69 97 267 6.3 92 7 8
70 97 272 5.7 92 7 9
71 85 175 7.4 89 7 10
NA.27 NA NA NA NA NA NA
74 27 175 14.9 81 7 13
NA.28 NA NA NA NA NA NA
77 48 260 6.9 81 7 16
78 35 274 10.3 82 7 17
79 61 285 6.3 84 7 18
80 79 187 5.1 87 7 19
81 63 220 11.5 85 7 20
82 16 7 6.9 74 7 21
NA.29 NA NA NA NA NA NA
NA.30 NA NA NA NA NA NA
85 80 294 8.6 86 7 24
86 108 223 8.0 85 7 25
87 20 81 8.6 82 7 26
88 52 82 12.0 86 7 27
89 82 213 7.4 88 7 28
90 50 275 7.4 86 7 29
91 64 253 7.4 83 7 30
92 59 254 9.2 81 7 31
93 39 83 6.9 81 8 1
95 16 77 7.4 82 8 3
96 78 NA 6.9 86 8 4
97 35 NA 7.4 85 8 5
98 66 NA 4.6 87 8 6
99 122 255 4.0 89 8 7
100 89 229 10.3 90 8 8
101 110 207 8.0 90 8 9
NA.31 NA NA NA NA NA NA
NA.32 NA NA NA NA NA NA
104 44 192 11.5 86 8 12
105 28 273 11.5 82 8 13
106 65 157 9.7 80 8 14
NA.33 NA NA NA NA NA NA
108 22 71 10.3 77 8 16
109 59 51 6.3 79 8 17
110 23 115 7.4 76 8 18
111 31 244 10.9 78 8 19
112 44 190 10.3 78 8 20
113 21 259 15.5 77 8 21
NA.34 NA NA NA NA NA NA
116 45 212 9.7 79 8 24
117 168 238 3.4 81 8 25
118 73 215 8.0 86 8 26
NA.35 NA NA NA NA NA NA
120 76 203 9.7 97 8 28
121 118 225 2.3 94 8 29
122 84 237 6.3 96 8 30
123 85 188 6.3 94 8 31
124 96 167 6.9 91 9 1
125 78 197 5.1 92 9 2
126 73 183 2.8 93 9 3
127 91 189 4.6 93 9 4
128 47 95 7.4 87 9 5
129 32 92 15.5 84 9 6
130 20 252 10.9 80 9 7
131 23 220 10.3 78 9 8
132 21 230 10.9 75 9 9
133 24 259 9.7 73 9 10
134 44 236 14.9 81 9 11
135 21 259 15.5 76 9 12
136 28 238 6.3 77 9 13
139 46 237 6.9 78 9 16
140 18 224 13.8 67 9 17
142 24 238 10.3 68 9 19
143 16 201 8.0 82 9 20
145 23 14 9.2 71 9 22
146 36 139 10.3 81 9 23
149 30 193 6.9 70 9 26
NA.36 NA NA NA NA NA NA
152 18 131 8.0 76 9 29
153 20 223 11.5 68 9 30
If we wanted all of the days in the 7th month, we could use ==
aq[(aq$Month == 7), ]
Ozone Solar.R Wind Temp Month Day
62 135 269 4.1 84 7 1
63 49 248 9.2 85 7 2
64 32 236 9.2 81 7 3
65 NA 101 10.9 84 7 4
66 64 175 4.6 83 7 5
67 40 314 10.9 83 7 6
68 77 276 5.1 88 7 7
69 97 267 6.3 92 7 8
70 97 272 5.7 92 7 9
71 85 175 7.4 89 7 10
72 NA 139 8.6 82 7 11
73 10 264 14.3 73 7 12
74 27 175 14.9 81 7 13
75 NA 291 14.9 91 7 14
76 7 48 14.3 80 7 15
77 48 260 6.9 81 7 16
78 35 274 10.3 82 7 17
79 61 285 6.3 84 7 18
80 79 187 5.1 87 7 19
81 63 220 11.5 85 7 20
82 16 7 6.9 74 7 21
83 NA 258 9.7 81 7 22
84 NA 295 11.5 82 7 23
85 80 294 8.6 86 7 24
86 108 223 8.0 85 7 25
87 20 81 8.6 82 7 26
88 52 82 12.0 86 7 27
89 82 213 7.4 88 7 28
90 50 275 7.4 86 7 29
91 64 253 7.4 83 7 30
92 59 254 9.2 81 7 31
Or if we want all days except the 6th day, use !=
aq[(aq$Day != 6), ]
Ozone Solar.R Wind Temp Month Day
1 41 190 7.4 67 5 1
2 36 118 8.0 72 5 2
3 12 149 12.6 74 5 3
4 18 313 11.5 62 5 4
5 NA NA 14.3 56 5 5
7 23 299 8.6 65 5 7
8 19 99 13.8 59 5 8
9 8 19 20.1 61 5 9
10 NA 194 8.6 69 5 10
11 7 NA 6.9 74 5 11
12 16 256 9.7 69 5 12
13 11 290 9.2 66 5 13
14 14 274 10.9 68 5 14
15 18 65 13.2 58 5 15
16 14 334 11.5 64 5 16
17 34 307 12.0 66 5 17
18 6 78 18.4 57 5 18
19 30 322 11.5 68 5 19
20 11 44 9.7 62 5 20
21 1 8 9.7 59 5 21
22 11 320 16.6 73 5 22
23 4 25 9.7 61 5 23
24 32 92 12.0 61 5 24
25 NA 66 16.6 57 5 25
26 NA 266 14.9 58 5 26
27 NA NA 8.0 57 5 27
28 23 13 12.0 67 5 28
29 45 252 14.9 81 5 29
30 115 223 5.7 79 5 30
31 37 279 7.4 76 5 31
32 NA 286 8.6 78 6 1
33 NA 287 9.7 74 6 2
34 NA 242 16.1 67 6 3
35 NA 186 9.2 84 6 4
36 NA 220 8.6 85 6 5
38 29 127 9.7 82 6 7
39 NA 273 6.9 87 6 8
40 71 291 13.8 90 6 9
41 39 323 11.5 87 6 10
42 NA 259 10.9 93 6 11
43 NA 250 9.2 92 6 12
44 23 148 8.0 82 6 13
45 NA 332 13.8 80 6 14
46 NA 322 11.5 79 6 15
47 21 191 14.9 77 6 16
48 37 284 20.7 72 6 17
49 20 37 9.2 65 6 18
50 12 120 11.5 73 6 19
51 13 137 10.3 76 6 20
52 NA 150 6.3 77 6 21
53 NA 59 1.7 76 6 22
54 NA 91 4.6 76 6 23
55 NA 250 6.3 76 6 24
56 NA 135 8.0 75 6 25
57 NA 127 8.0 78 6 26
58 NA 47 10.3 73 6 27
59 NA 98 11.5 80 6 28
60 NA 31 14.9 77 6 29
61 NA 138 8.0 83 6 30
62 135 269 4.1 84 7 1
63 49 248 9.2 85 7 2
64 32 236 9.2 81 7 3
65 NA 101 10.9 84 7 4
66 64 175 4.6 83 7 5
68 77 276 5.1 88 7 7
69 97 267 6.3 92 7 8
70 97 272 5.7 92 7 9
71 85 175 7.4 89 7 10
72 NA 139 8.6 82 7 11
73 10 264 14.3 73 7 12
74 27 175 14.9 81 7 13
75 NA 291 14.9 91 7 14
76 7 48 14.3 80 7 15
77 48 260 6.9 81 7 16
78 35 274 10.3 82 7 17
79 61 285 6.3 84 7 18
80 79 187 5.1 87 7 19
81 63 220 11.5 85 7 20
82 16 7 6.9 74 7 21
83 NA 258 9.7 81 7 22
84 NA 295 11.5 82 7 23
85 80 294 8.6 86 7 24
86 108 223 8.0 85 7 25
87 20 81 8.6 82 7 26
88 52 82 12.0 86 7 27
89 82 213 7.4 88 7 28
90 50 275 7.4 86 7 29
91 64 253 7.4 83 7 30
92 59 254 9.2 81 7 31
93 39 83 6.9 81 8 1
94 9 24 13.8 81 8 2
95 16 77 7.4 82 8 3
96 78 NA 6.9 86 8 4
97 35 NA 7.4 85 8 5
99 122 255 4.0 89 8 7
100 89 229 10.3 90 8 8
101 110 207 8.0 90 8 9
102 NA 222 8.6 92 8 10
103 NA 137 11.5 86 8 11
104 44 192 11.5 86 8 12
105 28 273 11.5 82 8 13
106 65 157 9.7 80 8 14
107 NA 64 11.5 79 8 15
108 22 71 10.3 77 8 16
109 59 51 6.3 79 8 17
110 23 115 7.4 76 8 18
111 31 244 10.9 78 8 19
112 44 190 10.3 78 8 20
113 21 259 15.5 77 8 21
114 9 36 14.3 72 8 22
115 NA 255 12.6 75 8 23
116 45 212 9.7 79 8 24
117 168 238 3.4 81 8 25
118 73 215 8.0 86 8 26
119 NA 153 5.7 88 8 27
120 76 203 9.7 97 8 28
121 118 225 2.3 94 8 29
122 84 237 6.3 96 8 30
123 85 188 6.3 94 8 31
124 96 167 6.9 91 9 1
125 78 197 5.1 92 9 2
126 73 183 2.8 93 9 3
127 91 189 4.6 93 9 4
128 47 95 7.4 87 9 5
130 20 252 10.9 80 9 7
131 23 220 10.3 78 9 8
132 21 230 10.9 75 9 9
133 24 259 9.7 73 9 10
134 44 236 14.9 81 9 11
135 21 259 15.5 76 9 12
136 28 238 6.3 77 9 13
137 9 24 10.9 71 9 14
138 13 112 11.5 71 9 15
139 46 237 6.9 78 9 16
140 18 224 13.8 67 9 17
141 13 27 10.3 76 9 18
142 24 238 10.3 68 9 19
143 16 201 8.0 82 9 20
144 13 238 12.6 64 9 21
145 23 14 9.2 71 9 22
146 36 139 10.3 81 9 23
147 7 49 10.3 69 9 24
148 14 20 16.6 63 9 25
149 30 193 6.9 70 9 26
150 NA 145 13.2 77 9 27
151 14 191 14.3 75 9 28
152 18 131 8.0 76 9 29
153 20 223 11.5 68 9 30
&
(and operator) aq[(aq$Temp >= 65 & aq$Temp <= 70), ]
Ozone Solar.R Wind Temp Month Day
1 41 190 7.4 67 5 1
6 28 NA 14.9 66 5 6
7 23 299 8.6 65 5 7
10 NA 194 8.6 69 5 10
12 16 256 9.7 69 5 12
13 11 290 9.2 66 5 13
14 14 274 10.9 68 5 14
17 34 307 12.0 66 5 17
19 30 322 11.5 68 5 19
28 23 13 12.0 67 5 28
34 NA 242 16.1 67 6 3
49 20 37 9.2 65 6 18
140 18 224 13.8 67 9 17
142 24 238 10.3 68 9 19
147 7 49 10.3 69 9 24
149 30 193 6.9 70 9 26
153 20 223 11.5 68 9 30
|
aq[(aq$Day == 3 | aq$Day == 5),]
Ozone Solar.R Wind Temp Month Day
3 12 149 12.6 74 5 3
5 NA NA 14.3 56 5 5
34 NA 242 16.1 67 6 3
36 NA 220 8.6 85 6 5
64 32 236 9.2 81 7 3
66 64 175 4.6 83 7 5
95 16 77 7.4 82 8 3
97 35 NA 7.4 85 8 5
126 73 183 2.8 93 9 3
128 47 95 7.4 87 9 5
subset()
function subset(aq, Temp > 80)
Ozone Solar.R Wind Temp Month Day
29 45 252 14.9 81 5 29
35 NA 186 9.2 84 6 4
36 NA 220 8.6 85 6 5
38 29 127 9.7 82 6 7
39 NA 273 6.9 87 6 8
40 71 291 13.8 90 6 9
41 39 323 11.5 87 6 10
42 NA 259 10.9 93 6 11
43 NA 250 9.2 92 6 12
44 23 148 8.0 82 6 13
61 NA 138 8.0 83 6 30
62 135 269 4.1 84 7 1
63 49 248 9.2 85 7 2
64 32 236 9.2 81 7 3
65 NA 101 10.9 84 7 4
66 64 175 4.6 83 7 5
67 40 314 10.9 83 7 6
68 77 276 5.1 88 7 7
69 97 267 6.3 92 7 8
70 97 272 5.7 92 7 9
71 85 175 7.4 89 7 10
72 NA 139 8.6 82 7 11
74 27 175 14.9 81 7 13
75 NA 291 14.9 91 7 14
77 48 260 6.9 81 7 16
78 35 274 10.3 82 7 17
79 61 285 6.3 84 7 18
80 79 187 5.1 87 7 19
81 63 220 11.5 85 7 20
83 NA 258 9.7 81 7 22
84 NA 295 11.5 82 7 23
85 80 294 8.6 86 7 24
86 108 223 8.0 85 7 25
87 20 81 8.6 82 7 26
88 52 82 12.0 86 7 27
89 82 213 7.4 88 7 28
90 50 275 7.4 86 7 29
91 64 253 7.4 83 7 30
92 59 254 9.2 81 7 31
93 39 83 6.9 81 8 1
94 9 24 13.8 81 8 2
95 16 77 7.4 82 8 3
96 78 NA 6.9 86 8 4
97 35 NA 7.4 85 8 5
98 66 NA 4.6 87 8 6
99 122 255 4.0 89 8 7
100 89 229 10.3 90 8 8
101 110 207 8.0 90 8 9
102 NA 222 8.6 92 8 10
103 NA 137 11.5 86 8 11
104 44 192 11.5 86 8 12
105 28 273 11.5 82 8 13
117 168 238 3.4 81 8 25
118 73 215 8.0 86 8 26
119 NA 153 5.7 88 8 27
120 76 203 9.7 97 8 28
121 118 225 2.3 94 8 29
122 84 237 6.3 96 8 30
123 85 188 6.3 94 8 31
124 96 167 6.9 91 9 1
125 78 197 5.1 92 9 2
126 73 183 2.8 93 9 3
127 91 189 4.6 93 9 4
128 47 95 7.4 87 9 5
129 32 92 15.5 84 9 6
134 44 236 14.9 81 9 11
143 16 201 8.0 82 9 20
146 36 139 10.3 81 9 23
By using the select =
parameter you can specify which columns to keep
subset(aq, Temp > 80, select = c(Ozone, Temp))
Ozone Temp
29 45 81
35 NA 84
36 NA 85
38 29 82
39 NA 87
40 71 90
41 39 87
42 NA 93
43 NA 92
44 23 82
61 NA 83
62 135 84
63 49 85
64 32 81
65 NA 84
66 64 83
67 40 83
68 77 88
69 97 92
70 97 92
71 85 89
72 NA 82
74 27 81
75 NA 91
77 48 81
78 35 82
79 61 84
80 79 87
81 63 85
83 NA 81
84 NA 82
85 80 86
86 108 85
87 20 82
88 52 86
89 82 88
90 50 86
91 64 83
92 59 81
93 39 81
94 9 81
95 16 82
96 78 86
97 35 85
98 66 87
99 122 89
100 89 90
101 110 90
102 NA 92
103 NA 86
104 44 86
105 28 82
117 168 81
118 73 86
119 NA 88
120 76 97
121 118 94
122 84 96
123 85 94
124 96 91
125 78 92
126 73 93
127 91 93
128 47 87
129 32 84
134 44 81
143 16 82
146 36 81
rbind()
is a way to add rows to a data frameLets take two subsets of the air quality data frame and rbind
them
aq1 <- aq[1:2, ]
aq2 <- aq[35:37, ]
aq.stacked <- rbind(aq1, aq2)
aq.stacked
Ozone Solar.R Wind Temp Month Day
1 41 190 7.4 67 5 1
2 36 118 8.0 72 5 2
35 NA 186 9.2 84 6 4
36 NA 220 8.6 85 6 5
37 NA 264 14.3 79 6 6
If the number of columns aren't the same, it won't work
aq1.chopped <- aq1[, 1:3]
aq1.chopped
Ozone Solar.R Wind
1 41 190 7.4
2 36 118 8.0
rbind(aq1.chopped, aq2) # gives an error
If the column names aren't the same, it won't work
names(aq1)[1] <- "nuhuh"
aq1
nuhuh Solar.R Wind Temp Month Day
1 41 190 7.4 67 5 1
2 36 118 8.0 72 5 2
rbind(aq1, aq2) # gives an error
cbind()
is a way to add columns to a data frameNow we'll split the data frame by columns and cbind
them
aq.L <- aq[, 1:2]
aq.R <- aq[, 4:5]
aq.bound <- cbind(aq.L, aq.R)
aq.bound
Ozone Solar.R Temp Month
1 41 190 67 5
2 36 118 72 5
3 12 149 74 5
4 18 313 62 5
5 NA NA 56 5
6 28 NA 66 5
7 23 299 65 5
8 19 99 59 5
9 8 19 61 5
10 NA 194 69 5
11 7 NA 74 5
12 16 256 69 5
13 11 290 66 5
14 14 274 68 5
15 18 65 58 5
16 14 334 64 5
17 34 307 66 5
18 6 78 57 5
19 30 322 68 5
20 11 44 62 5
21 1 8 59 5
22 11 320 73 5
23 4 25 61 5
24 32 92 61 5
25 NA 66 57 5
26 NA 266 58 5
27 NA NA 57 5
28 23 13 67 5
29 45 252 81 5
30 115 223 79 5
31 37 279 76 5
32 NA 286 78 6
33 NA 287 74 6
34 NA 242 67 6
35 NA 186 84 6
36 NA 220 85 6
37 NA 264 79 6
38 29 127 82 6
39 NA 273 87 6
40 71 291 90 6
41 39 323 87 6
42 NA 259 93 6
43 NA 250 92 6
44 23 148 82 6
45 NA 332 80 6
46 NA 322 79 6
47 21 191 77 6
48 37 284 72 6
49 20 37 65 6
50 12 120 73 6
51 13 137 76 6
52 NA 150 77 6
53 NA 59 76 6
54 NA 91 76 6
55 NA 250 76 6
56 NA 135 75 6
57 NA 127 78 6
58 NA 47 73 6
59 NA 98 80 6
60 NA 31 77 6
61 NA 138 83 6
62 135 269 84 7
63 49 248 85 7
64 32 236 81 7
65 NA 101 84 7
66 64 175 83 7
67 40 314 83 7
68 77 276 88 7
69 97 267 92 7
70 97 272 92 7
71 85 175 89 7
72 NA 139 82 7
73 10 264 73 7
74 27 175 81 7
75 NA 291 91 7
76 7 48 80 7
77 48 260 81 7
78 35 274 82 7
79 61 285 84 7
80 79 187 87 7
81 63 220 85 7
82 16 7 74 7
83 NA 258 81 7
84 NA 295 82 7
85 80 294 86 7
86 108 223 85 7
87 20 81 82 7
88 52 82 86 7
89 82 213 88 7
90 50 275 86 7
91 64 253 83 7
92 59 254 81 7
93 39 83 81 8
94 9 24 81 8
95 16 77 82 8
96 78 NA 86 8
97 35 NA 85 8
98 66 NA 87 8
99 122 255 89 8
100 89 229 90 8
101 110 207 90 8
102 NA 222 92 8
103 NA 137 86 8
104 44 192 86 8
105 28 273 82 8
106 65 157 80 8
107 NA 64 79 8
108 22 71 77 8
109 59 51 79 8
110 23 115 76 8
111 31 244 78 8
112 44 190 78 8
113 21 259 77 8
114 9 36 72 8
115 NA 255 75 8
116 45 212 79 8
117 168 238 81 8
118 73 215 86 8
119 NA 153 88 8
120 76 203 97 8
121 118 225 94 8
122 84 237 96 8
123 85 188 94 8
124 96 167 91 9
125 78 197 92 9
126 73 183 93 9
127 91 189 93 9
128 47 95 87 9
129 32 92 84 9
130 20 252 80 9
131 23 220 78 9
132 21 230 75 9
133 24 259 73 9
134 44 236 81 9
135 21 259 76 9
136 28 238 77 9
137 9 24 71 9
138 13 112 71 9
139 46 237 78 9
140 18 224 67 9
141 13 27 76 9
142 24 238 68 9
143 16 201 82 9
144 13 238 64 9
145 23 14 71 9
146 36 139 81 9
147 7 49 69 9
148 14 20 63 9
149 30 193 70 9
150 NA 145 77 9
151 14 191 75 9
152 18 131 76 9
153 20 223 68 9
If both data frames don't have the same number of rows, it won't work
cbind(aq.L[1:10, ], aq.R) # gives an error
merge()
allows you to combine two data frames that may not have the same dimensions or column names?merge()
for detailsHere's a data frame with three air monitor locations
monitors <- read.table(header=T, text='
monitorid lat long
1 42.467573 -87.810047
2 42.049148 -88.273029
3 39.110539 -90.324080
')
And here's a data frame with pollutant and monitor information
pollutants <- read.table(header=T, text='
pollutant duration monitorid
ozone 1h 1
so2 1h 1
ozone 8h 2
no2 1h 4
')
merge(monitors, pollutants)
monitorid lat long pollutant duration
1 1 42.47 -87.81 ozone 1h
2 1 42.47 -87.81 so2 1h
3 2 42.05 -88.27 ozone 8h
monitorid
values are keptIf you wanted to keep all of the monitor locations (“left join”), you would use the all.x = T
parameter
merge(monitors, pollutants, all.x = T)
monitorid lat long pollutant duration
1 1 42.47 -87.81 ozone 1h
2 1 42.47 -87.81 so2 1h
3 2 42.05 -88.27 ozone 8h
4 3 39.11 -90.32 <NA> <NA>
If you want to keep all of the rows in the pollutant data frame (“right join”), you can use all.y =
merge(monitors, pollutants, all.y = T)
monitorid lat long pollutant duration
1 1 42.47 -87.81 ozone 1h
2 1 42.47 -87.81 so2 1h
3 2 42.05 -88.27 ozone 8h
4 4 NA NA no2 1h
If you wanted to keep all of the information, you can do an “outer join” using all = T
merge(monitors, pollutants, all = T)
monitorid lat long pollutant duration
1 1 42.47 -87.81 ozone 1h
2 1 42.47 -87.81 so2 1h
3 2 42.05 -88.27 ozone 8h
4 3 39.11 -90.32 <NA> <NA>
5 4 NA NA no2 1h
reshape2
packagewide <- read.table(header=T, text='
day temp ozone so2 no2
1 75 50 10 2
2 81 61 25 12
3 68 42 22 3
')
melt()
function from the reshape2
packagelibrary(reshape2)
melt(wide, id.vars = c("day"))
day variable value
1 1 temp 75
2 2 temp 81
3 3 temp 68
4 1 ozone 50
5 2 ozone 61
6 3 ozone 42
7 1 so2 10
8 2 so2 25
9 3 so2 22
10 1 no2 2
11 2 no2 12
12 3 no2 3
If we had a table with more than one monitor, with a separate monitor
column, we would include it as an id variable
wide <- read.table(header=T, text='
day monitor temp ozone so2 no2
1 1 75 50 10 2
2 1 81 61 25 12
3 1 68 42 22 3
1 2 72 51 10 3
2 2 85 63 25 13
3 2 66 45 22 2
')
melt(wide, id.vars = c("day", "monitor"))
day monitor variable value
1 1 1 temp 75
2 2 1 temp 81
3 3 1 temp 68
4 1 2 temp 72
5 2 2 temp 85
6 3 2 temp 66
7 1 1 ozone 50
8 2 1 ozone 61
9 3 1 ozone 42
10 1 2 ozone 51
11 2 2 ozone 63
12 3 2 ozone 45
13 1 1 so2 10
14 2 1 so2 25
15 3 1 so2 22
16 1 2 so2 10
17 2 2 so2 25
18 3 2 so2 22
19 1 1 no2 2
20 2 1 no2 12
21 3 1 no2 3
22 1 2 no2 3
23 2 2 no2 13
24 3 2 no2 2
dcast()
functionlong <- melt(wide, id.vars = c("day", "monitor"))
day
and monitor
the samevariable
is the column that will contain the names of the new columnsvalue
contains the measurementsdcast(long, day + monitor ~ variable, value.var="value")
day monitor temp ozone so2 no2
1 1 1 75 50 10 2
2 1 2 72 51 10 3
3 2 1 81 61 25 12
4 2 2 85 63 25 13
5 3 1 68 42 22 3
6 3 2 66 45 22 2