Extract Data
bike.csv <- read.csv("bike_sharing_data(1).csv")
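Before reading a file it can help to confirm that the path resolves, and afterwards to check the dimensions of what was loaded. The sketch below is only an illustration: it writes a tiny stand-in file to a temporary location so that it runs anywhere, and the names demo_path and bike_demo as well as the three sample rows are hypothetical, not part of the real data set.

```r
# Write a tiny stand-in file so the sketch runs anywhere (hypothetical values)
demo_path <- tempfile(fileext = ".csv")
writeLines(c("datetime,season,count",
             "1/1/2011 0:00,1,16",
             "1/1/2011 1:00,1,40"), demo_path)

# Fail early with a clear error if the path is wrong
stopifnot(file.exists(demo_path))

# stringsAsFactors = FALSE keeps text columns as character on every R version
bike_demo <- read.csv(demo_path, stringsAsFactors = FALSE)

nrow(bike_demo)   # number of observations
ncol(bike_demo)   # number of variables
```

The same two checks can be applied to the real file by swapping demo_path for its path.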
Preview Data
head(bike.csv)
nrow(bike.csv)
[1] 17379
Describe the Data
str(bike.csv)
'data.frame': 17379 obs. of 13 variables:
$ datetime : chr "1/1/2011 0:00" "1/1/2011 1:00" "1/1/2011 2:00" "1/1/2011 3:00" ...
$ season : int 1 1 1 1 1 1 1 1 1 1 ...
$ holiday : int 0 0 0 0 0 0 0 0 0 0 ...
$ workingday: int 0 0 0 0 0 0 0 0 0 0 ...
$ weather : int 1 1 1 1 1 2 1 1 1 1 ...
$ temp : num 9.84 9.02 9.02 9.84 9.84 ...
$ atemp : num 14.4 13.6 13.6 14.4 14.4 ...
$ humidity : chr "81" "80" "80" "75" ...
$ windspeed : num 0 0 0 0 0 ...
$ casual : int 3 8 5 3 0 0 2 1 1 8 ...
$ registered: int 13 32 27 10 1 1 0 2 7 6 ...
$ count : int 16 40 32 13 1 1 2 3 8 14 ...
$ sources : chr "ad campaign" "www.yahoo.com" "www.google.fi" "AD campaign" ...
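The structure above shows datetime stored as character and sources containing the same label in different letter case ("ad campaign" and "AD campaign"). A minimal sketch of how both could be tidied follows. It works on toy vectors copied from the preview, and it assumes the timestamps are in month/day/year order, which the preview alone cannot confirm. The names datetime_raw, sources_raw, datetime_parsed, hour_of_day and sources_clean are illustrative.

```r
# Toy values copied from the str() preview
datetime_raw <- c("1/1/2011 0:00", "1/1/2011 1:00", "1/1/2011 2:00")
sources_raw  <- c("ad campaign", "www.yahoo.com", "AD campaign")

# ASSUMPTION: timestamps are month/day/year; use "%d/%m/%Y" if they are day/month/year
datetime_parsed <- as.POSIXct(datetime_raw, format = "%m/%d/%Y %H:%M", tz = "UTC")

# Once parsed, components such as the hour are easy to extract
hour_of_day <- as.integer(format(datetime_parsed, "%H"))

# Lower-case and trim so "ad campaign" and "AD campaign" count as one source
sources_clean <- tolower(trimws(sources_raw))
table(sources_clean)
```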
summary(bike.csv)
   datetime             season         holiday          workingday        weather
 Length:17379       Min.   :1.000   Min.   :0.00000   Min.   :0.0000   Min.   :1.000
 Class :character   1st Qu.:2.000   1st Qu.:0.00000   1st Qu.:0.0000   1st Qu.:1.000
 Mode  :character   Median :3.000   Median :0.00000   Median :1.0000   Median :1.000
                    Mean   :2.502   Mean   :0.02877   Mean   :0.6827   Mean   :1.425
                    3rd Qu.:3.000   3rd Qu.:0.00000   3rd Qu.:1.0000   3rd Qu.:2.000
                    Max.   :4.000   Max.   :1.00000   Max.   :1.0000   Max.   :4.000
      temp           atemp         humidity           windspeed          casual
 Min.   : 0.82   Min.   : 0.00   Length:17379       Min.   : 0.000   Min.   :  0.00
 1st Qu.:13.94   1st Qu.:16.66   Class :character   1st Qu.: 7.002   1st Qu.:  4.00
 Median :20.50   Median :24.24   Mode  :character   Median :12.998   Median : 16.00
 Mean   :20.38   Mean   :23.79                      Mean   :12.737   Mean   : 34.48
 3rd Qu.:27.06   3rd Qu.:31.06                      3rd Qu.:16.998   3rd Qu.: 46.00
 Max.   :41.00   Max.   :50.00                      Max.   :56.997   Max.   :367.00
   registered        count        sources
 Min.   :  0.0   Min.   :  1   Length:17379
 1st Qu.: 36.0   1st Qu.: 42   Class :character
 Median :116.0   Median :141   Mode  :character
 Mean   :152.5   Mean   :187
 3rd Qu.:217.0   3rd Qu.:277
 Max.   :886.0   Max.   :977
Contingency Table for Humidity
sort(table(bike.csv$humidity), decreasing=TRUE)
 88  83  94  87  70  66  65  69  55  74  77  61  93  49  78  62  73  46  52  56  82
657 630 560 488 430 388 387 359 352 341 336 335 331 327 327 325 317 316 312 310 299
 41  54  81  59 100  43  53  60  50  51  58  45  47  44  48  89  79  42  57  37  40
290 287 275 272 270 270 267 267 266 262 258 248 247 244 240 239 238 235 231 224 224
 75  64  76  39  71  72  36  38  68  35  63  33  67  34  84  31  30  80  29  32  28
222 219 219 209 193 191 187 186 172 163 163 162 161 133 124 118 113 107 106  99  97
 26  86  27  25  24  23  22  21   0  20  19  16  17  18  90  85  15  96  14  92  10
 78  76  71  59  56  46  27  26  22  17  16  10  10  10   7   5   4   3   2   2   1
 12  13   8  91  97 x61
  1   1   1   1   1   1
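The table contains one non-numeric entry, "x61", which is why humidity was imported as character rather than as a number. A minimal sketch of how such entries could be located and repaired is shown below on a toy vector. Treating "x61" as a mistyped 61 is an assumption. If that cannot be verified, setting the value to NA is the safer choice. The names humidity_raw, humidity_num, bad and humidity_fixed are illustrative.

```r
# Toy vector mimicking the humidity column (illustrative values)
humidity_raw <- c("81", "80", "x61", "75")

# as.numeric() yields NA for anything it cannot parse
humidity_num <- suppressWarnings(as.numeric(humidity_raw))

# Entries that were present but failed to parse
bad <- is.na(humidity_num) & !is.na(humidity_raw)
humidity_raw[bad]

# ASSUMPTION: the stray leading letter is a typo, so "x61" means 61
humidity_fixed <- as.numeric(sub("^[^0-9]+", "", humidity_raw))
humidity_fixed
```

Applied to the real column, the same steps would use bike.csv$humidity in place of humidity_raw.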
Find the season value in row 6251
season_value <- bike.csv$season[6251]
cat("The value of season in row 6251 is: ", season_value, "\n")
The value of season in row 6251 is: 4
Find how many observations are in Winter
table(bike.csv$season)
1 2 3 4
4242 4409 4496 4232
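The table reports counts per numeric code, so reading off the winter figure depends on knowing which code stands for winter. The sketch below attaches labels to the codes so the count can be looked up by name. The mapping 1 = spring, 2 = summer, 3 = fall, 4 = winter is an assumption that should be checked against the data dictionary. Under that assumption, the 4232 observations with code 4 would be the winter count. The toy vector season_demo and the other names are illustrative.

```r
# Toy season codes (illustrative values)
season_demo <- c(1, 1, 2, 3, 4, 4, 4)

# ASSUMED coding: 1 = spring, 2 = summer, 3 = fall, 4 = winter
season_labels <- c("spring", "summer", "fall", "winter")
season_f <- factor(season_demo, levels = 1:4, labels = season_labels)

# Counts by label instead of by numeric code
season_counts <- table(season_f)
season_counts["winter"]

# Equivalent direct count without building the table
winter_n <- sum(season_demo == 4)
```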
Observations with a “High” wind threat condition or above during winter or spring
bike.csv[(bike.csv$season %in% c(1, 4)) & (bike.csv$windspeed >= 40), ]
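To make the row filter easier to check, the two conditions can be built as separate logical vectors and then combined. The sketch below does this on a small toy data frame. The threshold of 40 and the season codes 1 and 4 are taken from the call above, while the data values and the names demo, is_target_season, is_high_wind and high_wind are illustrative.

```r
# Toy data frame (illustrative values)
demo <- data.frame(
  season    = c(1, 2, 4, 4, 3),
  windspeed = c(43.0, 50.0, 12.0, 40.0, 45.0)
)

# Condition 1: season code is 1 or 4
is_target_season <- demo$season %in% c(1, 4)

# Condition 2: wind speed at or above the threshold; >= keeps exactly 40, > would drop it
is_high_wind <- demo$windspeed >= 40

# Keep rows where both conditions hold
high_wind <- demo[is_target_season & is_high_wind, ]

nrow(high_wind)   # number of matching observations
```

Wrapping the combined condition in which() is one way to avoid all-NA rows in the result if either column contains missing values.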