Questions 1-3: Conceptual
Question 4:
# Option 1:
bike <- read.table("bike_sharing_data.csv", sep=",", header=TRUE)
# Option 2:
bike1 <- read.csv("bike_sharing_data.csv")
# Option 3:
bike2 <- read.table("bike_sharing_data.txt", sep="\t", header=TRUE)
# Option 4:
bike3 <- read.delim("bike_sharing_data.txt")
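Options 1 and 2 give the same result because read.csv() is a wrapper around read.table() with sep="," and header=TRUE as defaults. read.delim() does the same for tab-separated files. A minimal sketch that checks this on a small made-up temporary file (not the bike data):

```r
# Toy data written to a temporary file; the values are made up for illustration.
tmp <- tempfile(fileext = ".csv")
writeLines(c("season,temp", "1,9.84", "4,9.02"), tmp)

a <- read.table(tmp, sep = ",", header = TRUE)
b <- read.csv(tmp)

# For a simple file like this, both calls should produce equal data frames.
same <- isTRUE(all.equal(a, b))
print(same)

unlink(tmp)  # remove the temporary file
```

The two functions also differ in a few other defaults (quote, fill, comment.char), so results can differ on files that contain quotes, ragged rows, or "#" characters.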
Questions 5 and 6:
str(bike)
## 'data.frame': 17379 obs. of 13 variables:
## $ datetime : chr "1/1/2011 0:00" "1/1/2011 1:00" "1/1/2011 2:00" "1/1/2011 3:00" ...
## $ season : int 1 1 1 1 1 1 1 1 1 1 ...
## $ holiday : int 0 0 0 0 0 0 0 0 0 0 ...
## $ workingday: int 0 0 0 0 0 0 0 0 0 0 ...
## $ weather : int 1 1 1 1 1 2 1 1 1 1 ...
## $ temp : num 9.84 9.02 9.02 9.84 9.84 ...
## $ atemp : num 14.4 13.6 13.6 14.4 14.4 ...
## $ humidity : chr "81" "80" "80" "75" ...
## $ windspeed : num 0 0 0 0 0 ...
## $ casual : int 3 8 5 3 0 0 2 1 1 8 ...
## $ registered: int 13 32 27 10 1 1 0 2 7 6 ...
## $ count : int 16 40 32 13 1 1 2 3 8 14 ...
## $ sources : chr "ad campaign" "www.yahoo.com" "www.google.fi" "AD campaign" ...
5: The dataset contains 17,379 observations and 13 variables.
6: R reads the humidity variable as a character (chr) type rather than
a number, which suggests that at least one entry in the column is not
a valid numeric value.
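One way to investigate a character column that should be numeric is to attempt the conversion and see which entries fail. The sketch below uses a hypothetical vector, and the "x61" entry is invented for illustration. It is not taken from the bike data.

```r
# Hypothetical humidity values; the "x61" entry is invented for illustration.
humidity <- c("81", "80", "x61", "75")

# as.numeric() returns NA (with a warning) for strings that are not numbers.
converted <- suppressWarnings(as.numeric(humidity))

# Entries that became NA only through conversion are the non-numeric ones.
bad_rows <- which(is.na(converted) & !is.na(humidity))

print(bad_rows)            # positions of the non-numeric entries
print(humidity[bad_rows])  # the offending strings
```

On the real data, the same idea would apply to bike$humidity in place of the toy vector.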
Question 7:
bike[6251, "season"]
## [1] 4
The season value in row 6251 is 4.
Question 8:
table(bike$season)
##
## 1 2 3 4
## 4242 4409 4496 4232
Winter (season = 4) has 4,232 observations.
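Reading the table is easier if the season codes carry labels. The sketch below uses toy codes and assumes the mapping 1 = spring, 2 = summer, 3 = fall, 4 = winter. Only the spring and winter codes are stated in this answer key, so the labels for 2 and 3 are assumptions.

```r
# Toy season codes, made up for illustration.
season <- c(1, 4, 2, 4, 3, 1)

# Labels for 2 and 3 are assumed; 1 = spring and 4 = winter follow the answers.
season_f <- factor(season, levels = 1:4,
                   labels = c("spring", "summer", "fall", "winter"))

counts <- table(season_f)
print(counts)
```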
Question 9: Conceptual
Question 10:
nrow(subset(bike, season %in% c(1, 4) & windspeed >= 40))
## [1] 46
There are 46 observations with wind speeds at the high wind threat
level or above (>= 40 mph) during winter (season = 4) or spring
(season = 1).
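The same count can be obtained by summing a logical vector instead of subsetting, since TRUE counts as 1. A minimal sketch on a made-up data frame (the column names match the bike data, the values do not):

```r
# Toy data frame; values are made up for illustration.
toy <- data.frame(season    = c(1, 2, 4, 4, 3),
                  windspeed = c(45, 50, 40, 12, 41))

# TRUE where the row is in spring (1) or winter (4) AND windspeed is at least 40.
keep <- toy$season %in% c(1, 4) & toy$windspeed >= 40

n_subset <- nrow(subset(toy, season %in% c(1, 4) & windspeed >= 40))
n_sum    <- sum(keep, na.rm = TRUE)

print(c(n_subset, n_sum))
```

subset() drops rows where the condition is NA, so na.rm = TRUE is needed in sum() for the two counts to agree when windspeed has missing values.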