Question 4

Which of the following code snippets would correctly import the bike sharing dataset into R?

This is a multiple-answer question; select all that apply.

bike3 <- read.csv("bike_sharing_data.csv")
bike2 <- read.table("bike_sharing_data.txt", sep="\t", header=TRUE)
bike4 <- read.delim("bike_sharing_data.txt")
bike1 <- read.table("bike_sharing_data.csv", sep=",", header=TRUE)

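All four commands would import the bike sharing dataset into R. read.csv defaults to a comma separator with a header row, and read.delim defaults to a tab separator with a header row, so each call matches the delimiter of its file.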
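To make the equivalence concrete, here is a minimal sketch that writes a tiny stand-in dataset to temporary files and reads it back with all four calls. The data frame `mini` and the temporary paths are hypothetical; they are not the real bike sharing files.

```r
# Hypothetical miniature dataset (not the real bike sharing data)
mini <- data.frame(season = c(1L, 4L), temp = c(9.84, 25.42))

# Write one comma-separated and one tab-separated copy to temporary files
csv_path <- tempfile(fileext = ".csv")
txt_path <- tempfile(fileext = ".txt")
write.csv(mini, csv_path, row.names = FALSE)
write.table(mini, txt_path, sep = "\t", row.names = FALSE, quote = FALSE)

# The four approaches from the question, applied to the stand-in files
from_csv       <- read.csv(csv_path)
from_table_csv <- read.table(csv_path, sep = ",", header = TRUE)
from_delim     <- read.delim(txt_path)
from_table_txt <- read.table(txt_path, sep = "\t", header = TRUE)

# Each comparison should report TRUE for this simple data
all.equal(from_csv, from_table_csv)
all.equal(from_delim, from_table_txt)
all.equal(from_csv, from_delim)
```

The wrapper functions differ from read.table in a few other defaults (quoting and comment characters, for example), so results can diverge on files that contain those characters.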

Question 5

What is the total number of observations and variables for the bike sharing dataset?

The environment pane shows that the bike sharing dataset has 17379 observations and 13 variables.
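The same counts can also be checked in code rather than read off the environment pane. A minimal sketch, using a hypothetical stand-in data frame `demo` in place of the imported dataset:

```r
# Hypothetical stand-in for the imported dataset
demo <- data.frame(season = c(1L, 4L, 2L), temp = c(9.84, 25.42, 13.1))

dim(demo)   # number of rows (observations), then number of columns (variables)
nrow(demo)  # observations only
ncol(demo)  # variables only
```

Applied to the imported data frame, dim() should agree with the first line of the str() output.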

Question 6

If you import the bike sharing dataset in R using the coding approaches selected in Q4, what data type does R assign to humidity?

str(bike1)
## 'data.frame':    17379 obs. of  13 variables:
##  $ datetime  : chr  "1/1/2011 0:00" "1/1/2011 1:00" "1/1/2011 2:00" "1/1/2011 3:00" ...
##  $ season    : int  1 1 1 1 1 1 1 1 1 1 ...
##  $ holiday   : int  0 0 0 0 0 0 0 0 0 0 ...
##  $ workingday: int  0 0 0 0 0 0 0 0 0 0 ...
##  $ weather   : int  1 1 1 1 1 2 1 1 1 1 ...
##  $ temp      : num  9.84 9.02 9.02 9.84 9.84 ...
##  $ atemp     : num  14.4 13.6 13.6 14.4 14.4 ...
##  $ humidity  : chr  "81" "80" "80" "75" ...
##  $ windspeed : num  0 0 0 0 0 ...
##  $ casual    : int  3 8 5 3 0 0 2 1 1 8 ...
##  $ registered: int  13 32 27 10 1 1 0 2 7 6 ...
##  $ count     : int  16 40 32 13 1 1 2 3 8 14 ...
##  $ sources   : chr  "ad campaign" "www.yahoo.com" "www.google.fi" "AD campaign" ...

By checking the structure of the bike sharing dataset with str(), we can determine that R perceives humidity as a character (chr) column.
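read.table and its wrappers keep a column as character when at least one entry cannot be parsed as a number, so a natural follow-up is to look for such entries. The sketch below shows one way to do that on a hypothetical vector; the malformed value "x61" is invented for illustration and is not taken from the dataset.

```r
# Hypothetical humidity values; "x61" is an invented malformed entry
humidity <- c("81", "80", "x61", "75")

# Attempt the conversion; entries that fail to parse become NA
as_num <- suppressWarnings(as.numeric(humidity))

# Entries that were present but could not be parsed as numbers
bad <- humidity[is.na(as_num) & !is.na(humidity)]
bad
```

Running the same steps on the humidity column of the imported data frame would list whichever entries, if any, prevent it from being read as numeric.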

Question 7

What is the value of season in row 6251?

bike1[6251,]
##            datetime season holiday workingday weather  temp  atemp humidity
## 6251 9/23/2011 0:00      4       0          1       2 25.42 27.275       94
##      windspeed casual registered count     sources
## 6251    6.0032      5         23    28 Ad Campaign

By indexing to retrieve row 6251, we can determine that the value of season in this row is 4.
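When only one value is needed, the row and column can be indexed together instead of printing the whole row. A minimal sketch on a hypothetical stand-in data frame `demo`:

```r
# Hypothetical stand-in for the imported dataset
demo <- data.frame(season = c(1L, 2L, 4L), temp = c(9.84, 9.02, 25.42))

# Two equivalent ways to pull a single cell: row 3 of the season column
by_name   <- demo[3, "season"]
by_dollar <- demo$season[3]
```

For the question above, the same pattern would use row 6251 of the imported data frame.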

Question 8

How many observations have the season as winter?

table(bike1$season)
## 
##    1    2    3    4 
## 4242 4409 4496 4232

According to the data fields, the season “winter” is represented by the integer “4”.

By creating a contingency table of season, we can determine that there are 4232 observations that have the season winter.
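The count for a single season can also be computed directly, without reading it off the table. A minimal sketch on a hypothetical vector of season codes, where 4 stands for winter as stated in the data fields:

```r
# Hypothetical season codes (4 = winter, per the data fields)
season <- c(1L, 4L, 2L, 4L, 3L)

# Count the entries equal to 4 by summing a logical vector
n_winter <- sum(season == 4)

# The same count, extracted from the contingency table by name
n_winter_tab <- table(season)[["4"]]
```

Both approaches should return the same number as the "4" column of the contingency table.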

Question 10

How many observations have a “High” wind threat condition or above during winter and spring?

A High wind threat condition has sustained wind speeds of at least 40 mph.

sum(bike1$windspeed >= 40)
## [1] 63

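Across all seasons, 63 observations have a wind speed of 40 or greater. Note that this count is not restricted to winter and spring, so it is an upper bound on the answer to the question as asked.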
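Restricting the count to winter and spring requires combining the wind speed condition with a season condition. The sketch below shows the idea on a hypothetical stand-in data frame `demo`. It uses 4 for winter, as stated in the data fields, and assumes spring is coded as 1; that spring code is an assumption and should be checked against the data fields.

```r
# Hypothetical stand-in for the imported dataset
demo <- data.frame(
  season    = c(1L, 2L, 4L, 4L, 3L),
  windspeed = c(43.0, 41.0, 12.0, 40.0, 45.0)
)

# Condition 1: wind speed at the "High" threshold or above
high_wind <- demo$windspeed >= 40

# Condition 2: winter (4) or spring (assumed to be coded as 1)
winter_or_spring <- demo$season %in% c(1, 4)

# Count the rows that satisfy both conditions
n_high_wind <- sum(high_wind & winter_or_spring)
```

Applying the same two conditions to the imported data frame would give the count for winter and spring only.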