Justin Kaplan

Workshop 1

Load the dataset

library(readr)
Bike_Data <- read_csv("~/Downloads/bike_sharing_data (1).csv")
## Warning: One or more parsing issues, call `problems()` on your data frame for details,
## e.g.:
##   dat <- vroom(...)
##   problems(dat)
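
The warning above suggests calling `problems()` on the data frame. A minimal sketch of what that call returns, assuming readr 2.x and using a made-up two-column CSV (`csv_text` and `toy` are hypothetical names, not part of the workshop data):

```r
library(readr)

# Hypothetical CSV text with one unparseable value ("oops") in a numeric column.
csv_text <- "x,y\n1,2\n3,oops\n"

# Force both columns to double so the bad value triggers a parsing issue.
toy <- read_csv(I(csv_text), col_types = cols(x = col_double(), y = col_double()))

# problems() returns a tibble with one row per parsing issue,
# including the row, column, expected type, and actual value.
parse_issues <- problems(toy)
print(parse_issues)
```

Running `problems(Bike_Data)` in the same way should list the rows of the bike-sharing file that readr could not parse as expected.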

Question 5

str(Bike_Data)
## spc_tbl_ [17,379 × 13] (S3: spec_tbl_df/tbl_df/tbl/data.frame)
##  $ datetime  : chr [1:17379] "1/1/2011 0:00" "1/1/2011 1:00" "1/1/2011 2:00" "1/1/2011 3:00" ...
##  $ season    : num [1:17379] 1 1 1 1 1 1 1 1 1 1 ...
##  $ holiday   : num [1:17379] 0 0 0 0 0 0 0 0 0 0 ...
##  $ workingday: num [1:17379] 0 0 0 0 0 0 0 0 0 0 ...
##  $ weather   : num [1:17379] 1 1 1 1 1 2 1 1 1 1 ...
##  $ temp      : num [1:17379] 9.84 9.02 9.02 9.84 9.84 ...
##  $ atemp     : num [1:17379] 14.4 13.6 13.6 14.4 14.4 ...
##  $ humidity  : num [1:17379] 81 80 80 75 75 75 80 86 75 76 ...
##  $ windspeed : num [1:17379] 0 0 0 0 0 ...
##  $ casual    : num [1:17379] 3 8 5 3 0 0 2 1 1 8 ...
##  $ registered: num [1:17379] 13 32 27 10 1 1 0 2 7 6 ...
##  $ count     : num [1:17379] 16 40 32 13 1 1 2 3 8 14 ...
##  $ sources   : chr [1:17379] "ad campaign" "www.yahoo.com" "www.google.fi" "AD campaign" ...
##  - attr(*, "spec")=
##   .. cols(
##   ..   datetime = col_character(),
##   ..   season = col_double(),
##   ..   holiday = col_double(),
##   ..   workingday = col_double(),
##   ..   weather = col_double(),
##   ..   temp = col_double(),
##   ..   atemp = col_double(),
##   ..   humidity = col_double(),
##   ..   windspeed = col_double(),
##   ..   casual = col_double(),
##   ..   registered = col_double(),
##   ..   count = col_double(),
##   ..   sources = col_character()
##   .. )
##  - attr(*, "problems")=<externalptr>
The first line of the output, [17,379 × 13], shows that the table has 17,379 rows and 13 columns.
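
The same dimensions can be read off without scanning the full `str()` output. A small sketch using base R, with a made-up data frame `toy_bikes` standing in for `Bike_Data` (the values are illustrative only):

```r
# Hypothetical stand-in for Bike_Data; the values are made up for illustration.
toy_bikes <- data.frame(
  season    = c(1, 1, 4, 4, 2),
  humidity  = c(81, 80, 75, 76, 60),
  windspeed = c(0, 41, 44, 12, 50),
  count     = c(16, 40, 32, 13, 8)
)

dims   <- dim(toy_bikes)   # integer vector: rows, then columns
n_rows <- nrow(toy_bikes)  # number of rows only
n_cols <- ncol(toy_bikes)  # number of columns only
print(dims)
```

Applied to the workshop data, `dim(Bike_Data)` should report the same 17,379 rows and 13 columns shown by `str()`.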

Question 6

str(Bike_Data)
## spc_tbl_ [17,379 × 13] (S3: spec_tbl_df/tbl_df/tbl/data.frame)
##  $ datetime  : chr [1:17379] "1/1/2011 0:00" "1/1/2011 1:00" "1/1/2011 2:00" "1/1/2011 3:00" ...
##  $ season    : num [1:17379] 1 1 1 1 1 1 1 1 1 1 ...
##  $ holiday   : num [1:17379] 0 0 0 0 0 0 0 0 0 0 ...
##  $ workingday: num [1:17379] 0 0 0 0 0 0 0 0 0 0 ...
##  $ weather   : num [1:17379] 1 1 1 1 1 2 1 1 1 1 ...
##  $ temp      : num [1:17379] 9.84 9.02 9.02 9.84 9.84 ...
##  $ atemp     : num [1:17379] 14.4 13.6 13.6 14.4 14.4 ...
##  $ humidity  : num [1:17379] 81 80 80 75 75 75 80 86 75 76 ...
##  $ windspeed : num [1:17379] 0 0 0 0 0 ...
##  $ casual    : num [1:17379] 3 8 5 3 0 0 2 1 1 8 ...
##  $ registered: num [1:17379] 13 32 27 10 1 1 0 2 7 6 ...
##  $ count     : num [1:17379] 16 40 32 13 1 1 2 3 8 14 ...
##  $ sources   : chr [1:17379] "ad campaign" "www.yahoo.com" "www.google.fi" "AD campaign" ...
##  - attr(*, "spec")=
##   .. cols(
##   ..   datetime = col_character(),
##   ..   season = col_double(),
##   ..   holiday = col_double(),
##   ..   workingday = col_double(),
##   ..   weather = col_double(),
##   ..   temp = col_double(),
##   ..   atemp = col_double(),
##   ..   humidity = col_double(),
##   ..   windspeed = col_double(),
##   ..   casual = col_double(),
##   ..   registered = col_double(),
##   ..   count = col_double(),
##   ..   sources = col_character()
##   .. )
##  - attr(*, "problems")=<externalptr>
The humidity column is listed as num, so it is stored as a numeric (double) data type.
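
The type of a single column can also be checked directly with `class()`. A brief sketch with a made-up two-row data frame (`toy_bikes` is a hypothetical stand-in for `Bike_Data`):

```r
# Hypothetical stand-in for Bike_Data with one text and one numeric column.
toy_bikes <- data.frame(
  datetime = c("1/1/2011 0:00", "1/1/2011 1:00"),
  humidity = c(81, 80),
  stringsAsFactors = FALSE
)

# Class of one column
humidity_class <- class(toy_bikes$humidity)

# Class of every column at once, as a named character vector
all_classes <- sapply(toy_bikes, class)
print(all_classes)
```

On the workshop data the equivalent call would be `class(Bike_Data$humidity)`.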

Question 7

In row 6251, the value of season is 4.
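
Instead of scrolling to the row, the value can be pulled out by index. A minimal sketch on a made-up data frame (`toy_bikes` and `row_number` are hypothetical; the workshop data would use row 6251):

```r
# Hypothetical stand-in for Bike_Data; only the season column is needed here.
toy_bikes <- data.frame(season = c(1, 1, 2, 3, 4, 4))

# Row to look up (6251 in the workshop data, 5 in this toy example).
row_number <- 5

# Index the column vector by position.
season_value <- toy_bikes$season[row_number]
print(season_value)
```

For the workshop data the call would be `Bike_Data$season[6251]`.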

Question 8

table(Bike_Data$season)
## 
##    1    2    3    4 
## 4242 4409 4496 4232
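The table shows 4,232 rows for season 4 (winter). Each row is one hourly record, so this is the number of hourly observations recorded in winter rather than the number of individual rentals.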
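
Because each row is one hourly record, `table()` returns the number of rows per season; summing the `count` column per season gives ride totals instead. A short sketch of the difference, using a made-up data frame (`toy_bikes` is hypothetical and its values are illustrative):

```r
# Hypothetical stand-in for Bike_Data: two rows in season 1, three in season 4.
toy_bikes <- data.frame(
  season = c(1, 1, 4, 4, 4),
  count  = c(16, 40, 32, 13, 2)
)

# Number of rows (hourly records) in each season
rows_per_season <- table(toy_bikes$season)

# Total of the count column in each season
rentals_per_season <- tapply(toy_bikes$count, toy_bikes$season, sum)

print(rows_per_season)
print(rentals_per_season)
```

On the workshop data, `tapply(Bike_Data$count, Bike_Data$season, sum)` would give the per-season totals of the `count` column.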

Question 10

subset(Bike_Data, windspeed >= 40 & season %in% c(1, 4))
## # A tibble: 46 × 13
##    datetime     season holiday workingday weather  temp atemp humidity windspeed
##    <chr>         <dbl>   <dbl>      <dbl>   <dbl> <dbl> <dbl>    <dbl>     <dbl>
##  1 2/14/2011 1…      1       0          1       1  23.0  26.5       21      44.0
##  2 2/14/2011 1…      1       0          1       1  18.9  22.7       33      41.0
##  3 2/14/2011 1…      1       0          1       1  16.4  20.5       40      41.0
##  4 2/14/2011 2…      1       0          1       1  13.9  14.4       46      44.0
##  5 2/15/2011 1…      1       0          1       1  12.3  12.1       42      52.0
##  6 2/15/2011 2…      1       0          1       1  11.5  11.4       41      46.0
##  7 2/19/2011 9…      1       0          0       1  16.4  20.5       16      44.0
##  8 2/19/2011 1…      1       0          0       1  18.0  22.0       16      41.0
##  9 2/19/2011 1…      1       0          0       1  18.9  22.7       15      44.0
## 10 2/19/2011 1…      1       0          0       1  18.0  22.0       16      50.0
## # ℹ 36 more rows
## # ℹ 4 more variables: casual <dbl>, registered <dbl>, count <dbl>,
## #   sources <chr>

Filtering for wind speeds of 40 miles per hour or more in winter or spring (seasons 4 and 1) returns 46 rows.
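
The row count can also be computed directly rather than read from the tibble header. A minimal sketch of the same filter on a made-up data frame (`toy_bikes` is hypothetical; the thresholds match the ones used above):

```r
# Hypothetical stand-in for Bike_Data with only the columns the filter needs.
toy_bikes <- data.frame(
  season    = c(1, 1, 2, 4, 4),
  windspeed = c(44, 12, 50, 41, 39)
)

# Keep rows with wind speed of at least 40 in season 1 or season 4.
windy <- subset(toy_bikes, windspeed >= 40 & season %in% c(1, 4))

# Count the matching rows.
n_windy <- nrow(windy)
print(n_windy)
```

Wrapping the workshop filter in `nrow()` in the same way should return the 46 reported above.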