Last year I started a project on YouTube called Skate Project. Its intent was to help me and the viewer get good at skateboarding using data. I explain how I do it using software like R: I draw graphs and find the standard deviation, which reveals a lot of information. That was the word-for-word explanation I took from YouTube. This is a report of that analysis, with a codebook that I recommend reading before going through this report or replicating it.
Our goal is to find out, using data, which tricks I’m good at and which tricks I should start with in a competition like a game of S.K.A.T.E. (sometimes called H.O.R.S.E.) or in skate lines. First things first: importing and cleaning.
raw data: google sheets download
tidy data: cleaner dataset download
First we load all the packages we’ll need throughout the analysis, then read the data into R from wherever you stored it on your computer.
library(psych)
library(ggplot2)
Attaching package: ‘ggplot2’
The following objects are masked from ‘package:psych’:
%+%, alpha
library(magrittr)
library(dplyr)
Attaching package: ‘dplyr’
The following objects are masked from ‘package:stats’:
filter, lag
The following objects are masked from ‘package:base’:
intersect, setdiff, setequal, union
library(lubridate)
Attaching package: ‘lubridate’
The following object is masked from ‘package:base’:
date
library(tidyr)
Attaching package: ‘tidyr’
The following object is masked from ‘package:magrittr’:
extract
library(Hmisc)
Loading required package: lattice
Loading required package: survival
Loading required package: Formula
Attaching package: ‘Hmisc’
The following objects are masked from ‘package:dplyr’:
combine, src, summarize
The following object is masked from ‘package:psych’:
describe
The following objects are masked from ‘package:base’:
format.pval, round.POSIXt, trunc.POSIXt, units
library(gridExtra)
Attaching package: ‘gridExtra’
The following object is masked from ‘package:Hmisc’:
combine
The following object is masked from ‘package:dplyr’:
combine
# change to the new csv file with parsed dates and change the columns
# place to numbers later
path <- file.path("/Users/brendamainye/Downloads/datasets/skate_project1.csv")
skate_df <- read.csv(path)
Next, we explore the dataset for clues about what needs cleaning: inconsistent column names (already fixed), missing data (there’s one day I didn’t record, so I backfilled it with the date ahead of it, 2016-07-29; refer to the codebook for more information), outliers (the scores are already on a 0 to 5 scale, so this won’t be a problem this time), duplicate rows (none), and untidiness (this is a tidy dataset, don’t you think?). Do you see anything strange?
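The checks listed above can be sketched in a few lines of base R. This is a minimal sketch, not the code used for the original analysis: toy_df is a made-up stand-in for skate_df, its values are invented for illustration, and the 0 to 5 range is taken from the description of the scores.

```r
# Toy stand-in for skate_df; the values are made up for illustration.
toy_df <- data.frame(
  ollie  = c(5, 5, 3, 5),
  fs.180 = c(2, 4, NA, 4),
  date   = c("2016-07-29", "2016-07-29", "2016-08-14", "2016-08-28"),
  stringsAsFactors = FALSE
)

# Missing values per column
na_counts <- colSums(is.na(toy_df))

# Number of fully duplicated rows
n_dupes <- sum(duplicated(toy_df))

# Scores outside the expected 0 to 5 range (NA values are ignored)
score_cols <- c("ollie", "fs.180")
out_of_range <- sapply(toy_df[score_cols],
                       function(x) sum(x < 0 | x > 5, na.rm = TRUE))

print(na_counts)
print(n_dupes)
print(out_of_range)
```

Running the same three checks on the real skate_df (with score_cols set to the trick columns) should confirm what the paragraph above claims: no missing values, no duplicate rows, and no scores outside the scale.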
dim(skate_df) # 29 rows and 23 columns
[1] 29 23
names(skate_df) # the column names do not require cleaning
 [1] "X" "ollie" "fs.180" "bs.180"
[5] "pop.shove" "fs.shove" "kickflip" "heelflip"
[9] "f.180" "f.ollie" "f.shove" "f.fs.shove"
[13] "sw.fs.bone" "sw.fs.no.comply" "f.bigspin" "sw.bone"
[17] "halfcab" "fingerflip" "sw.no.cmp" "date"
[21] "place" "randomized" "board"
class(skate_df) # data.frame, we need to convert this later
[1] "data.frame"
glimpse(skate_df) # Understanding the data set
Observations: 29
Variables: 23
$ X <int> 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16...
$ ollie <int> 5, 5, 5, 5, 5, 3, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5,...
$ fs.180 <int> 2, 4, 2, 5, 4, 3, 3, 3, 4, 3, 4, 1, 2, 4, 3, 3, 3, 4, 4,...
$ bs.180 <int> 1, 2, 4, 2, 3, 4, 4, 3, 1, 2, 3, 4, 5, 4, 1, 3, 3, 3, 4,...
$ pop.shove <int> 4, 3, 2, 3, 1, 0, 2, 3, 1, 4, 2, 3, 3, 2, 1, 2, 4, 2, 0,...
$ fs.shove <int> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 2,...
$ kickflip <int> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,...
$ heelflip <int> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,...
$ f.180 <int> 3, 2, 2, 3, 2, 2, 2, 0, 3, 1, 0, 2, 4, 3, 3, 3, 2, 2, 3,...
$ f.ollie <int> 4, 4, 3, 4, 3, 4, 2, 4, 4, 4, 4, 3, 4, 4, 2, 4, 5, 2, 5,...
$ f.shove <int> 4, 4, 3, 5, 3, 4, 3, 4, 5, 3, 5, 5, 2, 2, 2, 3, 3, 3, 3,...
$ f.fs.shove <int> 2, 1, 1, 0, 1, 1, 2, 2, 2, 0, 3, 2, 1, 1, 1, 1, 1, 3, 2,...
$ sw.fs.bone <int> 4, 4, 5, 5, 4, 5, 5, 4, 5, 5, 5, 5, 5, 5, 4, 5, 4, 5, 2,...
$ sw.fs.no.comply <int> 4, 3, 5, 5, 3, 3, 4, 2, 4, 2, 3, 3, 3, 2, 2, 4, 4, 3, 2,...
$ f.bigspin <int> 0, 1, 1, 0, 2, 0, 0, 1, 1, 1, 1, 0, 0, 0, 1, 0, 0, 2, 0,...
$ sw.bone <int> 4, 4, 5, 5, 4, 2, 5, 4, 5, 4, 2, 5, 5, 5, 5, 5, 5, 5, 4,...
$ halfcab <int> 4, 4, 3, 5, 5, 5, 4, 3, 4, 4, 5, 4, 2, 5, 5, 5, 5, 5, 5,...
$ fingerflip <int> 3, 3, 3, 4, 4, 1, 2, 2, 1, 4, 3, 2, 5, 2, 3, 4, 2, 2, 1,...
$ sw.no.cmp <int> 4, 4, 3, 5, 3, 2, 1, 3, 3, 3, 1, 4, 2, 3, 2, 2, 4, 4, 4,...
$ date <fctr> 2016-10-07, 2016-07-29, 2016-07-29, 2016-05-08, 2016-08...
$ place <fctr> marist, uhuru, comboni, marist, mparkkaren, comboni, ma...
$ randomized <int> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1,...
$ board <int> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1,...
summary(skate_df) # Understanding the data set: Summary statistics
 X ollie fs.180 bs.180 pop.shove
Min. : 0 Min. :3.000 Min. :1.000 Min. :1.000 Min. :0.000
1st Qu.: 7 1st Qu.:5.000 1st Qu.:3.000 1st Qu.:2.000 1st Qu.:1.000
Median :14 Median :5.000 Median :3.000 Median :3.000 Median :2.000
Mean :14 Mean :4.931 Mean :3.241 Mean :2.966 Mean :2.172
3rd Qu.:21 3rd Qu.:5.000 3rd Qu.:4.000 3rd Qu.:4.000 3rd Qu.:3.000
Max. :28 Max. :5.000 Max. :5.000 Max. :5.000 Max. :4.000
fs.shove kickflip heelflip f.180 f.ollie
Min. :0.0000 Min. :0 Min. :0 Min. :0.000 Min. :2.000
1st Qu.:0.0000 1st Qu.:0 1st Qu.:0 1st Qu.:1.000 1st Qu.:3.000
Median :0.0000 Median :0 Median :0 Median :2.000 Median :4.000
Mean :0.2414 Mean :0 Mean :0 Mean :2.034 Mean :3.759
3rd Qu.:0.0000 3rd Qu.:0 3rd Qu.:0 3rd Qu.:3.000 3rd Qu.:4.000
Max. :2.0000 Max. :0 Max. :0 Max. :4.000 Max. :5.000
f.shove f.fs.shove sw.fs.bone sw.fs.no.comply f.bigspin
Min. :2.000 Min. :0.000 Min. :2.000 Min. :1.000 Min. :0.0000
1st Qu.:3.000 1st Qu.:1.000 1st Qu.:4.000 1st Qu.:2.000 1st Qu.:0.0000
Median :4.000 Median :2.000 Median :5.000 Median :3.000 Median :1.0000
Mean :3.621 Mean :1.517 Mean :4.517 Mean :3.172 Mean :0.6897
3rd Qu.:4.000 3rd Qu.:2.000 3rd Qu.:5.000 3rd Qu.:4.000 3rd Qu.:1.0000
Max. :5.000 Max. :3.000 Max. :5.000 Max. :5.000 Max. :2.0000
sw.bone halfcab fingerflip sw.no.cmp date
Min. :2.000 Min. :2.000 Min. :1.000 Min. :1.000 2016-07-29: 2
1st Qu.:4.000 1st Qu.:4.000 1st Qu.:2.000 1st Qu.:2.000 2016-02-12: 1
Median :5.000 Median :5.000 Median :3.000 Median :3.000 2016-04-09: 1
Mean :4.345 Mean :4.414 Mean :2.724 Mean :3.069 2016-05-08: 1
3rd Qu.:5.000 3rd Qu.:5.000 3rd Qu.:4.000 3rd Qu.:4.000 2016-08-14: 1
Max. :5.000 Max. :5.000 Max. :5.000 Max. :5.000 2016-08-28: 1
(Other) :22
place randomized board
comboni : 8 Min. :0.0000 Min. :0.0000
hilltoprdng: 1 1st Qu.:0.0000 1st Qu.:0.0000
marist :10 Median :1.0000 Median :0.0000
mparkkaren : 1 Mean :0.6552 Mean :0.4828
skatepark : 5 3rd Qu.:1.0000 3rd Qu.:1.0000
uhuru : 4 Max. :1.0000 Max. :1.0000
describe(skate_df) # Understanding the data set: Summary statistics continued
no non-missing arguments to min; returning Inf
no non-missing arguments to min; returning Inf
skate_df
23 Variables 29 Observations
-------------------------------------------------------------------------------------
X
n missing distinct Info Mean Gmd .05 .10 .25
29 0 29 1 14 10 1.4 2.8 7.0
.50 .75 .90 .95
14.0 21.0 25.2 26.6
lowest : 0 1 2 3 4, highest: 24 25 26 27 28
-------------------------------------------------------------------------------------
ollie
n missing distinct Info Mean Gmd
29 0 2 0.1 4.931 0.1379
Value 3 5
Frequency 1 28
Proportion 0.034 0.966
-------------------------------------------------------------------------------------
fs.180
n missing distinct Info Mean Gmd
29 0 5 0.9 3.241 1.044
Value 1 2 3 4 5
Frequency 1 5 11 10 2
Proportion 0.034 0.172 0.379 0.345 0.069
-------------------------------------------------------------------------------------
bs.180
n missing distinct Info Mean Gmd
29 0 5 0.931 2.966 1.291
Value 1 2 3 4 5
Frequency 4 5 10 8 2
Proportion 0.138 0.172 0.345 0.276 0.069
-------------------------------------------------------------------------------------
pop.shove
n missing distinct Info Mean Gmd
29 0 5 0.932 2.172 1.31
Value 0 1 2 3 4
Frequency 3 5 8 10 3
Proportion 0.103 0.172 0.276 0.345 0.103
-------------------------------------------------------------------------------------
fs.shove
n missing distinct Info Mean Gmd
29 0 3 0.432 0.2414 0.4286
Value 0 1 2
Frequency 24 3 2
Proportion 0.828 0.103 0.069
-------------------------------------------------------------------------------------
kickflip
n missing distinct Info Mean Gmd
29 0 1 0 0 0
Value 0
Frequency 29
Proportion 1
-------------------------------------------------------------------------------------
heelflip
n missing distinct Info Mean Gmd
29 0 1 0 0 0
Value 0
Frequency 29
Proportion 1
-------------------------------------------------------------------------------------
f.180
n missing distinct Info Mean Gmd
29 0 5 0.913 2.034 1.163
Value 0 1 2 3 4
Frequency 3 5 10 10 1
Proportion 0.103 0.172 0.345 0.345 0.034
-------------------------------------------------------------------------------------
f.ollie
n missing distinct Info Mean Gmd
29 0 4 0.823 3.759 0.9557
Value 2 3 4 5
Frequency 4 4 16 5
Proportion 0.138 0.138 0.552 0.172
-------------------------------------------------------------------------------------
f.shove
n missing distinct Info Mean Gmd
29 0 4 0.907 3.621 1.049
Value 2 3 4 5
Frequency 3 11 9 6
Proportion 0.103 0.379 0.310 0.207
-------------------------------------------------------------------------------------
f.fs.shove
n missing distinct Info Mean Gmd
29 0 4 0.9 1.517 1.01
Value 0 1 2 3
Frequency 4 10 11 4
Proportion 0.138 0.345 0.379 0.138
-------------------------------------------------------------------------------------
sw.fs.bone
n missing distinct Info Mean Gmd
29 0 4 0.705 4.517 0.7291
Value 2 3 4 5
Frequency 1 2 7 19
Proportion 0.034 0.069 0.241 0.655
-------------------------------------------------------------------------------------
sw.fs.no.comply
n missing distinct Info Mean Gmd
29 0 5 0.924 3.172 1.163
Value 1 2 3 4 5
Frequency 1 7 10 8 3
Proportion 0.034 0.241 0.345 0.276 0.103
-------------------------------------------------------------------------------------
f.bigspin
n missing distinct Info Mean Gmd
29 0 3 0.837 0.6897 0.7586
Value 0 1 2
Frequency 13 12 4
Proportion 0.448 0.414 0.138
-------------------------------------------------------------------------------------
sw.bone
n missing distinct Info Mean Gmd
29 0 4 0.802 4.345 0.8916
Value 2 3 4 5
Frequency 2 2 9 16
Proportion 0.069 0.069 0.310 0.552
-------------------------------------------------------------------------------------
halfcab
n missing distinct Info Mean Gmd
29 0 4 0.777 4.414 0.8177
Value 2 3 4 5
Frequency 1 3 8 17
Proportion 0.034 0.103 0.276 0.586
-------------------------------------------------------------------------------------
fingerflip
n missing distinct Info Mean Gmd
29 0 5 0.946 2.724 1.419
Value 1 2 3 4 5
Frequency 5 9 7 5 3
Proportion 0.172 0.310 0.241 0.172 0.103
-------------------------------------------------------------------------------------
sw.no.cmp
n missing distinct Info Mean Gmd
29 0 5 0.927 3.069 1.197
Value 1 2 3 4 5
Frequency 2 7 9 9 2
Proportion 0.069 0.241 0.310 0.310 0.069
-------------------------------------------------------------------------------------
date
n missing distinct
29 0 28
lowest : 2016-02-12 2016-04-09 2016-05-08 2016-07-29 2016-08-14
highest: 2017-06-30 2017-07-15 2017-07-22 2017-09-06 2017-12-02
-------------------------------------------------------------------------------------
place
n missing distinct
29 0 6
Value comboni hilltoprdng marist mparkkaren skatepark uhuru
Frequency 8 1 10 1 5 4
Proportion 0.276 0.034 0.345 0.034 0.172 0.138
-------------------------------------------------------------------------------------
randomized
n missing distinct Info Sum Mean Gmd
29 0 2 0.679 19 0.6552 0.468
-------------------------------------------------------------------------------------
board
n missing distinct Info Sum Mean Gmd
29 0 2 0.75 14 0.4828 0.5172
-------------------------------------------------------------------------------------
# many questions we'll explore in the dataset
skate_df <- tbl_df(skate_df) # converts to a format liked by dplyr
skate_df$board <- factor(skate_df$board) #convert into factors
skate_df$randomized <- factor(skate_df$randomized) #convert into factors
(skate_df <- skate_df %>% select(-X)) # remove the X column since it's just a row count, and print the result; same as skate_df <- skate_df[-1]
We’ll do a lot of summary statistics and plotting here to find patterns in the data, which I hope you’ll see as we go through this analysis. We’ll check whether the mean of each trick varies in an unusual way when grouped by the variables randomized and place. We won’t dwell on this for now, since I don’t want to develop biases, but with time we shall. Do you see anything interesting?
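To make the idea of "the mean of each trick, grouped by a variable" concrete, here is a minimal sketch using base R's aggregate(). It is only a narrower view of what describeBy() reports in the full output; toy_df and its values are made up for illustration and are not taken from the real data.

```r
# Toy stand-in for skate_df; the values are made up for illustration.
toy_df <- data.frame(
  place  = c("marist", "marist", "comboni", "comboni", "uhuru"),
  ollie  = c(5, 5, 3, 5, 5),
  fs.180 = c(2, 4, 2, 5, 4)
)

# Mean score of each trick within each place
group_means <- aggregate(cbind(ollie, fs.180) ~ place, data = toy_df, FUN = mean)
print(group_means)
```

With only a handful of sessions per place, differences between group means like these are easy to over-read, which is one reason not to lean on them too early.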
# Does the mean vary among the places i've visited?
describeBy(skate_df, group = skate_df$place)
Descriptive statistics by group
group: comboni
vars n mean sd median trimmed mad min max range skew kurtosis
ollie 1 8 4.75 0.71 5.0 4.75 0.00 3 5 2 -1.86 1.70
fs.180 2 8 3.25 1.04 3.0 3.25 1.48 2 5 3 0.25 -1.38
bs.180 3 8 3.00 1.51 3.5 3.00 1.48 1 5 4 -0.22 -1.76
pop.shove 4 8 1.75 1.28 1.5 1.75 0.74 0 4 4 0.40 -1.22
fs.shove 5 8 0.38 0.74 0.0 0.38 0.00 0 2 2 1.28 -0.05
kickflip 6 8 0.00 0.00 0.0 0.00 0.00 0 0 0 NaN NaN
heelflip 7 8 0.00 0.00 0.0 0.00 0.00 0 0 0 NaN NaN
f.180 8 8 2.00 0.76 2.0 2.00 0.74 1 3 2 0.00 -1.47
f.ollie 9 8 4.00 1.07 4.0 4.00 1.48 2 5 3 -0.61 -1.09
f.shove 10 8 3.50 0.93 3.5 3.50 0.74 2 5 3 0.00 -1.21
f.fs.shove 11 8 1.25 0.71 1.0 1.25 0.74 0 2 2 -0.27 -1.30
sw.fs.bone 12 8 4.75 0.46 5.0 4.75 0.00 4 5 1 -0.95 -1.21
sw.fs.no.comply 13 8 3.38 1.30 4.0 3.38 0.74 1 5 4 -0.61 -1.13
f.bigspin 14 8 0.50 0.53 0.5 0.50 0.74 0 1 1 0.00 -2.23
sw.bone 15 8 4.25 1.16 5.0 4.25 0.00 2 5 3 -0.89 -0.99
halfcab 16 8 4.50 0.76 5.0 4.50 0.00 3 5 2 -0.87 -0.89
fingerflip 17 8 2.50 1.31 2.5 2.50 0.74 1 5 4 0.50 -0.89
sw.no.cmp 18 8 3.00 0.76 3.0 3.00 0.74 2 4 2 0.00 -1.47
date* 19 8 15.00 8.77 15.0 15.00 11.86 4 28 24 0.16 -1.68
place* 20 8 1.00 0.00 1.0 1.00 0.00 1 1 0 NaN NaN
randomized* 21 8 1.62 0.52 2.0 1.62 0.00 1 2 1 -0.42 -2.03
board* 22 8 1.50 0.53 1.5 1.50 0.74 1 2 1 0.00 -2.23
se
ollie 0.25
fs.180 0.37
bs.180 0.53
pop.shove 0.45
fs.shove 0.26
kickflip 0.00
heelflip 0.00
f.180 0.27
f.ollie 0.38
f.shove 0.33
f.fs.shove 0.25
sw.fs.bone 0.16
sw.fs.no.comply 0.46
f.bigspin 0.19
sw.bone 0.41
halfcab 0.27
fingerflip 0.46
sw.no.cmp 0.27
date* 3.10
place* 0.00
randomized* 0.18
board* 0.19
---------------------------------------------------------------
group: hilltoprdng
vars n mean sd median trimmed mad min max range skew kurtosis se
ollie 1 1 5 NA 5 5 0 5 5 0 NA NA NA
fs.180 2 1 4 NA 4 4 0 4 4 0 NA NA NA
bs.180 3 1 4 NA 4 4 0 4 4 0 NA NA NA
pop.shove 4 1 0 NA 0 0 0 0 0 0 NA NA NA
fs.shove 5 1 0 NA 0 0 0 0 0 0 NA NA NA
kickflip 6 1 0 NA 0 0 0 0 0 0 NA NA NA
heelflip 7 1 0 NA 0 0 0 0 0 0 NA NA NA
f.180 8 1 3 NA 3 3 0 3 3 0 NA NA NA
f.ollie 9 1 4 NA 4 4 0 4 4 0 NA NA NA
f.shove 10 1 3 NA 3 3 0 3 3 0 NA NA NA
f.fs.shove 11 1 0 NA 0 0 0 0 0 0 NA NA NA
sw.fs.bone 12 1 5 NA 5 5 0 5 5 0 NA NA NA
sw.fs.no.comply 13 1 3 NA 3 3 0 3 3 0 NA NA NA
f.bigspin 14 1 0 NA 0 0 0 0 0 0 NA NA NA
sw.bone 15 1 5 NA 5 5 0 5 5 0 NA NA NA
halfcab 16 1 3 NA 3 3 0 3 3 0 NA NA NA
fingerflip 17 1 5 NA 5 5 0 5 5 0 NA NA NA
sw.no.cmp 18 1 2 NA 2 2 0 2 2 0 NA NA NA
date* 19 1 21 NA 21 21 0 21 21 0 NA NA NA
place* 20 1 2 NA 2 2 0 2 2 0 NA NA NA
randomized* 21 1 2 NA 2 2 0 2 2 0 NA NA NA
board* 22 1 2 NA 2 2 0 2 2 0 NA NA NA
---------------------------------------------------------------
group: marist
vars n mean sd median trimmed mad min max range skew kurtosis
ollie 1 10 5.0 0.00 5.0 5.00 0.00 5 5 0 NaN NaN
fs.180 2 10 2.9 1.10 3.0 2.88 0.74 1 5 4 0.17 -0.60
bs.180 3 10 3.1 1.20 3.0 3.12 1.48 1 5 4 -0.17 -1.18
pop.shove 4 10 2.5 1.27 3.0 2.62 1.48 0 4 4 -0.59 -0.90
fs.shove 5 10 0.4 0.70 0.0 0.25 0.00 0 2 2 1.19 -0.07
kickflip 6 10 0.0 0.00 0.0 0.00 0.00 0 0 0 NaN NaN
heelflip 7 10 0.0 0.00 0.0 0.00 0.00 0 0 0 NaN NaN
f.180 8 10 2.4 1.17 3.0 2.50 0.74 0 4 4 -0.71 -0.67
f.ollie 9 10 3.7 0.82 4.0 3.75 0.00 2 5 3 -0.58 -0.45
f.shove 10 10 3.7 1.06 3.5 3.75 0.74 2 5 3 0.03 -1.58
f.fs.shove 11 10 1.4 0.97 1.5 1.38 0.74 0 3 3 -0.08 -1.30
sw.fs.bone 12 10 4.6 0.97 5.0 4.88 0.00 2 5 3 -1.92 2.28
sw.fs.no.comply 13 10 3.6 1.07 4.0 3.62 1.48 2 5 3 -0.23 -1.42
f.bigspin 14 10 0.4 0.70 0.0 0.25 0.00 0 2 2 1.19 -0.07
sw.bone 15 10 4.5 0.71 5.0 4.62 0.00 3 5 2 -0.85 -0.75
halfcab 16 10 4.3 0.95 4.5 4.50 0.74 2 5 3 -1.24 0.61
fingerflip 17 10 2.8 1.40 2.5 2.75 2.22 1 5 4 0.10 -1.64
sw.no.cmp 18 10 3.1 1.29 3.5 3.12 1.48 1 5 4 -0.16 -1.56
date* 19 10 11.7 8.33 11.0 11.50 11.12 1 24 23 0.13 -1.57
place* 20 10 3.0 0.00 3.0 3.00 0.00 3 3 0 NaN NaN
randomized* 21 10 1.6 0.52 2.0 1.62 0.00 1 2 1 -0.35 -2.05
board* 22 10 1.4 0.52 1.0 1.38 0.00 1 2 1 0.35 -2.05
se
ollie 0.00
fs.180 0.35
bs.180 0.38
pop.shove 0.40
fs.shove 0.22
kickflip 0.00
heelflip 0.00
f.180 0.37
f.ollie 0.26
f.shove 0.33
f.fs.shove 0.31
sw.fs.bone 0.31
sw.fs.no.comply 0.34
f.bigspin 0.22
sw.bone 0.22
halfcab 0.30
fingerflip 0.44
sw.no.cmp 0.41
date* 2.63
place* 0.00
randomized* 0.16
board* 0.16
---------------------------------------------------------------
group: mparkkaren
vars n mean sd median trimmed mad min max range skew kurtosis se
ollie 1 1 5 NA 5 5 0 5 5 0 NA NA NA
fs.180 2 1 4 NA 4 4 0 4 4 0 NA NA NA
bs.180 3 1 3 NA 3 3 0 3 3 0 NA NA NA
pop.shove 4 1 1 NA 1 1 0 1 1 0 NA NA NA
fs.shove 5 1 0 NA 0 0 0 0 0 0 NA NA NA
kickflip 6 1 0 NA 0 0 0 0 0 0 NA NA NA
heelflip 7 1 0 NA 0 0 0 0 0 0 NA NA NA
f.180 8 1 2 NA 2 2 0 2 2 0 NA NA NA
f.ollie 9 1 3 NA 3 3 0 3 3 0 NA NA NA
f.shove 10 1 3 NA 3 3 0 3 3 0 NA NA NA
f.fs.shove 11 1 1 NA 1 1 0 1 1 0 NA NA NA
sw.fs.bone 12 1 4 NA 4 4 0 4 4 0 NA NA NA
sw.fs.no.comply 13 1 3 NA 3 3 0 3 3 0 NA NA NA
f.bigspin 14 1 2 NA 2 2 0 2 2 0 NA NA NA
sw.bone 15 1 4 NA 4 4 0 4 4 0 NA NA NA
halfcab 16 1 5 NA 5 5 0 5 5 0 NA NA NA
fingerflip 17 1 4 NA 4 4 0 4 4 0 NA NA NA
sw.no.cmp 18 1 3 NA 3 3 0 3 3 0 NA NA NA
date* 19 1 5 NA 5 5 0 5 5 0 NA NA NA
place* 20 1 4 NA 4 4 0 4 4 0 NA NA NA
randomized* 21 1 1 NA 1 1 0 1 1 0 NA NA NA
board* 22 1 1 NA 1 1 0 1 1 0 NA NA NA
---------------------------------------------------------------
group: skatepark
vars n mean sd median trimmed mad min max range skew kurtosis
ollie 1 5 5.0 0.00 5 5.0 0.00 5 5 0 NaN NaN
fs.180 2 5 3.4 0.89 4 3.4 0.00 2 4 2 -0.60 -1.67
bs.180 3 5 3.0 0.71 3 3.0 0.00 2 4 2 0.00 -1.40
pop.shove 4 5 2.4 0.55 2 2.4 0.00 2 3 1 0.29 -2.25
fs.shove 5 5 0.0 0.00 0 0.0 0.00 0 0 0 NaN NaN
kickflip 6 5 0.0 0.00 0 0.0 0.00 0 0 0 NaN NaN
heelflip 7 5 0.0 0.00 0 0.0 0.00 0 0 0 NaN NaN
f.180 8 5 1.6 1.14 2 1.6 1.48 0 3 3 -0.19 -1.75
f.ollie 9 5 3.4 1.34 4 3.4 1.48 2 5 3 -0.08 -2.11
f.shove 10 5 3.6 1.14 4 3.6 1.48 2 5 3 -0.19 -1.75
f.fs.shove 11 5 2.2 0.84 2 2.2 1.48 1 3 2 -0.25 -1.82
sw.fs.bone 12 5 4.4 0.89 5 4.4 0.00 3 5 2 -0.60 -1.67
sw.fs.no.comply 13 5 2.6 0.55 3 2.6 0.00 2 3 1 -0.29 -2.25
f.bigspin 14 5 1.2 0.84 1 1.2 1.48 0 2 2 -0.25 -1.82
sw.bone 15 5 4.0 1.22 4 4.0 1.48 2 5 3 -0.65 -1.40
halfcab 16 5 5.0 0.00 5 5.0 0.00 5 5 0 NaN NaN
fingerflip 17 5 2.6 0.89 2 2.6 0.00 2 4 2 0.60 -1.67
sw.no.cmp 18 5 3.0 1.22 3 3.0 1.48 1 4 3 -0.65 -1.40
date* 19 5 17.8 6.22 17 17.8 7.41 11 26 15 0.17 -2.00
place* 20 5 5.0 0.00 5 5.0 0.00 5 5 0 NaN NaN
randomized* 21 5 2.0 0.00 2 2.0 0.00 2 2 0 NaN NaN
board* 22 5 1.6 0.55 2 1.6 0.00 1 2 1 -0.29 -2.25
se
ollie 0.00
fs.180 0.40
bs.180 0.32
pop.shove 0.24
fs.shove 0.00
kickflip 0.00
heelflip 0.00
f.180 0.51
f.ollie 0.60
f.shove 0.51
f.fs.shove 0.37
sw.fs.bone 0.40
sw.fs.no.comply 0.24
f.bigspin 0.37
sw.bone 0.55
halfcab 0.00
fingerflip 0.40
sw.no.cmp 0.55
date* 2.78
place* 0.00
randomized* 0.00
board* 0.24
---------------------------------------------------------------
group: uhuru
vars n mean sd median trimmed mad min max range skew kurtosis
ollie 1 4 5.00 0.00 5.0 5.00 0.00 5 5 0 NaN NaN
fs.180 2 4 3.50 0.58 3.5 3.50 0.74 3 4 1 0.00 -2.44
bs.180 3 4 2.25 0.96 2.5 2.25 0.74 1 3 2 -0.32 -2.08
pop.shove 4 4 2.75 0.50 3.0 2.75 0.00 2 3 1 -0.75 -1.69
fs.shove 5 4 0.00 0.00 0.0 0.00 0.00 0 0 0 NaN NaN
kickflip 6 4 0.00 0.00 0.0 0.00 0.00 0 0 0 NaN NaN
heelflip 7 4 0.00 0.00 0.0 0.00 0.00 0 0 0 NaN NaN
f.180 8 4 1.50 1.29 1.5 1.50 1.48 0 3 3 0.00 -2.08
f.ollie 9 4 4.00 0.00 4.0 4.00 0.00 4 4 0 NaN NaN
f.shove 10 4 4.00 0.82 4.0 4.00 0.74 3 5 2 0.00 -1.88
f.fs.shove 11 4 2.00 0.82 2.0 2.00 0.74 1 3 2 0.00 -1.88
sw.fs.bone 12 4 4.00 0.82 4.0 4.00 0.74 3 5 2 0.00 -1.88
sw.fs.no.comply 13 4 2.50 0.58 2.5 2.50 0.74 2 3 1 0.00 -2.44
f.bigspin 14 4 1.00 0.00 1.0 1.00 0.00 1 1 0 NaN NaN
sw.bone 15 4 4.50 0.58 4.5 4.50 0.74 4 5 1 0.00 -2.44
halfcab 16 4 4.00 0.82 4.0 4.00 0.74 3 5 2 0.00 -1.88
fingerflip 17 4 2.25 0.96 2.5 2.25 0.74 1 3 2 -0.32 -2.08
sw.no.cmp 18 4 3.50 1.29 3.5 3.50 1.48 2 5 3 0.00 -2.08
date* 19 4 14.50 10.85 13.5 14.50 11.86 4 27 23 0.11 -2.27
place* 20 4 6.00 0.00 6.0 6.00 0.00 6 6 0 NaN NaN
randomized* 21 4 1.50 0.58 1.5 1.50 0.74 1 2 1 0.00 -2.44
board* 22 4 1.50 0.58 1.5 1.50 0.74 1 2 1 0.00 -2.44
se
ollie 0.00
fs.180 0.29
bs.180 0.48
pop.shove 0.25
fs.shove 0.00
kickflip 0.00
heelflip 0.00
f.180 0.65
f.ollie 0.00
f.shove 0.41
f.fs.shove 0.41
sw.fs.bone 0.41
sw.fs.no.comply 0.29
f.bigspin 0.00
sw.bone 0.29
halfcab 0.41
fingerflip 0.48
sw.no.cmp 0.65
date* 5.42
place* 0.00
randomized* 0.29
board* 0.29
# 0 means the old skateboard, Nocto (it's broken now), and 1 means Darcy, the new skateboard
describeBy(skate_df, group = skate_df$board)
Descriptive statistics by group
group: 0
vars n mean sd median trimmed mad min max range skew kurtosis
ollie 1 15 4.87 0.52 5 5.00 0.00 3 5 2 -3.13 8.39
fs.180 2 15 3.13 1.06 3 3.15 1.48 1 5 4 -0.24 -0.86
bs.180 3 15 2.87 1.30 3 2.85 1.48 1 5 4 -0.14 -1.44
pop.shove 4 15 2.27 1.16 2 2.31 1.48 0 4 4 -0.23 -1.04
fs.shove 5 15 0.00 0.00 0 0.00 0.00 0 0 0 NaN NaN
kickflip 6 15 0.00 0.00 0 0.00 0.00 0 0 0 NaN NaN
heelflip 7 15 0.00 0.00 0 0.00 0.00 0 0 0 NaN NaN
f.180 8 15 2.13 1.13 2 2.15 1.48 0 4 4 -0.52 -0.59
f.ollie 9 15 3.53 0.74 4 3.62 0.00 2 4 2 -1.08 -0.43
f.shove 10 15 3.60 1.12 4 3.62 1.48 2 5 3 -0.09 -1.50
f.fs.shove 11 15 1.33 0.82 1 1.31 1.48 0 3 3 0.14 -0.73
sw.fs.bone 12 15 4.67 0.49 5 4.69 0.00 4 5 1 -0.64 -1.69
sw.fs.no.comply 13 15 3.20 1.01 3 3.15 1.48 2 5 3 0.40 -1.08
f.bigspin 14 15 0.60 0.63 1 0.54 1.48 0 2 2 0.44 -0.95
sw.bone 15 15 4.27 1.03 5 4.38 0.00 2 5 3 -1.22 0.23
halfcab 16 15 4.13 0.92 4 4.23 1.48 2 5 3 -0.76 -0.40
fingerflip 17 15 2.80 1.15 3 2.77 1.48 1 5 4 0.10 -0.98
sw.no.cmp 18 15 2.87 1.13 3 2.85 1.48 1 5 4 -0.04 -0.85
date* 19 15 7.27 4.13 7 7.23 4.45 1 14 13 0.12 -1.44
place* 20 15 3.20 1.74 3 3.15 2.97 1 6 5 0.17 -1.27
randomized* 21 15 1.33 0.49 1 1.31 0.00 1 2 1 0.64 -1.69
board* 22 15 1.00 0.00 1 1.00 0.00 1 1 0 NaN NaN
se
ollie 0.13
fs.180 0.27
bs.180 0.34
pop.shove 0.30
fs.shove 0.00
kickflip 0.00
heelflip 0.00
f.180 0.29
f.ollie 0.19
f.shove 0.29
f.fs.shove 0.21
sw.fs.bone 0.13
sw.fs.no.comply 0.26
f.bigspin 0.16
sw.bone 0.27
halfcab 0.24
fingerflip 0.30
sw.no.cmp 0.29
date* 1.07
place* 0.45
randomized* 0.13
board* 0.00
---------------------------------------------------------------
group: 1
vars n mean sd median trimmed mad min max range skew kurtosis
ollie 1 14 5.00 0.00 5.0 5.00 0.00 5 5 0 NaN NaN
fs.180 2 14 3.36 0.84 3.0 3.33 1.48 2 5 3 0.06 -0.86
bs.180 3 14 3.07 1.00 3.0 3.08 0.74 1 5 4 -0.13 -0.32
pop.shove 4 14 2.07 1.21 2.0 2.08 1.48 0 4 4 -0.37 -1.08
fs.shove 5 14 0.50 0.76 0.0 0.42 0.00 0 2 2 0.98 -0.67
kickflip 6 14 0.00 0.00 0.0 0.00 0.00 0 0 0 NaN NaN
heelflip 7 14 0.00 0.00 0.0 0.00 0.00 0 0 0 NaN NaN
f.180 8 14 1.93 1.00 2.0 2.00 1.48 0 3 3 -0.30 -1.31
f.ollie 9 14 4.00 1.04 4.0 4.08 1.48 2 5 3 -0.77 -0.66
f.shove 10 14 3.64 0.74 3.5 3.58 0.74 3 5 2 0.58 -1.13
f.fs.shove 11 14 1.71 0.99 2.0 1.75 1.48 0 3 3 -0.34 -1.08
sw.fs.bone 12 14 4.36 1.01 5.0 4.50 0.00 2 5 3 -1.10 -0.29
sw.fs.no.comply 13 14 3.14 1.10 3.0 3.17 1.48 1 5 4 -0.26 -1.01
f.bigspin 14 14 0.79 0.80 1.0 0.75 1.48 0 2 2 0.35 -1.48
sw.bone 15 14 4.43 0.76 5.0 4.50 0.00 3 5 2 -0.77 -0.96
halfcab 16 14 4.71 0.61 5.0 4.83 0.00 3 5 2 -1.72 1.72
fingerflip 17 14 2.64 1.39 2.0 2.58 1.48 1 5 4 0.44 -1.27
sw.no.cmp 18 14 3.29 0.99 3.5 3.25 0.74 2 5 3 -0.10 -1.46
date* 19 14 21.50 4.18 21.5 21.50 5.19 15 28 13 0.00 -1.46
place* 20 14 3.21 1.89 3.0 3.17 2.97 1 6 5 0.16 -1.60
randomized* 21 14 2.00 0.00 2.0 2.00 0.00 2 2 0 NaN NaN
board* 22 14 2.00 0.00 2.0 2.00 0.00 2 2 0 NaN NaN
se
ollie 0.00
fs.180 0.23
bs.180 0.27
pop.shove 0.32
fs.shove 0.20
kickflip 0.00
heelflip 0.00
f.180 0.27
f.ollie 0.28
f.shove 0.20
f.fs.shove 0.27
sw.fs.bone 0.27
sw.fs.no.comply 0.29
f.bigspin 0.21
sw.bone 0.20
halfcab 0.16
fingerflip 0.37
sw.no.cmp 0.27
date* 1.12
place* 0.50
randomized* 0.00
board* 0.00
# the codebook has more information about what 0 and 1 mean
describeBy(skate_df, group = skate_df$randomized)
Descriptive statistics by group
group: 0
vars n mean sd median trimmed mad min max range skew kurtosis
ollie 1 10 4.8 0.63 5.0 5.00 0.00 3 5 2 -2.28 3.57
fs.180 2 10 3.3 0.95 3.0 3.25 1.48 2 5 3 0.17 -1.17
bs.180 3 10 2.6 1.17 2.5 2.62 1.48 1 4 3 -0.03 -1.68
pop.shove 4 10 2.3 1.34 2.5 2.38 1.48 0 4 4 -0.24 -1.40
fs.shove 5 10 0.0 0.00 0.0 0.00 0.00 0 0 0 NaN NaN
kickflip 6 10 0.0 0.00 0.0 0.00 0.00 0 0 0 NaN NaN
heelflip 7 10 0.0 0.00 0.0 0.00 0.00 0 0 0 NaN NaN
f.180 8 10 2.0 0.94 2.0 2.12 0.74 0 3 3 -0.72 -0.47
f.ollie 9 10 3.6 0.70 4.0 3.75 0.00 2 4 2 -1.19 -0.07
f.shove 10 10 3.8 0.79 4.0 3.75 1.48 3 5 2 0.29 -1.50
f.fs.shove 11 10 1.2 0.79 1.0 1.25 1.48 0 2 2 -0.29 -1.50
sw.fs.bone 12 10 4.6 0.52 5.0 4.62 0.00 4 5 1 -0.35 -2.06
sw.fs.no.comply 13 10 3.5 1.08 3.5 3.50 0.74 2 5 3 0.00 -1.48
f.bigspin 14 10 0.7 0.67 1.0 0.62 0.74 0 2 2 0.31 -1.14
sw.bone 15 10 4.2 0.92 4.0 4.38 0.74 2 5 3 -1.11 0.52
halfcab 16 10 4.1 0.74 4.0 4.12 0.74 3 5 2 -0.12 -1.35
fingerflip 17 10 2.7 1.16 3.0 2.75 1.48 1 4 3 -0.25 -1.57
sw.no.cmp 18 10 3.1 1.10 3.0 3.12 0.74 1 5 4 -0.17 -0.60
date* 19 10 5.8 2.66 5.5 5.75 2.97 2 10 8 0.16 -1.53
place* 20 10 3.1 1.85 3.0 3.00 2.22 1 6 5 0.34 -1.30
randomized* 21 10 1.0 0.00 1.0 1.00 0.00 1 1 0 NaN NaN
board* 22 10 1.0 0.00 1.0 1.00 0.00 1 1 0 NaN NaN
se
ollie 0.20
fs.180 0.30
bs.180 0.37
pop.shove 0.42
fs.shove 0.00
kickflip 0.00
heelflip 0.00
f.180 0.30
f.ollie 0.22
f.shove 0.25
f.fs.shove 0.25
sw.fs.bone 0.16
sw.fs.no.comply 0.34
f.bigspin 0.21
sw.bone 0.29
halfcab 0.23
fingerflip 0.37
sw.no.cmp 0.35
date* 0.84
place* 0.59
randomized* 0.00
board* 0.00
---------------------------------------------------------------
group: 1
vars n mean sd median trimmed mad min max range skew kurtosis
ollie 1 19 5.00 0.00 5 5.00 0.00 5 5 0 NaN NaN
fs.180 2 19 3.21 0.98 3 3.24 1.48 1 5 4 -0.40 -0.49
bs.180 3 19 3.16 1.12 3 3.18 1.48 1 5 4 -0.29 -0.56
pop.shove 4 19 2.11 1.10 2 2.12 1.48 0 4 4 -0.43 -0.80
fs.shove 5 19 0.37 0.68 0 0.29 0.00 0 2 2 1.44 0.58
kickflip 6 19 0.00 0.00 0 0.00 0.00 0 0 0 NaN NaN
heelflip 7 19 0.00 0.00 0 0.00 0.00 0 0 0 NaN NaN
f.180 8 19 2.05 1.13 2 2.06 1.48 0 4 4 -0.32 -1.04
f.ollie 9 19 3.84 1.01 4 3.88 1.48 2 5 3 -0.61 -0.79
f.shove 10 19 3.53 1.02 3 3.53 1.48 2 5 3 0.08 -1.25
f.fs.shove 11 19 1.68 0.95 2 1.71 1.48 0 3 3 -0.13 -1.06
sw.fs.bone 12 19 4.47 0.90 5 4.59 0.00 2 5 3 -1.42 0.77
sw.fs.no.comply 13 19 3.00 1.00 3 3.00 1.48 1 5 4 0.00 -0.79
f.bigspin 14 19 0.68 0.75 1 0.65 1.48 0 2 2 0.52 -1.16
sw.bone 15 19 4.42 0.90 5 4.53 0.00 2 5 3 -1.29 0.50
halfcab 16 19 4.58 0.84 5 4.71 0.00 2 5 3 -1.85 2.47
fingerflip 17 19 2.74 1.33 2 2.71 1.48 1 5 4 0.46 -1.08
sw.no.cmp 18 19 3.05 1.08 3 3.06 1.48 1 5 4 -0.10 -1.25
date* 19 19 18.53 6.70 19 19.00 7.41 1 28 27 -0.69 0.14
place* 20 19 3.26 1.79 3 3.24 2.97 1 6 5 0.07 -1.51
randomized* 21 19 2.00 0.00 2 2.00 0.00 2 2 0 NaN NaN
board* 22 19 1.74 0.45 2 1.76 0.00 1 2 1 -0.99 -1.06
se
ollie 0.00
fs.180 0.22
bs.180 0.26
pop.shove 0.25
fs.shove 0.16
kickflip 0.00
heelflip 0.00
f.180 0.26
f.ollie 0.23
f.shove 0.23
f.fs.shove 0.22
sw.fs.bone 0.21
sw.fs.no.comply 0.23
f.bigspin 0.17
sw.bone 0.21
halfcab 0.19
fingerflip 0.30
sw.no.cmp 0.25
date* 1.54
place* 0.41
randomized* 0.00
board* 0.10
After dissecting the dataset a bit, not much cleaning is needed, since I preprocessed the columns in pandas, especially the date column. Notice that I did some backfilling, which wasn’t strictly necessary, but I didn’t want to lose that row. To avoid losing it, I pushed the date “backwards”, so 2016-07-29 appears twice; a necessary conversion. I also wanted to make different variations of the dataset; we’re going to sidestep that for now. Let’s draw some plots. :)
# where did i skate the most? describe already told us this.
# Why is it decimals and not whole numbers?
ggplot(skate_df, aes(x = place)) + geom_bar(fill = "black", col = "black") +
labs(x = "place", y = "number of times i skated",
title = "Plot of how many times i skated somewhere", caption = "mostly marist") + theme_classic() # theme_classic() overrides the default ggplot theme
The most fun part of the analysis beckons. We take the columns of interest and do some basic math. First, we find the sum of each column (each column represents a trick) and store it in the sum1 variable. Then we divide each sum by the total I would have scored if I had landed the trick every time. That total is 145 (29 rows times a maximum score of 5), so we divide each item by 145 (see the variable prob_trick). Furthermore, we find the standard deviation of each trick using sapply, which applies a function to each of the columns we just selected (see var_trick).
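Here is the arithmetic described above worked through on a tiny example. This is a sketch under stated assumptions: toy_df has 4 made-up rows instead of the real 29, so the best possible total is 4 times 5 = 20 rather than 145, and the names max_score and best_total are introduced only for this illustration.

```r
# Toy stand-in for skate_df_cols; the values are made up for illustration.
toy_df <- data.frame(
  ollie    = c(5, 5, 3, 5),
  halfcab  = c(4, 4, 3, 5),
  kickflip = c(0, 0, 0, 0)
)

max_score  <- 5                           # best score for one trick on one day
best_total <- nrow(toy_df) * max_score    # 4 * 5 = 20 here, 29 * 5 = 145 in the real data

# Share of the best possible total that each trick reached, highest first
prob_trick <- sort(colSums(toy_df) / best_total, decreasing = TRUE)

# Standard deviation of each trick, most consistent first
var_trick <- sort(sapply(toy_df, sd))

print(round(prob_trick, 2))
print(round(var_trick, 2))
```

Note that a low standard deviation alone does not mean a trick is reliable: kickflip is perfectly consistent in this toy data only because it is never landed, so the two rankings need to be read together.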
# select the columns to use later in the analysis
skate_df_cols <- skate_df %>% select(ollie, fs.180, bs.180, pop.shove, fs.shove, kickflip, heelflip, f.180, f.ollie, f.shove, f.fs.shove, sw.fs.bone, sw.fs.no.comply, f.bigspin, sw.bone, halfcab, fingerflip, sw.no.cmp)
# wrapping an assignment in brackets prints the result, instead of calling the
# variable a second time or using a print statement
(sum1 <- skate_df_cols %>% summarise_all(sum)) # find the sum of each trick
(sds <- skate_df_cols %>% summarise_all(sd)) # find the standard deviation of each trick
class(sum1) # what's the class of the resulting data set?
[1] "tbl_df" "tbl" "data.frame"
# This function divides x (the summed score of one trick, e.g. ollie)
# by total, the score i'd have if i landed the trick every time (29 * 5 = 145).
divide <- function(x, total = 145) {
  divide_by(x, total)
}
sum2 <- sapply(sum1, divide)
# prob_trick <- round(sort(sum2, decreasing = TRUE), 2) # another way of doing it
prob_trick <- sum2 %>% sort(decreasing = TRUE) %>% round(2)
# we can't do much with the raw standard deviations, so let's sort and round them as we did for prob_trick
sds2 <- sapply(skate_df_cols, sd)
# var_trick <- round(sort(sds2), 2) # another way of doing it
(var_trick <- sds2 %>% sort() %>% round(2))
 kickflip heelflip ollie fs.shove f.bigspin
0.00 0.00 0.37 0.58 0.71
sw.fs.bone halfcab sw.bone f.fs.shove f.ollie
0.78 0.82 0.90 0.91 0.91
f.shove fs.180 sw.fs.no.comply f.180 sw.no.cmp
0.94 0.95 1.04 1.05 1.07
bs.180 pop.shove fingerflip
1.15 1.17 1.25
med <- sapply(skate_df_cols, median) # calculating the medians again
# med_trick <- sort(med, decreasing = TRUE) # another way of doing it
(med_trick <- med %>% sort(decreasing = TRUE))
 ollie sw.fs.bone sw.bone halfcab f.ollie
5 5 5 5 4
f.shove fs.180 bs.180 sw.fs.no.comply fingerflip
4 3 3 3 3
sw.no.cmp pop.shove f.180 f.fs.shove f.bigspin
3 2 2 2 1
fs.shove kickflip heelflip
0 0 0
prob_trick # we're comparing what we got to see if this can tell me what trick i should have
 ollie sw.fs.bone halfcab sw.bone f.ollie
0.99 0.90 0.88 0.87 0.75
f.shove fs.180 sw.fs.no.comply sw.no.cmp bs.180
0.72 0.65 0.63 0.61 0.59
fingerflip pop.shove f.180 f.fs.shove f.bigspin
0.54 0.43 0.41 0.30 0.14
fs.shove kickflip heelflip
0.05 0.00 0.00
# more confidence in games of S.K.A.T.E or H.O.R.S.E.
# So from this, this is the arrangement we're getting from probabilities but the median
# gives a tie, the plots could give us a clues how to proceed or confidence intervals
trick_line_up <- c('ollie','sw.fs.bone','halfcab','sw.bone','f.ollie','f.shove','fs.180','bs.180','sw.fs.no.comply','sw.no.cmp','bs.180','fingerflip','pop.shove','f.180','f.fs.shove','f.bigspin','fs.shove')We can’t use multiplot function from the scater package by Davis McCarthy and the grid.arrange() function from gridExtra shrinks the plots and we won’t be able to see the plots nicely. This one by one, approach is not good for comparison. If you have a better way of doing this holler! Before, i used par which did it very well.
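One option for side-by-side comparison is to reshape the data to long format and let ggplot2 draw one panel per trick with facet_wrap(). This is a minimal sketch on made-up numbers. The toy data frame and the names long, meds and landed are hypothetical; the real skate_df would need its date column parsed first.

```r
library(ggplot2)

# Toy data standing in for skate_df (made-up values)
toy <- data.frame(
  date   = as.Date("2017-08-01") + 0:4,
  ollie  = c(5, 4, 5, 5, 5),
  fs.180 = c(2, 3, 1, 4, 3),
  bs.180 = c(1, 3, 4, 2, 3)
)

# Stack the trick columns into long format: one row per date and trick
tricks <- c("ollie", "fs.180", "bs.180")
long <- data.frame(
  date   = rep(toy$date, times = length(tricks)),
  trick  = rep(tricks, each = nrow(toy)),
  landed = unlist(toy[tricks], use.names = FALSE)
)

# Per-trick medians, used for the horizontal reference line in each panel
meds <- aggregate(landed ~ trick, data = long, FUN = median)

p <- ggplot(long, aes(x = date, y = landed)) +
  geom_point() +
  geom_line() +
  geom_hline(data = meds, aes(yintercept = landed), colour = "red") +
  facet_wrap(~ trick, ncol = 1) +
  labs(x = "day", y = "Frequency") +
  theme_classic()
print(p)
```

Because every panel shares the same axes, the tricks can be compared directly, and the figure height can be raised in the chunk options if the panels look cramped.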
# trying to make a plot with summary stats
#plot(skate$fs180, main = "Freq of Fs180 as days passed", xlab = "day",ylab = "Frequency",type = "both")
ggplot(skate_df, aes(x = ymd(date), y = ollie)) + geom_point() + geom_line() +
  geom_hline(data = skate_df, aes(yintercept = median(ollie), color = "red")) +
  labs(x = "day", y = "Frequency", title = "Freq of ollie as days passed", caption = "More confident to start with Ollie") +
  scale_color_manual(name = "median", values = "red") + theme_classic()
ggplot(skate_df, aes(x = ymd(date), y = fs.180)) + geom_point() + geom_line() +
  geom_hline(data = skate_df, aes(yintercept = median(fs.180), col = "red")) +
  labs(x = "day", y = "Frequency", title = "Freq of fs.180 as days passed", caption = "fs.180 is shaky") +
  scale_color_manual(name = "median", values = "red") + theme_classic()
ggplot(skate_df, aes(x = ymd(date), y = bs.180)) + geom_point() + geom_line() +
  geom_hline(data = skate_df, aes(yintercept = median(bs.180), col = "red")) +
  labs(x = "day", y = "Frequency", title = "Freq of bs.180 as days passed", caption = "bs.180 is a bit similar, the median is equal") +
  scale_color_manual(name = "median", values = "red") + theme_classic()
# we can use grid.arrange in separate code chunks
#-----------------------------------------------------------------------------------------
# it's time we made a function
# Instead of writing the same code over and over again, I made a function that only
# needs y_axis_label (the trick's column, plotted on the y axis), y_intercept_med (the
# same column, used to draw the median line), title_of_graph and caption_of_graph
plot_gg_tricks <- function(y_axis_label, y_intercept_med, title_of_graph = readline(), caption_of_graph = readline()){
  ggplot(skate_df, aes(x = ymd(date), y = y_axis_label)) + geom_point() + geom_line() +
    geom_hline(data = skate_df, aes(yintercept = median(y_intercept_med), col = "red")) +
    labs(x = "day", y = "Frequency", title = title_of_graph, caption = caption_of_graph) +
    scale_color_manual(name = "median", values = "red") + theme_classic()
}
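A variation on the helper above is to pass the trick as a column name, so the same column does not have to be typed twice and the title can be built automatically. This is only a sketch: plot_trick and the toy data frame are hypothetical, and the .data pronoun inside aes() assumes ggplot2 3.0.0 or later, which is newer than the version listed in the session info.

```r
library(ggplot2)

# Toy data standing in for skate_df (made-up values)
toy <- data.frame(
  date      = as.Date("2017-08-01") + 0:4,
  ollie     = c(5, 4, 5, 5, 5),
  pop.shove = c(0, 1, 2, 2, 3)
)

# Hypothetical helper: `trick` is a column name given as a string
plot_trick <- function(df, trick, caption = NULL) {
  ggplot(df, aes(x = date, y = .data[[trick]])) +
    geom_point() +
    geom_line() +
    # the median is computed once from the chosen column
    geom_hline(yintercept = median(df[[trick]]), colour = "red") +
    labs(x = "day", y = "Frequency",
         title = paste("Freq of", trick, "as days passed"),
         caption = caption) +
    theme_classic()
}

# One plot per trick, without repeating the call by hand
tricks <- c("ollie", "pop.shove")
plots <- setNames(lapply(tricks, function(tr) plot_trick(toy, tr)), tricks)
print(plots[["pop.shove"]])
```

Captions can still be supplied per trick through the caption argument when each plot needs its own note.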
# pop.shove
plot_gg_tricks(y_axis_label = skate_df$pop.shove, y_intercept_med = skate_df$pop.shove, "Freq of pop.shove as days passed", "pop.shove has come a long way, not so confident with this.")
# fs.shove
plot_gg_tricks(y_axis_label = skate_df$fs.shove, y_intercept_med = skate_df$fs.shove, "Freq of fs.shove as days passed", "fs.shove: still learning it, notice the many zeros at the start.")
# kickflip
plot_gg_tricks(y_axis_label = skate_df$kickflip, y_intercept_med = skate_df$kickflip, "Freq of kickflip as days passed", "kickflip: didn't do any")
# heelflip
plot_gg_tricks(y_axis_label = skate_df$heelflip, y_intercept_med = skate_df$heelflip, "Freq of heelflip as days passed", "heelflip: didn't do any")
# f.180
plot_gg_tricks(y_axis_label = skate_df$f.180, y_intercept_med = skate_df$f.180, "Freq of f.180 as days passed", "f.180 is confusing, summary statistics to the rescue.")
# f.ollie
plot_gg_tricks(y_axis_label = skate_df$f.ollie, y_intercept_med = skate_df$f.ollie, "Freq of f.ollie as days passed", "f.ollie: I'm improving at this, one of my top 5 tricks")
# f.shove
plot_gg_tricks(y_axis_label = skate_df$f.shove, y_intercept_med = skate_df$f.shove, "Freq of f.shove as days passed", "f.shove looking good.")
# f.fs.shove
plot_gg_tricks(y_axis_label = skate_df$f.fs.shove, y_intercept_med = skate_df$f.fs.shove, "Freq of f.fs.shove as days passed", "f.fs.shove: learning this one as well, never done it more than thrice.")
# sw.fs.bone
plot_gg_tricks(y_axis_label = skate_df$sw.fs.bone, y_intercept_med = skate_df$sw.fs.bone, "Freq of sw.fs.bone as days passed", "sw.fs.bone looks like a top 5 trick.")
# sw.fs.no.comply
plot_gg_tricks(y_axis_label = skate_df$sw.fs.no.comply, y_intercept_med = skate_df$sw.fs.no.comply, "Freq of sw.fs.no.comply as days passed", "sw.fs.no.comply: seems that I can improve this")
# f.bigspin
plot_gg_tricks(y_axis_label = skate_df$f.bigspin, y_intercept_med = skate_df$f.bigspin, "Freq of f.bigspin as days passed", "f.bigspin: very bad at these, max times done is 2.")
# sw.bone
plot_gg_tricks(y_axis_label = skate_df$sw.bone, y_intercept_med = skate_df$sw.bone, "Freq of sw.bone as days passed", "sw.bone: another good trick to start with, look at those 5's")
# halfcab
plot_gg_tricks(y_axis_label = skate_df$halfcab, y_intercept_med = skate_df$halfcab, "Freq of halfcab as days passed", "halfcab started shaky but is stabilizing slowly, need more data.")
# fingerflip
plot_gg_tricks(y_axis_label = skate_df$fingerflip, y_intercept_med = skate_df$fingerflip, "Freq of fingerflip as days passed", "fingerflip: I learned this from playing a videogame and did it.")
# sw.no.cmp
plot_gg_tricks(y_axis_label = skate_df$sw.no.cmp, y_intercept_med = skate_df$sw.no.cmp, "Freq of sw.no.cmp as days passed", "sw.no.cmp: looks like I'm better off doing this first than the fingerflip.")
The data doesn’t meet the requirements of the central limit theorem, but we can try to make confidence intervals with bootstrapping (resampling with replacement from the sample we already have) to see what we get, especially where there were ties about which trick to do next. Let me try with ggplot2. We’ll use summary functions from the Hmisc package, specifically smean.cl.boot and smedian.hilow. If you’re not acquainted with these functions, type ?smean.cl.boot and ?smedian.hilow in the console to find out what they do. The green line represents the plausible range of means we would expect if we repeated this study, estimated from 1000 bootstrap resamples; for sw.fs.bone (switch frontside bone), for example, it runs between 4 and 5. The green dot represents the mean. This is so cool!
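To make the bootstrap idea concrete, here is a minimal base R sketch of a percentile bootstrap interval for a mean. The scores are made up, and this hand-rolled version only illustrates the general technique; it is not a copy of the Hmisc implementation, and the names scores, n_boot and boot_means are hypothetical.

```r
set.seed(42)  # make the resampling reproducible

# Made-up scores for one trick over 10 sessions (0-5 landed per session)
scores <- c(5, 4, 5, 3, 5, 5, 2, 5, 4, 5)

# Resample the sessions with replacement and record the mean each time
n_boot <- 1000
boot_means <- replicate(n_boot, mean(sample(scores, replace = TRUE)))

# Percentile interval: the middle 95% of the bootstrap means
ci <- quantile(boot_means, probs = c(0.025, 0.975))
c(mean = mean(scores), lower = unname(ci[1]), upper = unname(ci[2]))
```

A narrow interval sitting high on the 0-5 scale suggests a trick that is landed consistently, while a wide one suggests the data cannot yet separate it from its neighbours in the line-up.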
# remember we made this, let's see if it still holds
(trick_line_up)
 [1] "ollie"           "sw.fs.bone"      "halfcab"         "sw.bone"
 [5] "f.ollie"         "f.shove"         "fs.180"          "bs.180"
 [9] "sw.fs.no.comply" "sw.no.cmp"       "bs.180"          "fingerflip"
[13] "pop.shove"       "f.180"           "f.fs.shove"      "f.bigspin"
[17] "fs.shove"
# this is a graphical statistical visualization
# according to the documentation, smean.cl.boot (wrapped by ggplot2 as mean_cl_boot) is a
# very fast implementation of the basic nonparametric bootstrap for obtaining confidence
# limits for the population mean without assuming normality
# I don't recommend this approach since I'm getting errors about geom_pointrange not being
# specified. If you have a better way of doing this, let me know. The fastest way to reach
# me is Twitter or YouTube
plot_gg_tricks2 <- function(y_axis_label, title_of_graph = readline(), caption_of_graph = readline()){
  ggplot(skate_df, aes(x = ymd(date), y = y_axis_label)) + geom_point() + geom_line() +
    stat_summary(fun.data = mean_cl_boot, col = "green") +
    labs(x = "day", y = "Frequency", title = title_of_graph, caption = caption_of_graph) +
    theme_classic()
}
# Less repetition and typing, I suppose
# You know the drill; this time it is based on the tricks that we found are good to
# start with according to median and probability
# CI means confidence interval; I'm using the default level of 0.95, bootstrapped 1000 times
# ollie
plot_gg_tricks2(y_axis_label = skate_df$ollie, title_of_graph = 'Freq of ollie as days passed with mean CI', caption_of_graph = "start with ollie, got it.")
# sw.fs.bone
plot_gg_tricks2(y_axis_label = skate_df$sw.fs.bone, title_of_graph = 'Freq of sw.fs.bone as days passed with mean CI', caption_of_graph = "Then do a sw.fs.bone.")
# sw.bone
plot_gg_tricks2(y_axis_label = skate_df$sw.bone, title_of_graph = 'Freq of sw.bone as days passed with mean CI', caption_of_graph = "so we have a tie, I'm more confident with sw.fs.bone.")
# f.ollie
plot_gg_tricks2(y_axis_label = skate_df$f.ollie, title_of_graph = 'Freq of f.ollie as days passed with mean CI', caption_of_graph = "I had a feeling this is one of my top 5.")
# f.shove
plot_gg_tricks2(y_axis_label = skate_df$f.shove, title_of_graph = 'Freq of f.shove as days passed with mean CI', caption_of_graph = "yup, this one should be next after f.ollie.")
# fs.180
plot_gg_tricks2(y_axis_label = skate_df$fs.180, title_of_graph = 'Freq of fs.180 as days passed with mean CI', caption_of_graph = "I learned this first, so naturally, I do it first.")
# bs.180
plot_gg_tricks2(y_axis_label = skate_df$bs.180, title_of_graph = 'Freq of bs.180 as days passed with mean CI', caption_of_graph = "either way sort of trick, can do between 2/4.")
# sw.fs.no.comply
plot_gg_tricks2(y_axis_label = skate_df$sw.fs.no.comply, title_of_graph = 'Freq of sw.fs.no.comply as days passed with mean CI', caption_of_graph = "ties need more data for this one.")
# sw.no.cmp
plot_gg_tricks2(y_axis_label = skate_df$sw.no.cmp, title_of_graph = 'Freq of sw.no.cmp as days passed with mean CI', caption_of_graph = "ties need more data for this one.")
# fingerflip
plot_gg_tricks2(y_axis_label = skate_df$fingerflip, title_of_graph = 'Freq of fingerflip as days passed with mean CI', caption_of_graph = "needs more data.")
# pop.shove
plot_gg_tricks2(y_axis_label = skate_df$pop.shove, title_of_graph = 'Freq of pop.shove as days passed with mean CI', caption_of_graph = "pop shove is better than my since the limits are better numbers.")
# f.180
plot_gg_tricks2(y_axis_label = skate_df$f.180, title_of_graph = 'Freq of f.180 as days passed with mean CI', caption_of_graph = "you can do it twice with confidence.")
# f.fs.shove
plot_gg_tricks2(y_axis_label = skate_df$f.fs.shove, title_of_graph = 'Freq of f.fs.shove as days passed with mean CI', caption_of_graph = "this trick belongs here. I concur with this.")
# f.bigspin
plot_gg_tricks2(y_axis_label = skate_df$f.bigspin, title_of_graph = 'Freq of f.bigspin as days passed with mean CI', caption_of_graph = "Needs more training.")
# fs.shove
plot_gg_tricks2(y_axis_label = skate_df$fs.shove, title_of_graph = 'Freq of fs.shove as days passed with mean CI', caption_of_graph = "yeah, this one should be the last. Those zeros prove it.")
# for the tie breakers between tricks, use smedian.hilow, which is a quantile calculation
# we're expecting that less variability is better
# read the docs about smedian.hilow()
# I wanted to do this test instead, check it out with ?wilcox.test
print("ollie"); smedian.hilow(skate_df$ollie) #1
[1] "ollie"
Median  Lower  Upper
   5.0    4.4    5.0
print("sw.fs.bone"); smedian.hilow(skate_df$sw.fs.bone) #2
[1] "sw.fs.bone"
Median  Lower  Upper
   5.0    2.7    5.0
print("halfcab"); smedian.hilow(skate_df$halfcab) #3
[1] "halfcab"
Median  Lower  Upper
   5.0    2.7    5.0
print("sw.bone"); smedian.hilow(skate_df$sw.bone) #4
[1] "sw.bone"
Median  Lower  Upper
     5      2      5
print("f.ollie"); smedian.hilow(skate_df$f.ollie) #5
[1] "f.ollie"
Median  Lower  Upper
     4      2      5
print("f.shove"); smedian.hilow(skate_df$f.shove) #6
[1] "f.shove"
Median  Lower  Upper
     4      2      5
print("fs.180"); smedian.hilow(skate_df$fs.180) #7
[1] "fs.180"
Median  Lower  Upper
   3.0    1.7    5.0
print("bs.180"); smedian.hilow(skate_df$bs.180) #8
[1] "bs.180"
Median  Lower  Upper
     3      1      5
print("sw.fs.no.comply"); smedian.hilow(skate_df$sw.fs.no.comply) #8
[1] "sw.fs.no.comply"
Median  Lower  Upper
   3.0    1.7    5.0
print("sw.no.cmp"); smedian.hilow(skate_df$sw.no.cmp) #9
[1] "sw.no.cmp"
Median  Lower  Upper
     3      1      5
print("fingerflip"); smedian.hilow(skate_df$fingerflip) #10
[1] "fingerflip"
Median  Lower  Upper
     3      1      5
print("pop.shove"); smedian.hilow(skate_df$pop.shove) #12
[1] "pop.shove"
Median  Lower  Upper
     2      0      4
print("f.180"); smedian.hilow(skate_df$f.180) #11
[1] "f.180"
Median  Lower  Upper
   2.0    0.0    3.3
print("f.fs.shove"); smedian.hilow(skate_df$f.fs.shove) #13
[1] "f.fs.shove"
Median  Lower  Upper
     2      0      3
print("f.bigspin"); smedian.hilow(skate_df$f.bigspin) #14
[1] "f.bigspin"
Median  Lower  Upper
     1      0      2
print("fs.shove"); smedian.hilow(skate_df$fs.shove) #15
[1] "fs.shove"
Median  Lower  Upper
     0      0      2
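The Median, Lower and Upper values printed above are sample quantiles. As a rough sketch of what smedian.hilow reports at its default conf.int of 0.95, the same three numbers can be obtained with base R's quantile(). The scores below are made up, and the names conf_int and res are hypothetical.

```r
# Made-up scores for one trick over 10 sessions (0-5 landed per session)
scores <- c(5, 4, 5, 3, 5, 5, 2, 5, 4, 5)

conf_int <- 0.95
# The median plus the outer quantiles that bracket the middle 95% of the sample
probs <- c(0.5, (1 - conf_int) / 2, (1 + conf_int) / 2)

res <- quantile(scores, probs = probs)
names(res) <- c("Median", "Lower", "Upper")
res
```

With only a few sessions the outer quantiles are driven by one or two bad days, which is why several tricks above share the same Lower and Upper values and the ties are hard to break.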
One interesting thing I should point out is that I’m not confident with the following tricks: fakie bigspin (f.bigspin), frontside shove (fs.shove) and sometimes fakie frontside shove (f.fs.shove). Our tests make this evident. We used data to confirm this! That makes me excited. After inspecting everything, this is how I think my line-up of tricks should be, with some freedom to alternate: ollie, sw.fs.bone, halfcab, sw.bone, f.ollie, f.shove, fs.180, bs.180, sw.fs.no.comply, sw.no.cmp, fingerflip, pop.shove, f.180, f.fs.shove, f.bigspin, fs.shove. I’ll proceed to test this line-up under two constraints: flatground only at Marist Lane, and in a game of S.K.A.T.E with a friend. I’m thinking of doing tests that involve p-values soon.
R Core Team (2016). R: A language and environment for statistical computing. R Foundation for Statistical Computing, Vienna, Austria. URL https://www.R-project.org/.
Revelle, W. (2017). psych: Procedures for Personality and Psychological Research. Northwestern University, Evanston, Illinois, USA. R package version 1.7.5. https://CRAN.R-project.org/package=psych
H. Wickham. ggplot2: Elegant Graphics for Data Analysis. Springer-Verlag New York, 2009.
Stefan Milton Bache and Hadley Wickham (2014). magrittr: A Forward-Pipe Operator for R. R package version 1.5. https://CRAN.R-project.org/package=magrittr.
Hadley Wickham, Romain Francois, Lionel Henry and Kirill Müller (2017). dplyr: A Grammar of Data Manipulation. R package version 0.7.2. https://CRAN.R-project.org/package=dplyr.
Garrett Grolemund, Hadley Wickham (2011). Dates and Times Made Easy with lubridate. Journal of Statistical Software, 40(3), 1-25. URL http://www.jstatsoft.org/v40/i03/.
Hadley Wickham and Lionel Henry (2017). tidyr: Easily Tidy Data with ‘spread()’ and ‘gather()’ Functions. R package version 0.7.0. https://CRAN.R-project.org/package=tidyr
Frank E Harrell Jr, with contributions from Charles Dupont and many others. (2017). Hmisc: Harrell Miscellaneous. R package version 4.0-3. https://CRAN.R-project.org/package=Hmisc
Baptiste Auguie (2016). gridExtra: Miscellaneous Functions for “Grid” Graphics. R package version 2.2.1. https://CRAN.R-project.org/package=gridExtra
https://www.datacamp.com/courses/data-visualization-with-ggplot2-1
https://www.datacamp.com/courses/data-visualization-with-ggplot2-2
https://stackoverflow.com/questions/26890354/lineplot-legend-abline-ggplot
https://stackoverflow.com/questions/5226807/multiple-graphs-in-one-canvas-using-ggplot2
sessionInfo()
R version 3.3.2 (2016-10-31)
Platform: x86_64-apple-darwin13.4.0 (64-bit)
Running under: OS X El Capitan 10.11.6
locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] gridExtra_2.2.1 Hmisc_4.0-3 Formula_1.2-2 survival_2.41-3 lattice_0.20-35
[6] tidyr_0.7.1 lubridate_1.6.0 dplyr_0.7.2 magrittr_1.5 ggplot2_2.2.1
[11] psych_1.7.5
loaded via a namespace (and not attached):
[1] Rcpp_0.12.12 RColorBrewer_1.1-2 plyr_1.8.4 bindr_0.1
[5] base64enc_0.1-3 tools_3.3.2 rpart_4.1-11 digest_0.6.12
[9] checkmate_1.8.3 htmlTable_1.9 jsonlite_1.5 evaluate_0.10.1
[13] tibble_1.3.4 gtable_0.2.0 nlme_3.1-131 pkgconfig_2.0.1
[17] rlang_0.1.2 Matrix_1.2-11 yaml_2.1.14 parallel_3.3.2
[21] bindrcpp_0.2 cluster_2.0.6 stringr_1.2.0 knitr_1.17
[25] htmlwidgets_0.9 nnet_7.3-12 rprojroot_1.2 grid_3.3.2
[29] data.table_1.10.4 glue_1.1.1 R6_2.2.2 foreign_0.8-69
[33] rmarkdown_1.6 latticeExtra_0.6-28 purrr_0.2.3 backports_1.1.0
[37] scales_0.5.0 htmltools_0.3.6 splines_3.3.2 assertthat_0.2.0
[41] mnormt_1.5-5 colorspace_1.3-2 labeling_0.3 stringi_1.1.5
[45] acepack_1.4.1 lazyeval_0.2.0 munsell_0.4.3