Data are extracted from the Track Your Turns website at Mt. Bachelor. For this particular pass there are 2700 total runs.
Let’s look at a sample of the ski runs data:
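# Draw 8 random row indices and sort them so the sample prints in row order
print_sample <- sort(sample(seq_len(nrow(ski_runs)), 8))
# Show columns 2 through 8 for the sampled rows
print(ski_runs[print_sample, 2:8])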
##             chair vertical_feet vertical_meters year month day time_of_day
## 186     Northwest          2377             724 2006     3  30    11:44:04
## 208     Northwest          2377             724 2006    12   2    11:50:00
## 378       Outback          1780             542 2007     3  25    10:21:00
## 385        Summit          1709             520 2007     3  29    09:31:00
## 758   Pine Marten          1367             416 2008    12  28    12:01:00
## 774       Sunrise           808             246 2008    12  29    12:07:00
## 1301    Northwest          2377             724 2010     1  31    11:24:17
## 2137    Northwest          2377             724 2013     1   5    12:08:37
From these data we add a few extra columns as a starting set of features, such as weekday, weekend, and holiday indicators, to increase the feature set available for plotting.
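As a rough sketch of how such indicator columns might be derived (the code actually used is not shown here), the snippet below builds a date from the year, month, and day columns and then flags weekends and holidays. The toy data frame `runs` and the `holidays` vector are made up for illustration; a real holiday calendar would have to come from elsewhere.

```r
# Toy rows mirroring the year/month/day columns in the sample above (illustrative only)
runs <- data.frame(
  chair = c("Northwest", "Outback", "Pine Marten", "Sunrise"),
  year  = c(2006L, 2007L, 2008L, 2008L),
  month = c(12L, 3L, 12L, 12L),
  day   = c(2L, 25L, 28L, 29L)
)

# Assemble a Date column from the separate year, month, and day fields
runs$date <- as.Date(sprintf("%04d-%02d-%02d", runs$year, runs$month, runs$day))

# Day of week as an integer, 0 = Sunday through 6 = Saturday
runs$wday <- as.POSIXlt(runs$date)$wday

# Weekend indicator: Saturday or Sunday
runs$weekend <- runs$wday %in% c(0, 6)

# Hypothetical holiday period (an assumption, not the author's list)
holidays <- seq(as.Date("2008-12-24"), as.Date("2009-01-01"), by = "day")
runs$holiday <- runs$date %in% holidays

print(runs)
```

Keeping `wday` as an integer with Sunday as 0 matches the `wday` values that appear in the printed tables in this write-up.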
We can slice and dice the data many ways. For this exploratory part we’ll look at a couple of them just to get “dirty” with the data.
One way, as below, is to sum vertical feet of runs to get total vertical feet skied per day.
daily_vert<-rowsum(ski_runs$vertical_feet, ski_runs$date)
The plot below shows the daily total vertical as a function of “Ski Season Week” (i.e. weeks since Nov 1). The one point at week 40 comes from a high snow year, when the area opened on July 4th.
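One way a “weeks since Nov 1” index could be computed is sketched below. The helper `season_week()` is hypothetical. It counts whole weeks since the most recent November 1 and starts numbering at 1, which reproduces the `season_week` values printed in the Sunrise table further down for the dates checked. The week numbers used in the plot may have been computed differently.

```r
# Hypothetical helper: week of the ski season, with Nov 1 falling in week 1
season_week <- function(date) {
  lt <- as.POSIXlt(date)
  yr <- lt$year + 1900          # POSIXlt stores years since 1900
  mon <- lt$mon + 1             # POSIXlt months are 0-based
  # Dates in Nov or Dec belong to the season starting that year,
  # everything else to the season that started the previous November
  start_year <- ifelse(mon >= 11, yr, yr - 1)
  season_start <- as.Date(sprintf("%d-11-01", as.integer(start_year)))
  days_in <- as.numeric(date - season_start)
  as.integer(days_in %/% 7 + 1)
}

dates <- as.Date(c("2005-12-10", "2005-12-28", "2006-01-28", "2006-02-25"))
print(data.frame(date = dates, season_week = season_week(dates)))
```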
Another way to look at the data is as a histogram of daily total vertical for all ski days.
The mean vertical is 19691 feet over 211 days. The most frequent daily total is about 22000 feet / day.
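For readers who want to reproduce this kind of summary, here is a minimal sketch on made-up data: it sums vertical feet per day with `rowsum()`, takes the mean over days, and reads the most populated histogram bin off `hist()`. The data frame `toy_runs` and the 1000-foot bin width are assumptions for illustration, so the numbers it prints are not the ones quoted above.

```r
# Toy runs over four ski days (illustrative values only)
toy_runs <- data.frame(
  date = c("2006-03-30", "2006-03-30",
           "2006-12-02", "2006-12-02", "2006-12-02",
           "2007-03-25",
           "2007-03-29", "2007-03-29"),
  vertical_feet = c(2377, 1367, 2377, 1780, 808, 1709, 2377, 1367)
)

# Total vertical per day: a one-column matrix with dates as row names
daily_vert <- rowsum(toy_runs$vertical_feet, toy_runs$date)

n_days    <- nrow(daily_vert)
mean_vert <- mean(daily_vert[, 1])

# Bin the daily totals without drawing, then pick the fullest bin
h <- hist(daily_vert[, 1], breaks = seq(0, 6000, by = 1000), plot = FALSE)
modal_bin <- h$mids[which.max(h$counts)]

cat("days:", n_days, " mean:", mean_vert, " modal bin midpoint:", modal_bin, "\n")
```

The “most frequent” value depends on the bin width, so it is worth trying a few values of `breaks` before reading much into it.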
This is an interesting way to look at individual ski days. Vertical feet for the day are encoded in color, with red representing the highest daily vert.
This is not a particularly useful way of looking at the data since it doesn’t seem to provide much insight.
Here we begin to look at the data in somewhat finer detail. For instance we can look at the total vertical feet skied on a particular chairlift per ski day.
For example, for the Pine Marten and Sunrise chairlifts:
Pine_runs <- ski_runs[ski_runs$chair=="Pine Marten",]
Sunrise_runs <- ski_runs[ski_runs$chair=="Sunrise",]
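The per-chair daily totals described above could also be computed for every chair in one pass rather than one subset at a time. The sketch below uses `aggregate()` on a made-up data frame; `toy_runs` and its values are assumptions for illustration only.

```r
# Toy runs on two days for two chairs (illustrative values only)
toy_runs <- data.frame(
  chair = c("Pine Marten", "Pine Marten", "Sunrise",
            "Pine Marten", "Sunrise", "Sunrise"),
  date  = c("2008-12-28", "2008-12-28", "2008-12-28",
            "2008-12-29", "2008-12-29", "2008-12-29"),
  vertical_feet = c(1367, 1367, 808, 1367, 808, 808)
)

# Sum vertical feet for every chair/date combination
chair_daily <- aggregate(vertical_feet ~ chair + date, data = toy_runs, FUN = sum)

# Pull out a single chair when needed
pine_daily <- chair_daily[chair_daily$chair == "Pine Marten", ]
print(pine_daily)
```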
As a last version, let’s look at all of the Sunrise totals. Rather than looking at just the raw numbers, let’s look at the z-statistic of the totals to find outliers. Now this is starting to look like something useful…
##            total_vertical       date week season_week wday year count
## 2005-12-10            808 2005-12-10   50           6    6 2005     1
## 2005-12-28           3232 2005-12-28   52           9    3 2005     2
## 2005-12-30           4040 2005-12-30   52           9    5 2005     3
## 2006-01-28            808 2006-01-28    4          13    6 2006     4
## 2006-02-20            808 2006-02-20    8          16    1 2006     5
## 2006-02-25            808 2006-02-25    8          17    6 2006     6
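The z-statistic mentioned above is just each daily total's distance from the mean, measured in standard deviations. A minimal sketch using the six Sunrise totals printed above follows. The cutoff `z_cut` is an arbitrary value chosen for illustration, not one taken from the analysis, and with only six rows the result says little about the full data set.

```r
# The six Sunrise daily totals shown in the table above
sunrise <- data.frame(
  date = as.Date(c("2005-12-10", "2005-12-28", "2005-12-30",
                   "2006-01-28", "2006-02-20", "2006-02-25")),
  total_vertical = c(808, 3232, 4040, 808, 808, 808)
)

# z-score: (value - mean) / sample standard deviation
sunrise$z <- (sunrise$total_vertical - mean(sunrise$total_vertical)) /
  sd(sunrise$total_vertical)

# Flag days whose |z| exceeds an arbitrary illustrative cutoff
z_cut <- 1.5
flagged <- sunrise$date[abs(sunrise$z) > z_cut]

print(sunrise)
print(flagged)
```

Because both the mean and the standard deviation are themselves pulled around by extreme days, a long-tailed distribution makes z-scores a fairly blunt outlier detector.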
We can see, however, that the distribution has a very long tail.
It seems like we might be getting somewhere. This analysis highlights a couple of anomalous days in the 2008-2009 season.
We can also look for patterns in chairlift usage by looking for trends in which chairs are ridden…
First, let’s just look at the number of rides on all chairs.
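Counting rides per chair is a one-liner with `table()`. The sketch below uses a made-up `toy` data frame (an assumption for illustration). Converting an unnamed table to a data frame yields columns called `Var1` and `Freq`, the same names that appear in the printed output further down.

```r
# Toy run log with one row per chair ride (illustrative only)
toy <- data.frame(
  chair = c("Northwest", "Pine Marten", "Northwest",
            "Summit", "Pine Marten", "Northwest")
)

# Tabulate rides per chair and convert to a data frame
ride_counts <- as.data.frame(table(toy$chair))
print(ride_counts)
```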
We can use this to test each day against this pattern…
## [1] 17.53247
##          Var1 Freq normalized_runs run_sq_dev average_runs   ski_date
## 1   Northwest    2           0.614   7.001316        3.260 2005-12-09
## 2     Outback    0           0.000   4.787344        2.188 2005-12-09
## 3 Pine Marten    6           1.644   4.020025        3.649 2005-12-09
## 4     Rainbow    0           0.000   0.962361        0.981 2005-12-09
## 5         Red    0           0.000   0.004225        0.065 2005-12-09
## 6    Skyliner    1           0.474   2.676496        2.110 2005-12-09
##            run_sq_dev       date week season_day wday year season
## 2005-12-09   28.63136 2005-12-09   49          9    5 2005  05-06
## 2005-12-10   28.91979 2005-12-10   50         10    6 2005  05-06
## 2005-12-28  141.40010 2005-12-28   52         28    3 2005  05-06
## 2005-12-29   42.01020 2005-12-29   52         29    4 2005  05-06
## 2005-12-30   33.95522 2005-12-30   52         30    5 2005  05-06
## 2006-01-01   37.20077 2006-01-01    1         32    0 2006  05-06
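The printed columns are consistent with a simple scoring rule: for each chair, `run_sq_dev` is the squared difference between that day's `normalized_runs` and the chair's `average_runs`, and the daily score is the sum of those squared deviations over all chairs. The sketch below checks this against the printed values for 2005-12-09. How `normalized_runs` itself is derived from the raw counts is not shown, so those values are copied from the output as-is.

```r
# Values for 2005-12-09 copied from the printed run list
day1 <- data.frame(
  chair = c("Northwest", "Outback", "Pine Marten", "Rainbow", "Red",
            "Skyliner", "Summit", "Sunrise", "Sunshine"),
  normalized_runs = c(0.614, 0, 1.644, 0, 0, 0.474, 1.276, 0, 0),
  average_runs    = c(3.260, 2.188, 3.649, 0.981, 0.065,
                      2.110, 2.351, 2.831, 0.097),
  ski_date = "2005-12-09"
)

# Squared deviation of each chair from its long-run average
day1$run_sq_dev <- (day1$normalized_runs - day1$average_runs)^2

# Sum over chairs to get one score per ski day
daily_dev <- rowsum(day1$run_sq_dev, day1$ski_date)
print(daily_dev)
```

A large daily score means the mix of chairs ridden that day was far from the usual pattern, which is what makes days such as 2005-12-28 stand out in the table above.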
Let’s explore graphically how frequently chairlifts are used throughout the season.
## [1] "Run list"
##           Var1 Freq normalized_runs run_sq_dev average_runs   ski_date
## 1    Northwest    2           0.614   7.001316        3.260 2005-12-09
## 2      Outback    0           0.000   4.787344        2.188 2005-12-09
## 3  Pine Marten    6           1.644   4.020025        3.649 2005-12-09
## 4      Rainbow    0           0.000   0.962361        0.981 2005-12-09
## 5          Red    0           0.000   0.004225        0.065 2005-12-09
## 6     Skyliner    1           0.474   2.676496        2.110 2005-12-09
## 7       Summit    3           1.276   1.155625        2.351 2005-12-09
## 8      Sunrise    0           0.000   8.014561        2.831 2005-12-09
## 9     Sunshine    0           0.000   0.009409        0.097 2005-12-09
## 10   Northwest    3           0.920   5.475600        3.260 2005-12-10
## 11     Outback    5           2.285   0.009409        2.188 2005-12-10
## 12 Pine Marten    2           0.548   9.616201        3.649 2005-12-10
##    season week season_day wday year
## 1   05-06   49         39    5 2005
## 2   05-06   49         39    5 2005
## 3   05-06   49         39    5 2005
## 4   05-06   49         39    5 2005
## 5   05-06   49         39    5 2005
## 6   05-06   49         39    5 2005
## 7   05-06   49         39    5 2005
## 8   05-06   49         39    5 2005
## 9   05-06   49         39    5 2005
## 10  05-06   50         40    6 2005
## 11  05-06   50         40    6 2005
## 12  05-06   50         40    6 2005
Let’s zoom in on a couple of seasons for fun.
As an alternative way to look at things: