The hflights dataset contains all flights departing from Houston airports IAH (George Bush Intercontinental) and HOU (Houston Hobby) in 2011. The data comes from the Research and Innovative Technology Administration at the Bureau of Transportation Statistics. These are the variables or columns in the dataset.
Variables
We can take our first look at the hflights dataset with R’s head() and tail() functions. We can follow up by looking at the structure and summary of the data with str() and summary(). Notice the mostly meaningless summary information on numeric fields that are really labels or flags, such as Year, Month, FlightNum, Cancelled and Diverted.
library(hflights)
head(hflights)
## Year Month DayofMonth DayOfWeek DepTime ArrTime UniqueCarrier
## 5424 2011 1 1 6 1400 1500 AA
## 5425 2011 1 2 7 1401 1501 AA
## 5426 2011 1 3 1 1352 1502 AA
## 5427 2011 1 4 2 1403 1513 AA
## 5428 2011 1 5 3 1405 1507 AA
## 5429 2011 1 6 4 1359 1503 AA
## FlightNum TailNum ActualElapsedTime AirTime ArrDelay DepDelay Origin
## 5424 428 N576AA 60 40 -10 0 IAH
## 5425 428 N557AA 60 45 -9 1 IAH
## 5426 428 N541AA 70 48 -8 -8 IAH
## 5427 428 N403AA 70 39 3 3 IAH
## 5428 428 N492AA 62 44 -3 5 IAH
## 5429 428 N262AA 64 45 -7 -1 IAH
## Dest Distance TaxiIn TaxiOut Cancelled CancellationCode Diverted
## 5424 DFW 224 7 13 0 0
## 5425 DFW 224 6 9 0 0
## 5426 DFW 224 5 17 0 0
## 5427 DFW 224 9 22 0 0
## 5428 DFW 224 9 9 0 0
## 5429 DFW 224 6 13 0 0
tail(hflights)
## Year Month DayofMonth DayOfWeek DepTime ArrTime UniqueCarrier
## 6083254 2011 12 6 2 1307 1600 WN
## 6083255 2011 12 6 2 1818 2111 WN
## 6083256 2011 12 6 2 2047 2334 WN
## 6083257 2011 12 6 2 912 1031 WN
## 6083258 2011 12 6 2 656 812 WN
## 6083259 2011 12 6 2 1600 1713 WN
## FlightNum TailNum ActualElapsedTime AirTime ArrDelay DepDelay
## 6083254 471 N632SW 113 98 0 7
## 6083255 1191 N284WN 113 97 -9 8
## 6083256 1674 N366SW 107 94 4 7
## 6083257 127 N777QC 79 61 -4 -3
## 6083258 621 N727SW 76 64 -13 -4
## 6083259 1597 N745SW 73 59 -12 0
## Origin Dest Distance TaxiIn TaxiOut Cancelled CancellationCode
## 6083254 HOU TPA 781 5 10 0
## 6083255 HOU TPA 781 5 11 0
## 6083256 HOU TPA 781 4 9 0
## 6083257 HOU TUL 453 4 14 0
## 6083258 HOU TUL 453 3 9 0
## 6083259 HOU TUL 453 3 11 0
## Diverted
## 6083254 0
## 6083255 0
## 6083256 0
## 6083257 0
## 6083258 0
## 6083259 0
str(hflights)
## 'data.frame': 227496 obs. of 21 variables:
## $ Year : int 2011 2011 2011 2011 2011 2011 2011 2011 2011 2011 ...
## $ Month : int 1 1 1 1 1 1 1 1 1 1 ...
## $ DayofMonth : int 1 2 3 4 5 6 7 8 9 10 ...
## $ DayOfWeek : int 6 7 1 2 3 4 5 6 7 1 ...
## $ DepTime : int 1400 1401 1352 1403 1405 1359 1359 1355 1443 1443 ...
## $ ArrTime : int 1500 1501 1502 1513 1507 1503 1509 1454 1554 1553 ...
## $ UniqueCarrier : chr "AA" "AA" "AA" "AA" ...
## $ FlightNum : int 428 428 428 428 428 428 428 428 428 428 ...
## $ TailNum : chr "N576AA" "N557AA" "N541AA" "N403AA" ...
## $ ActualElapsedTime: int 60 60 70 70 62 64 70 59 71 70 ...
## $ AirTime : int 40 45 48 39 44 45 43 40 41 45 ...
## $ ArrDelay : int -10 -9 -8 3 -3 -7 -1 -16 44 43 ...
## $ DepDelay : int 0 1 -8 3 5 -1 -1 -5 43 43 ...
## $ Origin : chr "IAH" "IAH" "IAH" "IAH" ...
## $ Dest : chr "DFW" "DFW" "DFW" "DFW" ...
## $ Distance : int 224 224 224 224 224 224 224 224 224 224 ...
## $ TaxiIn : int 7 6 5 9 9 6 12 7 8 6 ...
## $ TaxiOut : int 13 9 17 22 9 13 15 12 22 19 ...
## $ Cancelled : int 0 0 0 0 0 0 0 0 0 0 ...
## $ CancellationCode : chr "" "" "" "" ...
## $ Diverted : int 0 0 0 0 0 0 0 0 0 0 ...
summary(hflights)
## Year Month DayofMonth DayOfWeek
## Min. :2011 Min. : 1.000 Min. : 1.00 Min. :1.000
## 1st Qu.:2011 1st Qu.: 4.000 1st Qu.: 8.00 1st Qu.:2.000
## Median :2011 Median : 7.000 Median :16.00 Median :4.000
## Mean :2011 Mean : 6.514 Mean :15.74 Mean :3.948
## 3rd Qu.:2011 3rd Qu.: 9.000 3rd Qu.:23.00 3rd Qu.:6.000
## Max. :2011 Max. :12.000 Max. :31.00 Max. :7.000
##
## DepTime ArrTime UniqueCarrier FlightNum
## Min. : 1 Min. : 1 Length:227496 Min. : 1
## 1st Qu.:1021 1st Qu.:1215 Class :character 1st Qu.: 855
## Median :1416 Median :1617 Mode :character Median :1696
## Mean :1396 Mean :1578 Mean :1962
## 3rd Qu.:1801 3rd Qu.:1953 3rd Qu.:2755
## Max. :2400 Max. :2400 Max. :7290
## NA's :2905 NA's :3066
## TailNum ActualElapsedTime AirTime ArrDelay
## Length:227496 Min. : 34.0 Min. : 11.0 Min. :-70.000
## Class :character 1st Qu.: 77.0 1st Qu.: 58.0 1st Qu.: -8.000
## Mode :character Median :128.0 Median :107.0 Median : 0.000
## Mean :129.3 Mean :108.1 Mean : 7.094
## 3rd Qu.:165.0 3rd Qu.:141.0 3rd Qu.: 11.000
## Max. :575.0 Max. :549.0 Max. :978.000
## NA's :3622 NA's :3622 NA's :3622
## DepDelay Origin Dest Distance
## Min. :-33.000 Length:227496 Length:227496 Min. : 79.0
## 1st Qu.: -3.000 Class :character Class :character 1st Qu.: 376.0
## Median : 0.000 Mode :character Mode :character Median : 809.0
## Mean : 9.445 Mean : 787.8
## 3rd Qu.: 9.000 3rd Qu.:1042.0
## Max. :981.000 Max. :3904.0
## NA's :2905
## TaxiIn TaxiOut Cancelled CancellationCode
## Min. : 1.000 Min. : 1.00 Min. :0.00000 Length:227496
## 1st Qu.: 4.000 1st Qu.: 10.00 1st Qu.:0.00000 Class :character
## Median : 5.000 Median : 14.00 Median :0.00000 Mode :character
## Mean : 6.099 Mean : 15.09 Mean :0.01307
## 3rd Qu.: 7.000 3rd Qu.: 18.00 3rd Qu.:0.00000
## Max. :165.000 Max. :163.00 Max. :1.00000
## NA's :3066 NA's :2947
## Diverted
## Min. :0.000000
## 1st Qu.:0.000000
## Median :0.000000
## Mean :0.002853
## 3rd Qu.:0.000000
## Max. :1.000000
##
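The NA's rows in the summary can also be counted directly. This is a minimal sketch on a made-up three-row data frame (the name `toy` and its values are invented for illustration, not taken from hflights); the same call should work on hflights itself.

```r
# Toy stand-in for hflights; the values are invented for illustration.
toy <- data.frame(
  DepTime  = c(1400L, NA, 1352L),
  ArrTime  = c(1500L, NA, NA),
  Distance = c(224L, 224L, 224L)
)

# is.na() marks every missing cell; colSums() adds them up per column.
na.counts <- colSums(is.na(toy))
print(na.counts)
```

Columns with a nonzero count are the ones that show an NA's row in summary().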
We see a dataframe of 227,496 observations of the 21 variables described above. The variables can be grouped as follows.
We can see that although we have 21 variables, we really cover only five concepts or ideas. We might also infer that “How Long” the flight took is an important concept, since eight (8) variables have something to do with the total time involved in a flight. Although the dataset write-up does not define it, the data show that DayOfWeek runs from Monday as day 1 through Sunday as day 7.
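The weekday coding can be checked against the calendar. The first row of the head() output is 1 January 2011 with DayOfWeek 6, so a small sketch (the variable names here are my own) can ask R which ISO weekday that date falls on:

```r
# 1 January 2011 is the first row of hflights, where DayOfWeek is 6.
first.day <- as.Date("2011-01-01")

# "%u" gives the ISO weekday number: Monday is 1, Sunday is 7.
iso.day <- as.integer(format(first.day, "%u"))
print(iso.day)
```

If the result matches the DayOfWeek value in the data, the Monday = 1 reading holds.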
Based on this initial examination of the dataset we should be able to gain some insight into where people fly to from Houston, how long it takes, and the likelihood of getting there.
To make the dataframe easier to use I want to combine the date-related fields into a date and make the categorical variables in the dataframe factors. I find the summary information on factor data very useful. I also plan to make the binary fields True-False. We will start the data formatting by making a copy of the hflights dataframe.
I find it more useful most times to have real date fields instead of integers for months, days and years. Here we will combine the Year, Month, and DayofMonth variables into Date. We will also make the DayOfWeek variable into a legible factor variable. After the conversions we will drop the separate year, month, and day fields. Notice in the summary the almost even distribution of the number of flights by day of the week.
hf.new <- hflights
hf.new$Date <- as.Date(paste(hf.new$Year, hf.new$Month, hf.new$DayofMonth, sep="-"))
hf.new$DayOfWeek[hf.new$DayOfWeek == 1] <- "Mon"
hf.new$DayOfWeek[hf.new$DayOfWeek == 2] <- "Tue"
hf.new$DayOfWeek[hf.new$DayOfWeek == 3] <- "Wed"
hf.new$DayOfWeek[hf.new$DayOfWeek == 4] <- "Thu"
hf.new$DayOfWeek[hf.new$DayOfWeek == 5] <- "Fri"
hf.new$DayOfWeek[hf.new$DayOfWeek == 6] <- "Sat"
hf.new$DayOfWeek[hf.new$DayOfWeek == 7] <- "Sun"
hf.new$DayOfWeek <- as.factor(hf.new$DayOfWeek)
hf.new <- subset(hf.new, select = c(-Year, -Month, -DayofMonth))
summary(hf.new)
## DayOfWeek DepTime ArrTime UniqueCarrier
## Fri:34972 Min. : 1 Min. : 1 Length:227496
## Mon:34360 1st Qu.:1021 1st Qu.:1215 Class :character
## Sat:27629 Median :1416 Median :1617 Mode :character
## Sun:32058 Mean :1396 Mean :1578
## Thu:34902 3rd Qu.:1801 3rd Qu.:1953
## Tue:31649 Max. :2400 Max. :2400
## Wed:31926 NA's :2905 NA's :3066
## FlightNum TailNum ActualElapsedTime AirTime
## Min. : 1 Length:227496 Min. : 34.0 Min. : 11.0
## 1st Qu.: 855 Class :character 1st Qu.: 77.0 1st Qu.: 58.0
## Median :1696 Mode :character Median :128.0 Median :107.0
## Mean :1962 Mean :129.3 Mean :108.1
## 3rd Qu.:2755 3rd Qu.:165.0 3rd Qu.:141.0
## Max. :7290 Max. :575.0 Max. :549.0
## NA's :3622 NA's :3622
## ArrDelay DepDelay Origin Dest
## Min. :-70.000 Min. :-33.000 Length:227496 Length:227496
## 1st Qu.: -8.000 1st Qu.: -3.000 Class :character Class :character
## Median : 0.000 Median : 0.000 Mode :character Mode :character
## Mean : 7.094 Mean : 9.445
## 3rd Qu.: 11.000 3rd Qu.: 9.000
## Max. :978.000 Max. :981.000
## NA's :3622 NA's :2905
## Distance TaxiIn TaxiOut Cancelled
## Min. : 79.0 Min. : 1.000 Min. : 1.00 Min. :0.00000
## 1st Qu.: 376.0 1st Qu.: 4.000 1st Qu.: 10.00 1st Qu.:0.00000
## Median : 809.0 Median : 5.000 Median : 14.00 Median :0.00000
## Mean : 787.8 Mean : 6.099 Mean : 15.09 Mean :0.01307
## 3rd Qu.:1042.0 3rd Qu.: 7.000 3rd Qu.: 18.00 3rd Qu.:0.00000
## Max. :3904.0 Max. :165.000 Max. :163.00 Max. :1.00000
## NA's :3066 NA's :2947
## CancellationCode Diverted Date
## Length:227496 Min. :0.000000 Min. :2011-01-01
## Class :character 1st Qu.:0.000000 1st Qu.:2011-04-03
## Mode :character Median :0.000000 Median :2011-07-02
## Mean :0.002853 Mean :2011-07-01
## 3rd Qu.:0.000000 3rd Qu.:2011-09-29
## Max. :1.000000 Max. :2011-12-31
##
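The date construction above can be tried on a couple of rows in isolation. This is only a sketch on invented values (the `toy` data frame is hypothetical); it shows that paste() produces strings such as "2011-1-5" and that as.Date() reads them without zero padding.

```r
# Two made-up rows with the same date columns as hflights.
toy <- data.frame(
  Year       = c(2011L, 2011L),
  Month      = c(1L, 12L),
  DayofMonth = c(5L, 31L)
)

# paste() builds "2011-1-5" and "2011-12-31"; as.Date() parses both.
toy$Date <- as.Date(paste(toy$Year, toy$Month, toy$DayofMonth, sep = "-"))

# Keep only the combined Date column.
toy <- toy[, "Date", drop = FALSE]
print(toy)
```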
These variables were all numeric and it makes sense to leave them in that form. These are the only variables where calculations may be needed.
It makes sense to turn these variables into factors in the dataframe. This allows simple counts just by running the summary function. The Distance variable grouped with location is a numeric value and will stay that way. Notice the most common carriers, planes (by tail number), and destinations, and that there are only two origins.
hf.new$UniqueCarrier <- as.factor(hf.new$UniqueCarrier)
hf.new$FlightNum <- as.factor(hf.new$FlightNum)
hf.new$TailNum <- as.factor(hf.new$TailNum)
hf.new$Origin <- as.factor(hf.new$Origin)
hf.new$Dest <- as.factor(hf.new$Dest)
summary(hf.new)
## DayOfWeek DepTime ArrTime UniqueCarrier
## Fri:34972 Min. : 1 Min. : 1 XE :73053
## Mon:34360 1st Qu.:1021 1st Qu.:1215 CO :70032
## Sat:27629 Median :1416 Median :1617 WN :45343
## Sun:32058 Mean :1396 Mean :1578 OO :16061
## Thu:34902 3rd Qu.:1801 3rd Qu.:1953 MQ : 4648
## Tue:31649 Max. :2400 Max. :2400 US : 4082
## Wed:31926 NA's :2905 NA's :3066 (Other):14277
## FlightNum TailNum ActualElapsedTime AirTime
## 52 : 667 N14945 : 971 Min. : 34.0 Min. : 11.0
## 8 : 636 N15926 : 960 1st Qu.: 77.0 1st Qu.: 58.0
## 1590 : 634 N16927 : 951 Median :128.0 Median :107.0
## 35 : 618 N12946 : 948 Mean :129.3 Mean :108.1
## 1 : 606 N14937 : 946 3rd Qu.:165.0 3rd Qu.:141.0
## 60 : 600 N14942 : 946 Max. :575.0 Max. :549.0
## (Other):223735 (Other):221774 NA's :3622 NA's :3622
## ArrDelay DepDelay Origin Dest
## Min. :-70.000 Min. :-33.000 HOU: 52299 DAL : 9820
## 1st Qu.: -8.000 1st Qu.: -3.000 IAH:175197 ATL : 7886
## Median : 0.000 Median : 0.000 MSY : 6823
## Mean : 7.094 Mean : 9.445 DFW : 6653
## 3rd Qu.: 11.000 3rd Qu.: 9.000 LAX : 6064
## Max. :978.000 Max. :981.000 DEN : 5920
## NA's :3622 NA's :2905 (Other):184330
## Distance TaxiIn TaxiOut Cancelled
## Min. : 79.0 Min. : 1.000 Min. : 1.00 Min. :0.00000
## 1st Qu.: 376.0 1st Qu.: 4.000 1st Qu.: 10.00 1st Qu.:0.00000
## Median : 809.0 Median : 5.000 Median : 14.00 Median :0.00000
## Mean : 787.8 Mean : 6.099 Mean : 15.09 Mean :0.01307
## 3rd Qu.:1042.0 3rd Qu.: 7.000 3rd Qu.: 18.00 3rd Qu.:0.00000
## Max. :3904.0 Max. :165.000 Max. :163.00 Max. :1.00000
## NA's :3066 NA's :2947
## CancellationCode Diverted Date
## Length:227496 Min. :0.000000 Min. :2011-01-01
## Class :character 1st Qu.:0.000000 1st Qu.:2011-04-03
## Mode :character Median :0.000000 Median :2011-07-02
## Mean :0.002853 Mean :2011-07-01
## 3rd Qu.:0.000000 3rd Qu.:2011-09-29
## Max. :1.000000 Max. :2011-12-31
##
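The default summary lists only the most frequent levels of each factor, so the number of distinct carriers, planes, and destinations has to be counted separately. A minimal sketch, assuming a small invented data frame (`toy` and its rows are not from hflights), converts several columns at once and counts their levels with nlevels():

```r
# Toy stand-in with a few of the categorical columns; values are invented.
toy <- data.frame(
  UniqueCarrier = c("AA", "WN", "AA", "CO"),
  Origin        = c("IAH", "HOU", "IAH", "IAH"),
  Dest          = c("DFW", "TPA", "DFW", "DAL"),
  stringsAsFactors = FALSE
)

# Convert the chosen columns to factors in one step.
cols <- c("UniqueCarrier", "Origin", "Dest")
toy[cols] <- lapply(toy[cols], as.factor)

# nlevels() reports how many distinct values each factor holds.
level.counts <- sapply(toy[cols], nlevels)
print(level.counts)
```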
We have two binary yes-no variables, “Flight Cancelled” and “Flight Diverted”. I changed them to True-False fields for simplicity. The cancellation code field was turned into a factor for ease of counting the types, and its many blank values were made into NA’s. Notice that we now know there were 2,973 cancelled flights and 649 diverted flights, and that the cancellation code counts (1,202 + 1,652 + 118 + 1) add up to the number of cancelled flights.
hf.new[["Cancelled"]] <- hf.new$Cancelled == 1
hf.new[["Diverted"]] <- hf.new$Diverted == 1
hf.new$CancellationCode[hf.new$CancellationCode == ""] <- NA
hf.new$CancellationCode <- as.factor(hf.new$CancellationCode)
summary(hf.new)
## DayOfWeek DepTime ArrTime UniqueCarrier
## Fri:34972 Min. : 1 Min. : 1 XE :73053
## Mon:34360 1st Qu.:1021 1st Qu.:1215 CO :70032
## Sat:27629 Median :1416 Median :1617 WN :45343
## Sun:32058 Mean :1396 Mean :1578 OO :16061
## Thu:34902 3rd Qu.:1801 3rd Qu.:1953 MQ : 4648
## Tue:31649 Max. :2400 Max. :2400 US : 4082
## Wed:31926 NA's :2905 NA's :3066 (Other):14277
## FlightNum TailNum ActualElapsedTime AirTime
## 52 : 667 N14945 : 971 Min. : 34.0 Min. : 11.0
## 8 : 636 N15926 : 960 1st Qu.: 77.0 1st Qu.: 58.0
## 1590 : 634 N16927 : 951 Median :128.0 Median :107.0
## 35 : 618 N12946 : 948 Mean :129.3 Mean :108.1
## 1 : 606 N14937 : 946 3rd Qu.:165.0 3rd Qu.:141.0
## 60 : 600 N14942 : 946 Max. :575.0 Max. :549.0
## (Other):223735 (Other):221774 NA's :3622 NA's :3622
## ArrDelay DepDelay Origin Dest
## Min. :-70.000 Min. :-33.000 HOU: 52299 DAL : 9820
## 1st Qu.: -8.000 1st Qu.: -3.000 IAH:175197 ATL : 7886
## Median : 0.000 Median : 0.000 MSY : 6823
## Mean : 7.094 Mean : 9.445 DFW : 6653
## 3rd Qu.: 11.000 3rd Qu.: 9.000 LAX : 6064
## Max. :978.000 Max. :981.000 DEN : 5920
## NA's :3622 NA's :2905 (Other):184330
## Distance TaxiIn TaxiOut Cancelled
## Min. : 79.0 Min. : 1.000 Min. : 1.00 Mode :logical
## 1st Qu.: 376.0 1st Qu.: 4.000 1st Qu.: 10.00 FALSE:224523
## Median : 809.0 Median : 5.000 Median : 14.00 TRUE :2973
## Mean : 787.8 Mean : 6.099 Mean : 15.09 NA's :0
## 3rd Qu.:1042.0 3rd Qu.: 7.000 3rd Qu.: 18.00
## Max. :3904.0 Max. :165.000 Max. :163.00
## NA's :3066 NA's :2947
## CancellationCode Diverted Date
## A : 1202 Mode :logical Min. :2011-01-01
## B : 1652 FALSE:226847 1st Qu.:2011-04-03
## C : 118 TRUE :649 Median :2011-07-02
## D : 1 NA's :0 Mean :2011-07-01
## NA's:224523 3rd Qu.:2011-09-29
## Max. :2011-12-31
##
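The match between cancelled flights and cancellation codes can be checked in code instead of by eye. This is a sketch on four invented rows (the `toy` data frame is hypothetical), using the same conversions as above:

```r
# Four made-up flights: two cancelled, each with a code.
toy <- data.frame(
  Cancelled        = c(0L, 1L, 0L, 1L),
  CancellationCode = c("", "A", "", "B"),
  stringsAsFactors = FALSE
)

# Same conversions as in the text: 0/1 to logical, blanks to NA, then factor.
toy$Cancelled <- toy$Cancelled == 1
toy$CancellationCode[toy$CancellationCode == ""] <- NA
toy$CancellationCode <- as.factor(toy$CancellationCode)

# sum() of a logical counts the TRUE values; table() leaves out NA by default.
n.cancelled <- sum(toy$Cancelled)
n.coded <- sum(table(toy$CancellationCode))
print(c(cancelled = n.cancelled, coded = n.coded))
```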
Let’s take a closer look at the data and do some data visualization in the order below.
We can chart the number of flights by different time periods, such as by month throughout the year and by day of the week. I found that coding the days of the week with their first three letters makes the chart easier to read, but I will need to figure out how to set the order. Alphabetical seems to be the default, which makes Friday come first, followed by Monday.
Notice the summer months are peak flight times for the Houston airports, with August hitting 20,000 flights, July slightly under that, and June a little lower again. Fall looks flat and steady at close to 18,000 flights a month. January is the lowest with just over 15,000 flights. Keep in mind that each bar covers 30 days, so the bars only approximate calendar months.
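Because 30-day bins do not line up with calendar months, exact monthly totals are easier to get by tabulating the month part of each date. A minimal sketch on a few invented dates (the `dates` vector is made up; on the real data it would be `hf.new$Date`):

```r
# A handful of made-up flight dates.
dates <- as.Date(c("2011-01-03", "2011-01-28", "2011-02-14",
                   "2011-08-01", "2011-08-31"))

# "%m" extracts the zero-padded month number; table() counts each month.
month.counts <- table(format(dates, "%m"))
print(month.counts)
```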
Days of the week are interesting too, with Saturday being the least likely day to fly and Friday being the most likely. There really is not much variation by day of the week. With all flights for the year divided out by day, Saturday is just under 30,000 and the other days are all between 30,000 and 35,000.
library(ggplot2)
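# Refer to columns by name inside aes(); ggplot() already knows the data frame.
ggplot(hf.new, aes(x = Date)) + geom_histogram(binwidth = 30, fill = "red", color = "black")
# DayOfWeek is a factor, so count its levels with geom_bar(); geom_histogram() needs a continuous x.
ggplot(hf.new, aes(x = DayOfWeek)) + geom_bar(fill = "orange", color = "black")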
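One way to set the weekday order is to build the factor straight from the day numbers and give the labels in calendar order. This is a sketch on a short invented vector (`day.numbers` stands in for the original integer DayOfWeek column), not the code that produced the output in this post:

```r
# Made-up day numbers, coded as in hflights: 1 = Monday ... 7 = Sunday.
day.numbers <- c(6L, 7L, 1L, 2L, 5L)
day.labels  <- c("Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun")

# levels fixes the order, labels supplies the readable names.
day.factor <- factor(day.numbers, levels = 1:7, labels = day.labels)

print(levels(day.factor))
print(table(day.factor))
```

Both summary() and ggplot2 follow the level order of a factor, so with levels defined this way the days would be listed Monday through Sunday instead of alphabetically.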
We can quickly look at when you are most likely to have a flight delayed by plotting the departure delay over time. You can see that we may need to spread this out, maybe look at individual months, to get real detail. However, this overview does tell us some important information. The thick band of black data points at the bottom appears to reach the first grid line, at about 125 minutes of delay, but that impression comes from more than 220,000 points being drawn on top of each other. The summary above shows a median departure delay of 0 minutes and a third quartile of 9 minutes, so most flights leave close to on time and a 2 hour delay is the exception rather than the rule. The very long delays (over 500 minutes) are infrequent, but spread pretty evenly across the year.
ggplot(hf.new, aes(x= Date, y = DepDelay)) + geom_line(color = "Green") + geom_point()
## Warning: Removed 2905 rows containing missing values (geom_point).
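How common long delays are can also be checked numerically rather than read off a crowded plot. A minimal sketch on invented delay values (`dep.delay` is made up; on the real data it would be `hf.new$DepDelay`):

```r
# Made-up departure delays in minutes, with one missing value.
dep.delay <- c(0, 1, -8, 3, 5, 130, NA, 43)

# The mean of a logical vector is the share of TRUE values;
# na.rm = TRUE leaves flights with no recorded delay out of the calculation.
share.over.2h <- mean(dep.delay > 120, na.rm = TRUE)
print(share.over.2h)

# Quantiles show where the bulk of the delays sit.
print(quantile(dep.delay, c(0.5, 0.75, 0.95), na.rm = TRUE))
```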
I am out of time now, so I will end here. My last look at this dataset will be to use my favorite function again with a new parameter: summary(hf.new, 20). This raises the number of rows listed for each factor variable to 20 (the 19 most frequent levels plus an “(Other)” row when there are more). From this we can see that there were 15 different airlines flying out of Houston in 2011; that each of the busiest planes flew roughly 890 to 970 flights in 2011; that IAH had over 175,000 flights originate from there and HOU had just over 52,000; and the busiest destination airports, with Dallas Love Field (DAL) leading the list at 9,820 flights in 2011.
summary(hf.new, 20)
## DayOfWeek DepTime ArrTime UniqueCarrier FlightNum
## Fri:34972 Min. : 1 Min. : 1 AA: 3244 52 : 667
## Mon:34360 1st Qu.:1021 1st Qu.:1215 AS: 365 8 : 636
## Sat:27629 Median :1416 Median :1617 B6: 695 1590 : 634
## Sun:32058 Mean :1396 Mean :1578 CO:70032 35 : 618
## Thu:34902 3rd Qu.:1801 3rd Qu.:1953 DL: 2641 1 : 606
## Tue:31649 Max. :2400 Max. :2400 EV: 2204 60 : 600
## Wed:31926 NA's :2905 NA's :3066 F9: 838 47 : 574
## FL: 2139 6 : 567
## MQ: 4648 5 : 537
## OO:16061 33 : 523
## UA: 2072 1294 : 411
## US: 4082 731 : 405
## WN:45343 286 : 387
## XE:73053 270 : 382
## YV: 79 106 : 371
## 3216 : 370
## 62 : 366
## 89 : 365
## 1586 : 365
## (Other):218112
## TailNum ActualElapsedTime AirTime ArrDelay
## N14945 : 971 Min. : 34.0 Min. : 11.0 Min. :-70.000
## N15926 : 960 1st Qu.: 77.0 1st Qu.: 58.0 1st Qu.: -8.000
## N16927 : 951 Median :128.0 Median :107.0 Median : 0.000
## N12946 : 948 Mean :129.3 Mean :108.1 Mean : 7.094
## N14937 : 946 3rd Qu.:165.0 3rd Qu.:141.0 3rd Qu.: 11.000
## N14942 : 946 Max. :575.0 Max. :549.0 Max. :978.000
## N15948 : 942 NA's :3622 NA's :3622 NA's :3622
## N14938 : 935
## N13935 : 934
## N14943 : 934
## N14947 : 921
## N15932 : 920
## N13936 : 915
## N14930 : 913
## N15941 : 911
## N16944 : 909
## N14939 : 902
## N14933 : 897
## N12934 : 889
## (Other):209852
## DepDelay Origin Dest Distance
## Min. :-33.000 HOU: 52299 DAL : 9820 Min. : 79.0
## 1st Qu.: -3.000 IAH:175197 ATL : 7886 1st Qu.: 376.0
## Median : 0.000 MSY : 6823 Median : 809.0
## Mean : 9.445 DFW : 6653 Mean : 787.8
## 3rd Qu.: 9.000 LAX : 6064 3rd Qu.:1042.0
## Max. :981.000 DEN : 5920 Max. :3904.0
## NA's :2905 ORD : 5748
## PHX : 5096
## AUS : 5022
## SAT : 4893
## CRP : 4813
## CLT : 4735
## EWR : 4314
## LAS : 4082
## HRL : 3983
## MCO : 3687
## BNA : 3481
## MCI : 3174
## OKC : 3170
## (Other):128132
## TaxiIn TaxiOut Cancelled CancellationCode
## Min. : 1.000 Min. : 1.00 Mode :logical A : 1202
## 1st Qu.: 4.000 1st Qu.: 10.00 FALSE:224523 B : 1652
## Median : 5.000 Median : 14.00 TRUE :2973 C : 118
## Mean : 6.099 Mean : 15.09 NA's :0 D : 1
## 3rd Qu.: 7.000 3rd Qu.: 18.00 NA's:224523
## Max. :165.000 Max. :163.00
## NA's :3066 NA's :2947
##
##
##
##
##
##
##
##
##
##
##
##
##
## Diverted Date
## Mode :logical Min. :2011-01-01
## FALSE:226847 1st Qu.:2011-04-03
## TRUE :649 Median :2011-07-02
## NA's :0 Mean :2011-07-01
## 3rd Qu.:2011-09-29
## Max. :2011-12-31
##
##
##
##
##
##
##
##
##
##
##
##
##
##
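In summary(hf.new, 20) the second argument is matched to `maxsum`, which caps how many rows are shown per factor. The effect is easiest to see on a single small factor; this sketch uses an invented `carriers` vector, not the real carrier column:

```r
# Eight made-up flights across five carriers.
carriers <- factor(c("AA", "AA", "AA", "WN", "WN", "CO", "DL", "UA"))

# With maxsum = 3 only the two most frequent levels are named;
# everything else is pooled into "(Other)".
short <- summary(carriers, maxsum = 3)
print(short)

# With maxsum = 20 there is room for all five levels.
full <- summary(carriers, maxsum = 20)
print(full)
```

Naming the argument, as in summary(hf.new, maxsum = 20), makes the intent clearer than passing 20 by position.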