Let’s begin by looking at a dataset called Births78. This is a data frame containing the number of births in the United States for each day in 1978.
We can use the head() function to look at the first several rows of this dataset:
head(Births78)
## date births dayofyear
## 1 1978-01-01 7701 1
## 2 1978-01-02 7527 2
## 3 1978-01-03 8825 3
## 4 1978-01-04 8859 4
## 5 1978-01-05 9043 5
## 6 1978-01-06 9208 6
Now you’ve seen your first example of a code chunk (with R code) and its corresponding output.
Births78? Label each as categorical or quantitative.
SOLUTION:
head(Births78, n=4)
## date births dayofyear
## 1 1978-01-01 7701 1
## 2 1978-01-02 7527 2
## 3 1978-01-03 8825 3
## 4 1978-01-04 8859 4
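A quick way to check variable types is to ask R for the class of each column. The sketch below rebuilds the four rows printed above by hand, so it runs without any package; `births_sample` is a made-up name for illustration, and with the full dataset loaded the same calls would apply to `Births78`. It assumes `date` is stored as a Date, which the printed output suggests but does not confirm.

```r
# First four rows of Births78, copied from the output above.
# births_sample is a hypothetical name used only for this illustration.
births_sample <- data.frame(
  date      = as.Date(c("1978-01-01", "1978-01-02", "1978-01-03", "1978-01-04")),
  births    = c(7701, 7527, 8825, 8859),
  dayofyear = 1:4
)

# Storage class of each column (assumes date is a Date, see note above)
col_classes <- sapply(births_sample, class)
print(col_classes)

# Columns stored as numbers are the natural candidates for quantitative variables
is_numeric <- sapply(births_sample, is.numeric)
numeric_cols <- names(births_sample)[is_numeric]
print(numeric_cols)
```

The storage class is only a hint. A date is not stored as a plain number, yet it can still be treated as a point on a time line, so the final label depends on how the variable is used.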
We can also look at a histogram of the births per day.
histogram(~births, data=Births78)
SOLUTION: The center of the histogram is around 9000 births. The graph is unimodal and a bit skewed to the left, but the values on the right side are generally higher than those on the left, so it may be closer to symmetric than it appears.
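A numeric check can back up a visual read of skew: in a left-skewed distribution the mean usually falls below the median. The sketch below uses only the births for the first 14 days of 1978 as printed in this document (the vector name `births_two_weeks` is made up), so it illustrates the comparison rather than describing the whole year.

```r
# Births for 1978-01-01 through 1978-01-14, copied from this document.
# births_two_weeks is a hypothetical name used only for this illustration.
births_two_weeks <- c(7701, 7527, 8825, 8859, 9043, 9208, 8084,
                      7611, 9172, 9089, 9210, 9259, 9138, 8299)

center_mean   <- mean(births_two_weeks)
center_median <- median(births_two_weeks)

# A mean below the median is consistent with a longer left tail
cat("mean:  ", round(center_mean, 1), "\n")
cat("median:", center_median, "\n")
cat("mean below median:", center_mean < center_median, "\n")
```

With the full dataset loaded, replacing the vector with `Births78$births` would run the same comparison on all 365 days.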
Finally, consider a time plot of the number of births over the entire year, ordered by day.
xyplot(births ~ dayofyear, data=Births78)
head(Births78, n=100)
## date births dayofyear
## 1 1978-01-01 7701 1
## 2 1978-01-02 7527 2
## 3 1978-01-03 8825 3
## 4 1978-01-04 8859 4
## 5 1978-01-05 9043 5
## 6 1978-01-06 9208 6
## 7 1978-01-07 8084 7
## 8 1978-01-08 7611 8
## 9 1978-01-09 9172 9
## 10 1978-01-10 9089 10
## 11 1978-01-11 9210 11
## 12 1978-01-12 9259 12
## 13 1978-01-13 9138 13
## 14 1978-01-14 8299 14
## 15 1978-01-15 7771 15
## 16 1978-01-16 9458 16
## 17 1978-01-17 9339 17
## 18 1978-01-18 9120 18
## 19 1978-01-19 9226 19
## 20 1978-01-20 9305 20
## 21 1978-01-21 7954 21
## 22 1978-01-22 7560 22
## 23 1978-01-23 9252 23
## 24 1978-01-24 9416 24
## 25 1978-01-25 9090 25
## 26 1978-01-26 9387 26
## 27 1978-01-27 8983 27
## 28 1978-01-28 7946 28
## 29 1978-01-29 7527 29
## 30 1978-01-30 9184 30
## 31 1978-01-31 9152 31
## 32 1978-02-01 9159 32
## 33 1978-02-02 9218 33
## 34 1978-02-03 9167 34
## 35 1978-02-04 8065 35
## 36 1978-02-05 7804 36
## 37 1978-02-06 9225 37
## 38 1978-02-07 9328 38
## 39 1978-02-08 9139 39
## 40 1978-02-09 9247 40
## 41 1978-02-10 9527 41
## 42 1978-02-11 8144 42
## 43 1978-02-12 7950 43
## 44 1978-02-13 8966 44
## 45 1978-02-14 9859 45
## 46 1978-02-15 9285 46
## 47 1978-02-16 9103 47
## 48 1978-02-17 9238 48
## 49 1978-02-18 8167 49
## 50 1978-02-19 7695 50
## 51 1978-02-20 9021 51
## 52 1978-02-21 9252 52
## 53 1978-02-22 9335 53
## 54 1978-02-23 9268 54
## 55 1978-02-24 9552 55
## 56 1978-02-25 8313 56
## 57 1978-02-26 7881 57
## 58 1978-02-27 9262 58
## 59 1978-02-28 9705 59
## 60 1978-03-01 9132 60
## 61 1978-03-02 9304 61
## 62 1978-03-03 9431 62
## 63 1978-03-04 8008 63
## 64 1978-03-05 7791 64
## 65 1978-03-06 9294 65
## 66 1978-03-07 9573 66
## 67 1978-03-08 9212 67
## 68 1978-03-09 9218 68
## 69 1978-03-10 9583 69
## 70 1978-03-11 8144 70
## 71 1978-03-12 7870 71
## 72 1978-03-13 9022 72
## 73 1978-03-14 9525 73
## 74 1978-03-15 9284 74
## 75 1978-03-16 9327 75
## 76 1978-03-17 9480 76
## 77 1978-03-18 7965 77
## 78 1978-03-19 7729 78
## 79 1978-03-20 9135 79
## 80 1978-03-21 9663 80
## 81 1978-03-22 9307 81
## 82 1978-03-23 9159 82
## 83 1978-03-24 9157 83
## 84 1978-03-25 7874 84
## 85 1978-03-26 7589 85
## 86 1978-03-27 9100 86
## 87 1978-03-28 9293 87
## 88 1978-03-29 9195 88
## 89 1978-03-30 8902 89
## 90 1978-03-31 9318 90
## 91 1978-04-01 8069 91
## 92 1978-04-02 7691 92
## 93 1978-04-03 9114 93
## 94 1978-04-04 9439 94
## 95 1978-04-05 8852 95
## 96 1978-04-06 8969 96
## 97 1978-04-07 9077 97
## 98 1978-04-08 7890 98
## 99 1978-04-09 7445 99
## 100 1978-04-10 8870 100
SOLUTION: (optional) The shapes of the two groups match: when births increase during one part of the year, the increase appears in the other group as well. It is more likely, then, that something else is consistently affecting the data. For instance, the true number of births may have varied much less than this graph shows, so perhaps about 1000 more or fewer births were counted on one day than on the previous day.
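One hypothesis worth checking for the two groups in the time plot is a day-of-week effect. This is an assumption introduced here, not something the solution above states. The sketch rebuilds the first 14 rows of the printed output (the names `births_two_weeks`, `day_type`, and `group_means` are made up for illustration) and compares mean births on weekends and weekdays using base R only.

```r
# First 14 rows of Births78, copied from the output above.
# All names below are hypothetical and used only for this illustration.
births_two_weeks <- data.frame(
  date   = seq(as.Date("1978-01-01"), by = "day", length.out = 14),
  births = c(7701, 7527, 8825, 8859, 9043, 9208, 8084,
             7611, 9172, 9089, 9210, 9259, 9138, 8299)
)

# as.POSIXlt()$wday codes the day of week as 0 = Sunday ... 6 = Saturday
wday <- as.POSIXlt(births_two_weeks$date)$wday
births_two_weeks$day_type <- ifelse(wday %in% c(0, 6), "weekend", "weekday")

# Mean births per group
group_means <- tapply(births_two_weeks$births, births_two_weeks$day_type, mean)
print(group_means)
```

A weekend mean well below the weekday mean would be consistent with the two groups being weekdays and weekends. Two weeks of data are too few to settle the question, so the same grouping would need to be applied to the full `Births78` data frame before drawing a conclusion.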