As usual, we need our beloved ggplot2 for this activity.
#install.packages("ggplot2") -- remove # if you need to install
library(ggplot2)
This activity combines some of the new skills from your work in DataCamp (www.datacamp.com) with our developing knowledge of sampling distributions.
Just for practice …
#Construct a matrix with 4 rows that contain the numbers 1 up to 16
practice <- matrix(1:16,byrow=TRUE,nrow=4)
#Print your matrix
practice
     [,1] [,2] [,3] [,4]
[1,]    1    2    3    4
[2,]    5    6    7    8
[3,]    9   10   11   12
[4,]   13   14   15   16
In your class activity, you will figure out all of the possible sums of the faces when you roll two dice. Once you have figured this out, you can create a matrix to keep track.
row1<-c(2,3,4,5,6,7)
row2<-c(3,4,5,6,7,8)
row3<-c(4,5,6,7,8,9)
row4<-c(5,6,7,8,9,10)
row5<-c(6,7,8,9,10,11)
row6<-c(7,8,9,10,11,12)
Put the results into a matrix:
#Matrix
dice<-matrix(c(row1,row2,row3,row4,row5,row6),nrow=6,byrow=TRUE)
#Print out the matrix
dice
     [,1] [,2] [,3] [,4] [,5] [,6]
[1,]    2    3    4    5    6    7
[2,]    3    4    5    6    7    8
[3,]    4    5    6    7    8    9
[4,]    5    6    7    8    9   10
[5,]    6    7    8    9   10   11
[6,]    7    8    9   10   11   12
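The same table can be built without typing each row by hand. The sketch below uses base R's outer() to add every face of one die to every face of the other; the names die and dice_outer are made up for this illustration and are not part of the activity.

```r
# Sketch: build the 6 x 6 table of sums with outer() instead of typing rows.
# `die` and `dice_outer` are illustrative names, not from the activity.
die <- 1:6
dice_outer <- outer(die, die, "+")  # entry [i, j] is i + j
dice_outer
```

If the dice matrix from above is still in your workspace, all(dice_outer == dice) should return TRUE.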
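In your class activity, you should have worked out the probability of each possible sum. Each of the 36 cells in the matrix is equally likely, so the probability of a sum is the number of cells showing it divided by 36. For the sums 2 through 12, the probabilities are 1/36, 2/36, 3/36, 4/36, 5/36, 6/36, 5/36, 4/36, 3/36, 2/36, and 1/36.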
#Create a vector for the possible sums
sum<-c(2,3,4,5,6,7,8,9,10,11,12)
sum
[1] 2 3 4 5 6 7 8 9 10 11 12
#Create another vector for the probabilities of these sums
prob<-c(1/36,2/36,3/36,4/36,5/36,6/36,5/36,4/36,3/36,2/36,1/36)
prob
[1] 0.02777778 0.05555556 0.08333333 0.11111111 0.13888889
[6] 0.16666667 0.13888889 0.11111111 0.08333333 0.05555556
[11] 0.02777778
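One way to double-check these probabilities is to count how often each sum appears among the 36 equally likely cells. This is only a sketch; dice_check, counts, and prob_check are illustrative names chosen so that nothing defined earlier is overwritten.

```r
# Sketch: recover the probabilities by counting cells in the table of sums.
dice_check <- outer(1:6, 1:6, "+")        # all 36 equally likely outcomes
counts <- table(dice_check)               # how many cells show each sum, 2 to 12
prob_check <- as.numeric(counts) / 36     # counts divided by 36 outcomes

counts
all.equal(prob_check, c(1:6, 5:1) / 36)   # compare with the hand-computed values
sum(prob_check)                           # a probability distribution adds up to 1
```

Note that sum() still works as a function here even though the activity also stores a vector named sum: when a name is used as a function call, R skips over objects that are not functions.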
#Combine the sum and prob vectors using cbind() and call the result prob.dist
prob.dist<-cbind(sum,prob)
prob.dist
      sum       prob
 [1,]   2 0.02777778
 [2,]   3 0.05555556
 [3,]   4 0.08333333
 [4,]   5 0.11111111
 [5,]   6 0.13888889
 [6,]   7 0.16666667
 [7,]   8 0.13888889
 [8,]   9 0.11111111
 [9,]  10 0.08333333
[10,]  11 0.05555556
[11,]  12 0.02777778
#Let's make a bar graph of the probabilities
barplot(prob,names.arg=sum, main="Probabilities for dice sums",xlab="Sum of two Dice",col="Tomato")
The graph above shows the theoretical distribution of the sum of two dice. Notice which results we would consider common and which we would consider rare. With these thoughts in mind, we will take a look at some randomly generated results.
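To put rough numbers on "common" and "rare", we can add up the theoretical probabilities for groups of sums. The groupings here (6 through 8 as common; 2, 3, 11, and 12 as rare) are assumptions made for illustration, not cutoffs from the activity, and the variable names are also illustrative.

```r
# Sketch: total probability for an assumed "common" and "rare" group of sums.
possible_sums <- 2:12
prob_theory <- c(1:6, 5:1) / 36

# Middle sums 6, 7, 8: (5 + 6 + 5) / 36
p_common <- sum(prob_theory[possible_sums >= 6 & possible_sums <= 8])

# Extreme sums 2, 3, 11, 12: (1 + 2 + 2 + 1) / 36
p_rare <- sum(prob_theory[possible_sums %in% c(2, 3, 11, 12)])

p_common  # 16/36, about 0.444
p_rare    # 6/36, about 0.167
```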
You can try rolling two dice if you have them, or go to https://www.random.org/dice/ and pretend you are rolling two dice. We can also use the following command in R.
#Now let's try rolling two dice 100 times
sums<-sample(1:6,100,replace=TRUE)+sample(1:6,100,replace=TRUE)
sums<-data.frame(sums)
Now let’s make a dotplot of the results.
#Make a dotplot
ggplot(sums, aes(x = sums)) + geom_dotplot(binwidth = .2)+
scale_x_continuous(breaks=seq(2,12))
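Notice that the simulated results resemble the theoretical distribution but do not match it exactly. This is what we would expect from a random phenomenon: with only 100 rolls, the observed share of each sum varies around its theoretical probability.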
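To see the comparison in numbers rather than by eye, we can tabulate the simulated sums and set the observed proportions beside the theoretical probabilities. This is a minimal sketch: the seed value and the names n_rolls, rolls, and comparison are assumptions added for illustration, and your own simulation will give different counts.

```r
# Sketch: compare observed proportions from a simulation with the theory.
set.seed(2024)  # assumed seed, only so the example is reproducible
n_rolls <- 100
rolls <- sample(1:6, n_rolls, replace = TRUE) +
  sample(1:6, n_rolls, replace = TRUE)

# factor(..., levels = 2:12) keeps sums that never occurred, with a count of 0
observed <- as.numeric(table(factor(rolls, levels = 2:12))) / n_rolls
theoretical <- c(1:6, 5:1) / 36

comparison <- data.frame(total = 2:12,
                         observed = observed,
                         theoretical = round(theoretical, 3))
comparison
```

Increasing n_rolls (for example to 10000) should bring the observed column closer to the theoretical one.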