Basic probability

The probability of a particular event can be described as:

\[ \text{Probability} = \frac{\text{Number of favourable outcomes}}{\text{Number of possible equally likely outcomes}} \]

Let's assume a simple coin. It has two faces, and if you toss it the result is either HEAD or TAIL.

The probability of the coin landing H or HEAD is $P(H)$

And the probability of the coin landing T or TAIL is $P(T)$

Given that all outcomes are equally likely, we can compute the probability of landing HEAD using the formula:

\[P(H) = \frac{\text{Number of favourable outcomes}}{\text{Number of possible equally likely outcomes}} = \frac{1}{2} \]

Similarly, the probability of landing TAIL is $\frac{1}{2}$.
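
The same counting rule can be written as a few lines of R. This is only a minimal sketch: `classicalProb` is a hypothetical helper name introduced here for illustration, and it assumes every listed outcome is equally likely.

```r
# Hypothetical helper: favourable outcomes divided by all possible outcomes.
# Assumes every element of `outcomes` is equally likely.
classicalProb <- function(favourable, outcomes) {
  # Count only the possible outcomes that are also favourable
  sum(outcomes %in% favourable) / length(outcomes)
}

coin <- c("H", "T")

p_head <- classicalProb("H", coin)
p_tail <- classicalProb("T", coin)

p_head
p_tail
```

Both values come out as 0.5, matching $P(H) = P(T) = \frac{1}{2}$.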

Probability is Just a Guide.

Probability does not tell us exactly what will happen; it is just a guide.

Example: toss a coin 10 times. How many Heads will come up? The probability of the coin landing Head is ½, so we can expect 5 Heads. But when we actually try it we might get 3 Heads, or 4 Heads … or anything really.
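
To get a feel for how often the "expected" count actually shows up, the chance of each possible number of Heads in 10 tosses can be computed directly. This is a sketch only: it assumes a fair coin and independent tosses, and it uses the binomial density `dbinom()` from base R, which is not part of the walkthrough itself.

```r
# Assumption: fair coin (p = 0.5) and 10 independent tosses
n_toss <- 10
p_head <- 0.5

# Probability of seeing exactly k Heads, for k = 0, 1, ..., 10
heads <- 0:n_toss
prob <- dbinom(heads, size = n_toss, prob = p_head)
data.frame(heads, prob = round(prob, 4))

# Chance of getting exactly 5 Heads (252/1024, about 0.246)
prob[heads == 5]
```

Under these assumptions, exactly 5 Heads turns up in only about a quarter of such experiments, so a result like 3 or 7 Heads is not surprising.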

Let's simulate this with a short R script.

We create the coin and assign “H” and “T” (for HEAD and TAIL):

coin <- c("H","T")

Toss the coin 10 times and store the results in result:

result <- sample(coin,10, replace = TRUE)

Let's see the result:

result

##  [1] "T" "T" "T" "T" "T" "T" "H" "T" "H" "H"
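
Because `sample()` draws at random, the sequence will differ every time the script runs. If a repeatable run is wanted, one option is to fix the random seed first. The sketch below assumes an arbitrary seed value of 42, and the names `run1` and `run2` are used only for this illustration.

```r
coin <- c("H", "T")

# Assumption: 42 is an arbitrary seed; any integer works
set.seed(42)
run1 <- sample(coin, 10, replace = TRUE)

# Resetting the same seed reproduces the same sequence of tosses
set.seed(42)
run2 <- sample(coin, 10, replace = TRUE)

identical(run1, run2)
```

The final line returns `TRUE`, since both runs start from the same seed.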

We can show the counts as a table:

table(result)

## result
## H T 
## 3 7

Let's convert the table to a data frame.

df_total <- data.frame(table(result))

Let's look at df_total:

df_total

From the results above, we can see HEAD came up 3 times and TAIL 7 times.

Using prop.table() we can add an additional column showing the observed proportion of “H” and “T”:

df_prob <- cbind(df_total, prop.table(df_total$Freq))
names(df_prob) <- c("result","Freq", "probability")
df_prob

To show the results graphically, we can generate bar plots using barplot in base R:

barplot(df_prob$Freq, names.arg = df_total$result, main = "Head & Tail count of Tossing a coin 10 times")

barplot(df_prob$probability, names.arg = df_total$result, main = "Probability of Tossing a coin 10 times")

What is the conclusion?

Do the HEAD and TAIL frequencies appear equal or not?

Probability of HEAD $P(H)$

Probability of TAIL $P(T)$

Here, $P(H) = P(T) = \frac{1}{2}$ may or may not be observed.

Probability is just a guide: it does not tell us exactly what will happen.

toss Coin function

To demonstrate this, we can create a tossCoin function to run the experiment and a plotCoin function to plot the results nicely.

We can use the ggplot2 package to create the plots and the gridExtra package to arrange them.

library(ggplot2)
library(gridExtra)

Let's write the two functions.

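tossCoin <- function(numToss = 10){
  coin <- c("H","T")
  # Sample with replacement so every toss is independent of the others
  result <- sample(coin, numToss, replace = TRUE)
  # Return the counts as a data frame with columns result and Freq
  data.frame(table(result))
}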

plotCoin <- function(df){
  # Total number of tosses, used in the plot titles
  number <- sum(df$Freq)

  # Count of each outcome
  plot1 <- ggplot(df, aes(result, Freq, fill = result)) +
    geom_bar(stat = "identity") +
    labs(title = paste0("Tossing a coin ", number, " times")) +
    theme(legend.position = "none")
  # Observed proportion of each outcome (columns are referenced directly inside aes())
  plot2 <- ggplot(df, aes(result, Freq / sum(Freq), fill = result)) +
    geom_bar(stat = "identity") +
    labs(title = paste0("Measured probability of Tossing a coin ", number, " times"), y = "Probability") +
    theme(legend.position = "none")

  # Stack the two plots vertically
  grid.arrange(plot1, plot2)
}

Let's see the results of tossing the coin 50 times:

df50 <- tossCoin(50)
df50

plotCoin(df50)

Maybe 1000 times:

df1000 <- tossCoin(1000)
df1000

plotCoin(df1000)

So if we toss the coin a large number of times, the observed proportion of each outcome gets close to the theoretical probability.
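
One way to watch this happen within a single experiment is to track the proportion of Heads after every toss. The following is a minimal base R sketch, not part of the functions used here; the seed, the choice of 10000 tosses, and the names `tosses` and `running_prop` are assumptions made for illustration.

```r
# Assumption: arbitrary seed so the sketch is repeatable
set.seed(123)

n_toss <- 10000
tosses <- sample(c("H", "T"), n_toss, replace = TRUE)

# Proportion of Heads observed after each toss
running_prop <- cumsum(tosses == "H") / seq_len(n_toss)

# Proportion after 10, 100, 1000 and 10000 tosses
running_prop[c(10, 100, 1000, 10000)]

# Running proportion on a log-scaled x axis, with the theoretical value 0.5 dashed
plot(running_prop, type = "l", log = "x",
     xlab = "Number of tosses", ylab = "Proportion of Heads",
     main = "Running proportion of Heads")
abline(h = 0.5, lty = 2)
```

The line typically swings widely over the first few tosses and then settles near the dashed 0.5 line as the number of tosses grows.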

Let's try a really big number…

df1million <- tossCoin(1000000)
df1million

plotCoin(df1million)

***

Can you think of a similar procedure for tossing a dice?

Here, instead of two outcomes, we have 6 outcomes.

toss Dice function

To demonstrate this, we can create tossDice and plotDice functions, similar to the ones for the coin.

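tossDice <- function(numToss = 10){
  dice <- c(1:6)
  # Sample with replacement so every toss is independent of the others
  result <- sample(dice, numToss, replace = TRUE)
  # Return the counts as a data frame with columns result and Freq
  data.frame(table(result))
}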

plotDice <- function(df){
  # Total number of tosses, used in the plot titles
  number <- sum(df$Freq)

  # Count of each face
  plot1 <- ggplot(df, aes(result, Freq, fill = result)) +
    geom_bar(stat = "identity") +
    labs(title = paste0("Tossing a dice ", number, " times")) +
    theme(legend.position = "none")

  # Observed proportion of each face (columns are referenced directly inside aes())
  plot2 <- ggplot(df, aes(result, Freq / sum(Freq), fill = result)) +
    geom_bar(stat = "identity") +
    labs(title = paste0("Probability of Tossing a dice ", number, " times"), y = "Probability") +
    theme(legend.position = "none")

  # Stack the two plots vertically
  grid.arrange(plot1, plot2)
}

Let's see the results of tossing the dice 50 times:

df50 <- tossDice(50)
df50

plotDice(df50)

Maybe 1000 times:

df1000 <- tossDice(1000)
df1000

plotDice(df1000)

The dice can also be tossed 1 million times:

df1million <- tossDice(1000000)
df1million

plotDice(df1million)

As a table of proportions:

cbind(df1million, prop.table(df1million$Freq))

We can see that as the number of tosses increases, the observed proportion of each outcome approaches $\frac{1}{6} \approx 0.1666667$.
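
The gap between the observed proportions and $\frac{1}{6}$ can also be put into numbers. The sketch below is one possible way to do it; `maxDeviation` is a hypothetical helper introduced here, and the seed is an arbitrary choice to make the run repeatable.

```r
# Assumption: arbitrary seed so the sketch is repeatable
set.seed(2024)

theoretical <- 1 / 6

# Hypothetical helper: largest absolute gap between the observed
# proportion of any face and the theoretical value 1/6
maxDeviation <- function(numToss) {
  result <- sample(1:6, numToss, replace = TRUE)
  # factor() with explicit levels keeps faces that never came up (count 0)
  observed <- prop.table(table(factor(result, levels = 1:6)))
  max(abs(observed - theoretical))
}

deviations <- sapply(c(50, 1000, 1000000), maxDeviation)
names(deviations) <- c("50", "1000", "1000000")
deviations
```

The exact numbers depend on the seed, but the largest gap generally shrinks as the number of tosses increases.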

Simple Probability

Sudarshana A
