HW3_DAT301

A confidence interval can be calculated using z-scores or t-scores. T-scores are used when the population’s standard deviation is unknown and has to be estimated from the sample, while z-scores are used when the population standard deviation is known. In both cases the data should follow an approximately normal distribution, recognizable by a symmetric “bell curve” shape that is not skewed to the right or left.
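To make the difference between the two scores concrete, the sketch below compares the critical values for a two-sided 95% interval. The sample sizes are arbitrary illustrations, not values from this presentation, and the variable names are hypothetical.

```r
# Critical values for a two-sided 95% confidence interval.
conf_level <- 0.95
alpha <- 1 - conf_level

# z critical value: applies when the population SD is known.
z_crit <- qnorm(1 - alpha / 2)

# t critical values: apply when the SD is estimated from the sample.
# The sample sizes are arbitrary and chosen only for illustration.
sample_sizes <- c(5, 15, 50, 500)
t_crit <- qt(1 - alpha / 2, df = sample_sizes - 1)

comparison <- data.frame(
  n = sample_sizes,
  t_crit = round(t_crit, 3),
  z_crit = round(z_crit, 3)
)
print(comparison)
```

The t critical value is always larger than the z critical value, so a t-based interval is wider, and the gap shrinks as the sample size grows.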

For this presentation I will use the Iris data set, which is built into R and is therefore available in RStudio without any download.

data("iris")
head(iris)

  Sepal.Length Sepal.Width Petal.Length Petal.Width Species
1          5.1         3.5          1.4         0.2  setosa
2          4.9         3.0          1.4         0.2  setosa
3          4.7         3.2          1.3         0.2  setosa
4          4.6         3.1          1.5         0.2  setosa
5          5.0         3.6          1.4         0.2  setosa
6          5.4         3.9          1.7         0.4  setosa
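Since the intervals discussed here assume roughly normal data, one possible way to check that assumption for the setosa petal lengths is sketched below. This check is an addition for illustration and is not part of the original slides; `setosa_petal_length` and `normality_check` are hypothetical names.

```r
data("iris")

# Petal lengths of the setosa species only (hypothetical variable name).
setosa_petal_length <- iris$Petal.Length[iris$Species == "setosa"]

# Shapiro-Wilk test: a small p-value suggests a departure from normality.
normality_check <- shapiro.test(setosa_petal_length)
print(normality_check)

# Q-Q plot: points close to the reference line suggest approximate normality.
qqnorm(setosa_petal_length, main = "Q-Q Plot of Setosa Petal Length")
qqline(setosa_petal_length)
```

A formal test and a visual check complement each other: the test gives a single number, while the plot shows where any deviation occurs.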

library(tidyverse)

library(ggplot2)
library(dplyr)
library(plotly)

   Petal.Length Species
1           1.4  setosa
2           1.4  setosa
3           1.3  setosa
4           1.5  setosa
5           1.4  setosa
6           1.7  setosa
7           1.4  setosa
8           1.5  setosa
9           1.4  setosa
10          1.5  setosa
11          1.5  setosa
12          1.6  setosa
13          1.4  setosa
14          1.1  setosa
15          1.2  setosa
16          1.5  setosa
17          1.3  setosa
18          1.4  setosa
19          1.7  setosa
20          1.5  setosa
21          1.7  setosa
22          1.5  setosa
23          1.0  setosa
24          1.7  setosa
25          1.9  setosa
26          1.6  setosa
27          1.6  setosa
28          1.5  setosa
29          1.4  setosa
30          1.6  setosa
31          1.6  setosa
32          1.5  setosa
33          1.5  setosa
34          1.4  setosa
35          1.5  setosa
36          1.2  setosa
37          1.3  setosa
38          1.4  setosa
39          1.3  setosa
40          1.5  setosa
41          1.3  setosa
42          1.3  setosa
43          1.3  setosa
44          1.6  setosa
45          1.9  setosa
46          1.4  setosa
47          1.6  setosa
48          1.4  setosa
49          1.5  setosa
50          1.4  setosa
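The code that produced the filtered output above is not shown, so the following is only a sketch of how such a filter, and the mean and standard error that build on it, could be written with dplyr. The names `setosaLenghts`, `meanPetalLength_Setosa`, and `se_SetosaPL` are taken from the code elsewhere in this presentation; their definitions here are assumptions, and `n_Setosa` is a hypothetical helper.

```r
library(dplyr)

data("iris")

# Keep only the setosa rows and the two columns shown in the output above.
# Assumption: the original filter is not shown; this is one way to get it.
setosaLenghts <- iris %>%
  dplyr::filter(Species == "setosa") %>%
  dplyr::select(Petal.Length, Species)

# Sample mean of the setosa petal lengths.
meanPetalLength_Setosa <- mean(setosaLenghts$Petal.Length)

# Standard error of the mean: sample SD divided by sqrt(n).
n_Setosa <- nrow(setosaLenghts)  # hypothetical helper name
se_SetosaPL <- sd(setosaLenghts$Petal.Length) / sqrt(n_Setosa)

print(meanPetalLength_Setosa)
print(se_SetosaPL)
```

The calls are written as `dplyr::filter()` and `dplyr::select()` so that the sketch does not depend on which package's `filter()` is on top of the search path.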

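## Used the z-score of 1.96, as it is the standard z-score for a 95% confidence interval
marginOfError_95CI <- 1.96 * se_SetosaPL
# Calculating the lower bound of the confidence interval
lower_bound <- meanPetalLength_Setosa - marginOfError_95CI
# Calculating the upper bound of the confidence interval (used in the plot)
upper_bound <- meanPetalLength_Setosa + marginOfError_95CI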
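The margin of error above yields an interval of the form mean ± margin. Because the population standard deviation of the iris measurements is not actually known, it may be worth comparing the z-based interval with a t-based one. The sketch below does this in base R; all variable names are hypothetical and the comparison is an illustration, not part of the original calculation.

```r
data("iris")

# Setosa petal lengths (hypothetical variable names throughout).
petal_length <- iris$Petal.Length[iris$Species == "setosa"]
n <- length(petal_length)
sample_mean <- mean(petal_length)
std_error <- sd(petal_length) / sqrt(n)

# z-based 95% interval, using the 1.96 critical value from the text.
z_interval <- sample_mean + c(-1, 1) * 1.96 * std_error

# t-based 95% interval, using n - 1 degrees of freedom.
t_interval <- sample_mean + c(-1, 1) * qt(0.975, df = n - 1) * std_error

# Cross-check: t.test() reports the same t-based interval.
builtin_interval <- as.numeric(t.test(petal_length, conf.level = 0.95)$conf.int)

print(round(z_interval, 4))
print(round(t_interval, 4))
print(round(builtin_interval, 4))
```

With 50 observations the two intervals differ only slightly, with the t-based one being a little wider.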

SetLvsW <- plot_ly(
  x = setosaLenghts$Petal.Length,
  y = setosaPetalWidth$Petal.Width,
  type = 'scatter',
  mode = 'markers'
) %>%
  layout(
    xaxis = list(title = 'Petal Length'),
    yaxis = list(title = 'Petal Width'),
    title = 'Setosa Iris Petal Lengths vs Petal Width'
  )
SetLvsW <- SetLvsW %>%
  add_segments(
    x = lower_bound, xend = lower_bound,
    y = min(setosaPetalWidth$Petal.Width), yend = max(setosaPetalWidth$Petal.Width),
    line = list(color = "blue", width = 3, dash = 'dash'),
    name = "Lower Bound 95%CI"
  )
SetLvsW <- SetLvsW %>%
  add_segments(
    x = upper_bound, xend = upper_bound,
    y = min(setosaPetalWidth$Petal.Width), yend = max(setosaPetalWidth$Petal.Width),
    line = list(color = "green", width = 3, dash = 'dash'),
    name = "Upper Bound 95%CI"
  )
SetLvsW <- SetLvsW %>%
  add_segments(
    x = meanPetalLength_Setosa, xend = meanPetalLength_Setosa,
    y = min(setosaPetalWidth$Petal.Width), yend = max(setosaPetalWidth$Petal.Width),
    line = list(color = "red", width = 3),
    name = "Sample Length Mean"
  )
SetLvsW

Slide 2: Confidence Intervals

Slide 3: Confidence Intervals with Z-Scores

Slide 4: Confidence Intervals with Iris Data Set

Plot

Slide 5: Confidence Intervals with Iris Data Set

Plot

Slide 6: Filtering the Data from Iris Data Set

Slide 7: Confidence Interval Calculation

Slide 8: Mean and Standard Error Calculation

Slide 8: Mean and Standard Error Calculation (Cont.)

Slide 9: Margin of Error Calculation

Slide 9: Margin of Error Calculation (Cont.)

Slide 10: Setosa Iris Petal Length Distribution vs its Width

Slide 10: Code for the Plotly Plot