ANTH 504 HW #1

library(dplyr)

## 
## Attaching package: 'dplyr'

## The following objects are masked from 'package:stats':
## 
##     filter, lag

## The following objects are masked from 'package:base':
## 
##     intersect, setdiff, setequal, union

library(stats)

The following dataset represents the amount of time (in seconds) that it takes 21 heavy smokers to fall off a treadmill at the fastest setting:

smoker_fall_time_s <- c(18, 16, 18, 24, 23, 22, 22, 23, 26, 29, 32,
34, 34, 36, 36, 43, 42, 49, 46, 46, 57)
smoker_fall_time_s

##  [1] 18 16 18 24 23 22 22 23 26 29 32 34 34 36 36 43 42 49 46 46 57

Calculate

a) The mean

#a)
mean(smoker_fall_time_s)

## [1] 32.19048

b) The median

#b)
median(smoker_fall_time_s)

## [1] 32

c) The range

#c)
range(smoker_fall_time_s)

## [1] 16 57

57-16

## [1] 41

The range is 41.
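
The subtraction above can also be done in one step. This is a minimal sketch on the same vector, where `range_width` is just an illustrative name: `range()` returns the minimum and maximum, and `diff()` subtracts the first from the second.

```r
smoker_fall_time_s <- c(18, 16, 18, 24, 23, 22, 22, 23, 26, 29, 32,
                        34, 34, 36, 36, 43, 42, 49, 46, 46, 57)

# range() returns c(min, max); diff() gives max - min
range_width <- diff(range(smoker_fall_time_s))
range_width
```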

d) Standard deviation

#d)
sd(smoker_fall_time_s)

## [1] 11.58714

e) Variance

#e)
var(smoker_fall_time_s)

## [1] 134.2619
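
To show what `var()` and `sd()` are doing, here is a sketch that computes the sample variance and standard deviation by hand. It assumes the usual sample formulas with n - 1 in the denominator, which is what R's built-in functions use. The names `var_manual` and `sd_manual` are only for illustration.

```r
x <- c(18, 16, 18, 24, 23, 22, 22, 23, 26, 29, 32,
       34, 34, 36, 36, 43, 42, 49, 46, 46, 57)

n <- length(x)

# Squared deviation of each observation from the sample mean
squared_dev <- (x - mean(x))^2

# Sample variance: sum of squared deviations divided by n - 1
var_manual <- sum(squared_dev) / (n - 1)

# Standard deviation is the square root of the variance
sd_manual <- sqrt(var_manual)

c(variance = var_manual, sd = sd_manual)
```

Both values should agree with the `var()` and `sd()` results above.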

why not)

#why not
summary(smoker_fall_time_s)

##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##   16.00   23.00   32.00   32.19   42.00   57.00

A frequency distribution in which there is a cluster of frequent scores at the low (more negative) end would be described as what?

Positively skewed

A positively skewed (right-skewed) distribution has a cluster of scores on the low end and a long tail stretching toward the high end.
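
As an illustration, the direction of skew can be checked numerically. The sketch below applies a moment-based skewness coefficient, which is one of several common definitions, to the treadmill data from the first question. The function name `skewness_moment` is hypothetical and not part of base R. A positive value means the long tail is on the high end.

```r
x <- c(18, 16, 18, 24, 23, 22, 22, 23, 26, 29, 32,
       34, 34, 36, 36, 43, 42, 49, 46, 46, 57)

# Moment-based skewness: third central moment divided by the
# second central moment raised to the power 3/2
skewness_moment <- function(x) {
  dev <- x - mean(x)
  mean(dev^3) / mean(dev^2)^1.5
}

skew_value <- skewness_moment(x)

# TRUE indicates positive (right) skew
skew_value > 0
```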

If the scores on a test have a mean of 26 with a standard deviation of 4,

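# Note: these names shadow the mean() and sd() functions. Calls such as
# mean(x) still work because R looks for a function when a name is called.
mean <- 26
sd <- 4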

a) what is the z-score of a score of 18?

First I will draw it.

The z-score should be in the left tail, a negative value.

xi_a <- 18
z_score_fun <- function(xi) {
  (xi - mean) / sd
}
z_score_a <- z_score_fun(xi_a)
z_score_a

## [1] -2

The z-score is -2.
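
A variant of the same calculation that takes the mean and standard deviation as arguments instead of reading them from the workspace. This is only a sketch, and the names `z_score`, `mu` and `sigma` are assumptions for illustration. Because R arithmetic is vectorized, several scores can be converted at once.

```r
# z = (x - mu) / sigma, with mu and sigma passed in explicitly
z_score <- function(x, mu, sigma) {
  (x - mu) / sigma
}

# Scores of 18, 26 and 28 on a test with mean 26 and sd 4
z_values <- z_score(c(18, 26, 28), mu = 26, sigma = 4)
z_values
```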

b) what is the probability that a student scores less than 18 on an exam?

below_18 <- pnorm(z_score_a)
below_18

## [1] 0.02275013

pnorm(xi_a, mean, sd) * 100

## [1] 2.275013

There is a 2.28% chance that a student scored less than 18 on an exam.
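
As a rough sanity check on the `pnorm()` result, the same probability can be approximated by simulation. This sketch assumes the test scores are normally distributed with mean 26 and standard deviation 4. The seed and the number of draws are arbitrary choices.

```r
set.seed(504)  # arbitrary seed so the result is reproducible

# Draw 100,000 simulated test scores from Normal(26, 4)
sim_scores <- rnorm(100000, mean = 26, sd = 4)

# Proportion of simulated scores below 18
sim_below_18 <- mean(sim_scores < 18)
sim_below_18
```

The simulated proportion should land close to the exact value from `pnorm(-2)`, with small differences due to sampling noise.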

c) what is the probability that a student scores between 18 and 28 points on the test?

First I will calculate the z-score for 28. Because 28 is larger than the mean of 26, the z-score should be positive and more than 50% of the data should fall to the left of it.

xi_b <- 28
z_score_b <- z_score_fun(xi_b)
z_score_b

## [1] 0.5

below_28 <- pnorm(z_score_b)
below_28

## [1] 0.6914625

The z-score for 28 is 0.5 and 69.1% of the data is below 28.

Next I need the amount of data below 18, which I already calculated in part b).

below_18

## [1] 0.02275013

2.28% of the data is below 18.

At last, I am ready to determine the percentage of scores between 18 and 28: the area below 28 minus the area below 18.

between_18_28 <- below_28 - below_18
between_18_28 * 100

## [1] 66.87123

There is a 66.9% chance that a student scores between 18 and 28 on the test.
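
As a compact cross-check, the probability of falling between two values is the cumulative probability at the upper value minus the cumulative probability at the lower value. The sketch below works directly on the raw score scale, assuming a normal distribution with mean 26 and standard deviation 4. `p_between` is an illustrative name.

```r
# pnorm() returns P(X < 18) and P(X < 28); diff() subtracts the
# first from the second, giving P(18 < X < 28)
p_between <- diff(pnorm(c(18, 28), mean = 26, sd = 4))
p_between  # about 0.669
```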

Using the murders dataframe (from the dslabs package), list the variables in this dataset and describe each variable as a nominal, ordinal, interval or ratio variable.

library(dslabs)
data(murders)
head(murders)

##        state abb region population total
## 1    Alabama  AL  South    4779736   135
## 2     Alaska  AK   West     710231    19
## 3    Arizona  AZ   West    6392017   232
## 4   Arkansas  AR  South    2915918    93
## 5 California  CA   West   37253956  1257
## 6   Colorado  CO   West    5029196    65
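
One way to line the variables up against a measurement level is sketched below. To stay self-contained it rebuilds only the six rows printed above rather than loading the full dataset, and the column types are assumptions that may not match dslabs exactly. The `measurement_level` assignments are one reading, not output from any package: `state`, `abb` and `region` are labels with no inherent order (nominal), while `population` and `total` are counts with a true zero (ratio).

```r
# Six rows copied from the head(murders) output above
murders_head <- data.frame(
  state = c("Alabama", "Alaska", "Arizona", "Arkansas",
            "California", "Colorado"),
  abb = c("AL", "AK", "AZ", "AR", "CA", "CO"),
  region = factor(c("South", "West", "West", "South", "West", "West")),
  population = c(4779736, 710231, 6392017, 2915918, 37253956, 5029196),
  total = c(135, 19, 232, 93, 1257, 65),
  stringsAsFactors = FALSE
)

# Storage class of each column, as R sees it
variable_class <- sapply(murders_head, class)

# Proposed level of measurement for each variable (a judgment call)
measurement_level <- c(state = "nominal", abb = "nominal",
                       region = "nominal", population = "ratio",
                       total = "ratio")

data.frame(variable = names(murders_head),
           class = unname(variable_class),
           level = unname(measurement_level[names(murders_head)]))
```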

Lorraine Gaudio

2023-01-18