library(dplyr)
##
## Attaching package: 'dplyr'
## The following objects are masked from 'package:stats':
##
## filter, lag
## The following objects are masked from 'package:base':
##
## intersect, setdiff, setequal, union
library(tidyverse)
## ── Attaching packages ─────────────────────────────────────── tidyverse 1.3.1 ──
## ✓ ggplot2 3.3.5 ✓ purrr 0.3.4
## ✓ tibble 3.1.6 ✓ stringr 1.4.0
## ✓ tidyr 1.1.3 ✓ forcats 0.5.1
## ✓ readr 1.4.0
## Warning: package 'tibble' was built under R version 4.1.1
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## x dplyr::filter() masks stats::filter()
## x dplyr::lag() masks stats::lag()
1. The carbon monoxide in cigarettes is thought to be hazardous to the fetus of a pregnant woman who smokes. In a study of this hypothesis, blood was drawn from pregnant women before and after smoking a cigarette. Measurements were made of the percent increase of blood hemoglobin bound to carbon monoxide (COHb). The results for 26 women are:
a <- c(6.4, 2.6, 3.5, 2.9, 3.9, 2.2, 5.5, 4.4, 3.5, 3.2, 2.8, 2.4, 3.5, 3.3, 3.7, 2.6, 3.5, 4.5, 4.2, 2.9, 3.1, 3.3, 4.3, 2.6, 4.1, 3.7)
NROW(a)
## [1] 26
summary(a)
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 2.200 2.900 3.500 3.562 4.050 6.400
sd(a)
## [1] 0.9533423
4.050-2.900
## [1] 1.15
a. Find the mean, median, sample standard deviation, and IQR.
Mean \(\overline{y}\) = 3.562
Median = 3.5
Sample N = 26
Standard Deviation s = 0.9533423
IQR = 1.15
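As a cross-check, the same quantities can be computed directly with base R functions instead of being read off the `summary()` output. This is only a sketch: it re-enters the data vector `a` so that it runs on its own, and the names `y_bar`, `y_med`, `s`, and `iqr` are illustrative. `IQR()` uses R's default quantile method (type 7), the same one behind the quartiles that `summary()` reports.

```r
# COHb percent-increase measurements for the 26 women (same values as above)
a <- c(6.4, 2.6, 3.5, 2.9, 3.9, 2.2, 5.5, 4.4, 3.5, 3.2, 2.8, 2.4, 3.5,
       3.3, 3.7, 2.6, 3.5, 4.5, 4.2, 2.9, 3.1, 3.3, 4.3, 2.6, 4.1, 3.7)

n     <- length(a)  # sample size
y_bar <- mean(a)    # sample mean
y_med <- median(a)  # sample median
s     <- sd(a)      # sample standard deviation (n - 1 denominator)
iqr   <- IQR(a)     # third quartile minus first quartile

# Collect everything in one named vector, rounded to three decimals
round(c(n = n, mean = y_bar, median = y_med, sd = s, IQR = iqr), 3)
```

The mean rounds to 3.562 and the IQR comes out to 1.15, in agreement with the `summary()` output above.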
b. Create a boxplot of these observations.
boxplot(a)
c. Create a histogram of these observations.
hist(a)
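The default plots carry no descriptive title or axis label. A minimal sketch of labeled versions follows; the title and axis wording are illustrative assumptions, not part of the assignment.

```r
# COHb percent-increase measurements for the 26 women (same values as above)
a <- c(6.4, 2.6, 3.5, 2.9, 3.9, 2.2, 5.5, 4.4, 3.5, 3.2, 2.8, 2.4, 3.5,
       3.3, 3.7, 2.6, 3.5, 4.5, 4.2, 2.9, 3.1, 3.3, 4.3, 2.6, 4.1, 3.7)

# Boxplot with a title and a labeled measurement axis
boxplot(a,
        main = "COHb increase after smoking",
        ylab = "Percent increase in COHb")

# Histogram with the same labels; hist() returns the bin counts invisibly
h <- hist(a,
          main = "COHb increase after smoking",
          xlab = "Percent increase in COHb")

# Every observation falls in exactly one bin, so this equals length(a)
sum(h$counts)
```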
2. A plant physiologist investigated the effect of light on the growth of soybean plants. 13 different types of soybean seedlings were randomly allocated to two treatments: low light and moderate light. After 16 days of growth, plants were harvested, and the total leaf area of each plant was measured.
|   | Low Light | Moderate Light |
|---|---|---|
| 1 | 264 | 314 |
| 2 | 200 | 320 |
| 3 | 225 | 310 |
| 4 | 268 | 340 |
| 5 | 215 | 299 |
| 6 | 241 | 268 |
| 7 | 232 | 345 |
| 8 | 256 | 271 |
| 9 | 229 | 285 |
| 10 | 288 | 309 |
| 11 | 253 | 337 |
| 12 | 288 | 282 |
| 13 | 230 | 273 |
In the space below, create a scatterplot of the data. Try to include axes labels on each of the axes. If you can, overlay a regression line on your scatterplot.
low <- c(264, 200, 225, 268, 215, 241, 232, 256, 229, 288, 253, 288, 230)
moderate <- c(314, 320, 310, 340, 299, 268, 345, 271, 285, 309, 337, 282, 273)
soybean <- data.frame(low, moderate)
plot(low, moderate, main="Soybean Plant 16-Day Growth\nLow & Moderate Light", xlab="Leaf area, low light", ylab="Leaf area, moderate light")
abline(lm(moderate ~ low))
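The line drawn by `abline()` comes from a least-squares fit, and it can help to inspect that fit directly. The sketch below refits the same model on the `soybean` data frame; the object name `fit` is illustrative.

```r
# Total leaf area for the 13 seedling types under each treatment
low      <- c(264, 200, 225, 268, 215, 241, 232, 256, 229, 288, 253, 288, 230)
moderate <- c(314, 320, 310, 340, 299, 268, 345, 271, 285, 309, 337, 282, 273)
soybean  <- data.frame(low, moderate)

# Least-squares fit of moderate-light leaf area on low-light leaf area
fit <- lm(moderate ~ low, data = soybean)

coef(fit)                           # intercept and slope of the plotted line
cor(soybean$low, soybean$moderate)  # strength of the linear association
```

Passing `fit` to `abline()` draws the same line as `abline(lm(moderate ~ low))`.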
The following histogram shows the same data that are shown in one of the four boxplots. Which boxplot (a, b, c, or d) goes with the histogram? Explain your answer.
The answer is C. The histogram spans a range of approximately 25 to 65, and about 95% of the observations fall roughly between 30 and 55, an interval of length 25 (55 - 30). By the rule of thumb that about 95% of observations lie within ±2s of the mean, this interval is about 4s wide, so s ≈ 25/4 = 6.25. It follows that ±3s, or ±18.75, around the mean covers more than 99% of the observations. Boxplot C, with its larger IQR, is the one that best fits this description.
(55-30)/4
## [1] 6.25
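The arithmetic in this answer can be laid out step by step. This sketch assumes the endpoints 30 and 55 read off the histogram; the variable names are illustrative.

```r
# Approximate endpoints of the interval holding about 95% of the observations
lower <- 30
upper <- 55

# About 95% of observations lie within 2 standard deviations of the mean,
# so the interval is roughly 4s wide
s_est <- (upper - lower) / 4
s_est            # 6.25

# Half-width of the interval expected to hold more than 99% of observations
three_s <- 3 * s_est
three_s          # 18.75
```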