In-class exercise 1.

The following student ID file is missing the ID U76067010, which belongs at the third-to-last position. Find a way to fix it. Download and display the data contents in R.

[Solution and Answer]
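The ID file itself is not reproduced here, so the following is only a minimal sketch of one possible fix. It assumes the IDs have already been read into a character vector; the five placeholder IDs are invented purely for illustration and are not the real class list.

```r
# Placeholder IDs, invented for illustration; the real ones come from the file.
ids <- c("ID_A", "ID_B", "ID_C", "ID_D", "ID_E")

# append() inserts the new value right after position `after`.
# With n original IDs, inserting after position n - 2 puts the new ID
# third from the end of the resulting vector of length n + 1.
fixed <- append(ids, "U76067010", after = length(ids) - 2)
fixed
```

The same call works for any vector length, because the insertion point is computed from length(ids) rather than hard-coded.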


In-class exercise 2.

A classmate of yours used data.entry() to change the first woman’s height to 50 in the women{datasets} data set. She then closed the editor and issued plot(women). To her surprise, she got this message:

Error in xy.coords(x, y, xlabel, ylabel, log) :
‘x’ is a list, but does not have components ‘x’ and ‘y’

[Solution and Answer]

[1] "list"

Calling data.entry(women) turns the data frame women into a plain list, so plot(women) can no longer tell which components of women to use as x and y. Hence, we should use edit(women) instead of data.entry(women), assigning the result back because edit() returns the modified copy. We can also specify x and y explicitly when calling plot(), for example plot(women$height, women$weight).
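Because data.entry() needs an interactive session, the sketch below only imitates its effect: as.list() is used as a stand-in (an assumption, chosen to reproduce the "list" class reported above). It then shows two ways to get a working plot.

```r
# Stand-in for the state after data.entry(): a plain list, not a data frame.
# (Assumption: as.list() mimics the class change reported above.)
women_list <- as.list(datasets::women)
women_list$height[1] <- 50   # the edit made by the classmate
class(women_list)

# Option 1: name x and y explicitly, so plot() does not have to guess.
plot(women_list$height, women_list$weight,
     xlab = "height", ylab = "weight")

# Option 2: convert the list back to a data frame before plotting.
women_df <- as.data.frame(women_list)
plot(women_df)
```

Option 2 has the advantage that every later command expecting a data frame works again, not just plot().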


In-class exercise 4.

Data on body temperature, gender, and heart rate are taken from Mackowiak et al. (1992), “A Critical Appraisal of 98.6 Degrees F …,” Journal of the American Medical Association, 268, 1578-80. Import the file. Find the correlation between body temperature and heart rate, and investigate whether there is a gender difference in mean temperature.

[Solution and Answer]

Classes 'tbl_df', 'tbl' and 'data.frame':   130 obs. of  3 variables:
 $ Temp : num  96.3 96.7 96.9 97 97.1 97.1 97.1 97.2 97.3 97.4 ...
 $ Sex  : num  1 1 1 1 1 1 1 1 1 1 ...
 $ Beats: num  70 71 74 80 73 75 82 64 69 70 ...

[1] 0.2536564

There is a weak positive correlation between body temperature and heart rate (r ≈ 0.25).
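A minimal sketch of how such a correlation can be computed. The full data file is not reproduced here, so the data frame below holds only the first ten rows visible in the str() output; its result will therefore not match the 0.2536564 reported for all 130 observations. The object names temps, r, and ct are illustrative.

```r
# First ten rows only, copied from the str() output above.
temps <- data.frame(
  Temp  = c(96.3, 96.7, 96.9, 97.0, 97.1, 97.1, 97.1, 97.2, 97.3, 97.4),
  Beats = c(70, 71, 74, 80, 73, 75, 82, 64, 69, 70)
)

# Pearson correlation (the default method of cor()).
r <- cor(temps$Temp, temps$Beats)
r

# cor.test() additionally reports a p-value and a confidence interval.
ct <- cor.test(temps$Temp, temps$Beats)
ct$p.value
```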

To investigate a gender difference in mean temperature, we conduct an independent two-sample t-test (Welch's version, which does not assume equal variances).


    Welch Two Sample t-test

data:  Temp by Sex
t = -2.2854, df = 127.51, p-value = 0.02394
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
 -0.53964856 -0.03881298
sample estimates:
mean in group 1 mean in group 2 
       98.10462        98.39385 

Since \(p = .024 < \alpha = .05\), we reject the null hypothesis. There is a significant gender difference in mean body temperature, with group 2 about 0.29 degrees higher than group 1.
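The test above can be sketched as follows. Since the data file is not included here, the example runs on simulated data, not the Mackowiak measurements. The group means are taken from the output above; the standard deviation of 0.7 and the split of 65 observations per group are assumptions made only so the code runs.

```r
set.seed(1)  # simulated data, NOT the real measurements

sim <- data.frame(
  Temp = c(rnorm(65, mean = 98.10, sd = 0.7),   # group 1 (assumed n and sd)
           rnorm(65, mean = 98.39, sd = 0.7)),  # group 2 (assumed n and sd)
  Sex  = factor(rep(c(1, 2), each = 65))
)

# t.test() uses var.equal = FALSE by default, which gives the Welch test.
res <- t.test(Temp ~ Sex, data = sim)
res$method
res$p.value
res$conf.int
```

Passing var.equal = TRUE would give the classical pooled-variance test instead; the Welch default is the more cautious choice when the group variances may differ.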


In-class exercise 5.

The AAUP2 data set is a comma-delimited fixed column format text file with '*' for missing values. Import the file into R and indicate missing values by 'NA'. Hint: ?read.csv

[Solution and Answer]

Parsed with column specification:
cols(
  X1 = col_double(),
  X2 = col_character(),
  X3 = col_character(),
  X4 = col_character(),
  X5 = col_character(),
  X6 = col_double(),
  X7 = col_double(),
  X8 = col_character(),
  X9 = col_character(),
  X10 = col_double(),
  X11 = col_character(),
  X12 = col_character(),
  X13 = col_double(),
  X14 = col_double(),
  X15 = col_double(),
  X16 = col_double(),
  X17 = col_double()
)

AAUP2 is not actually comma-delimited. Its fields are not separated by a regular delimiter, so we should use read_fwf{readr} to load it correctly. The argument na = '*' makes the missing values display as NA.
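A minimal sketch of this approach, run on a small invented file rather than AAUP2 itself. The two records, the column widths, and the column names (id, name, state, salary) are all hypothetical; the real file has 17 columns with its own layout.

```r
library(readr)

# Build a tiny fixed-width file; sprintf() pads every field to a fixed width.
# These records are invented for illustration, they are not AAUP2 rows.
toy_lines <- sprintf("%4d%-16s%3s%5s",
                     1:2,
                     c("Alpha College", "Beta University"),
                     c("AA", "BB"),
                     c("454", "*"))      # '*' marks a missing value
toy_file <- tempfile(fileext = ".txt")
writeLines(toy_lines, toy_file)

# Hypothetical widths and names matching the toy layout above.
aaup_toy <- read_fwf(
  toy_file,
  col_positions = fwf_widths(c(4, 16, 3, 5),
                             col_names = c("id", "name", "state", "salary")),
  na = "*"                               # treat '*' as NA
)
aaup_toy
```

When the widths are not known in advance, fwf_empty() can guess the column positions from the blank columns in the file, which is also read_fwf()'s default.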