Exercise 1: Student ID
Load file
Convert the IDs from factor to character
Add the missing ID, then order the IDs as required
## [1] "D84057058" "C44035023" "D84041162" "D84046081" "D84021057" "U36037025"
## [7] "U36041074" "U36041090" "U36051087" "U36031118" "U36041082" "U76051019"
## [13] "U76054025" "U76064062" "U76067010" "U76041064" "U76041080"
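The steps above can be sketched roughly as follows. This is only an illustration: the IDs, the "missing" ID, and the ordering rule (`prefix_order`) are made up for the example, because the required order is not spelled out here.

```r
# Hypothetical IDs, stored as a factor as they might be after loading
ids <- factor(c("U36041074", "D84057058", "C44035023"))

# Factor -> character, so new values can be appended without new levels
ids_chr <- as.character(ids)

# Append a made-up "missing" ID
ids_chr <- c(ids_chr, "D84041162")

# Assumed ordering rule: by first letter in this order, then by ID
prefix_order <- c("D", "C", "U")
key <- match(substr(ids_chr, 1, 1), prefix_order)
ids_sorted <- ids_chr[order(key, ids_chr)]
print(ids_sorted)
```

Converting to character first avoids the "invalid factor level" warning that appears when a value outside the existing levels is assigned to a factor.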
Exercise 2: Women's height and weight
Load data file
## 'data.frame': 15 obs. of 2 variables:
## $ height: num 58 59 60 61 62 63 64 65 66 67 ...
## $ weight: num 115 117 120 123 126 129 132 135 139 142 ...
Change the first woman's height to 50
## height weight
## 1 50 115
## 2 59 117
## 3 60 120
## 4 61 123
## 5 62 126
## 6 63 129
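A minimal sketch of this edit done by direct assignment, assuming the data are the same as R's built-in `women` data set (the structure shown above matches it). No interactive editor is needed for a single-cell change like this.

```r
dta <- women            # built-in data set: 15 rows, height and weight
dta$height[1] <- 50     # overwrite the first woman's height
head(dta)               # show the first six rows
```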
Plot the data

Note: I could not run the data.entry() function on my laptop (macOS), even though I have installed the XQuartz application, so I could not answer the part of the question that requires it.
Exercise 3: Google scholar citation
This is Dr. Chun-Hao Wang's Google Scholar citation history.
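# load the scholar package (installing it first if necessary)
pacman::p_load(scholar)
# citations per year for the given Google Scholar ID
dta <- get_citation_history('uLI6UcEAAAAJ')
# vertical bars, with the default x-axis suppressed
plot(dta, xlab="Year", ylab="Citations", type='h', lwd=2, xaxt="n", xlim = c(2010, 2020))
# custom x-axis with one tick per year, plus dotted horizontal guide lines
axis(side=1, at=seq(2010, 2020, by=1), cex.axis=0.7)
abline(h=seq(0, 200, by=100), lty=3, col="gray")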

Exercise 4: Body temperature, gender, and heart rate
Load file
Because I prefer column names without spaces, I renamed the variables.
## Classes 'tbl_df', 'tbl' and 'data.frame': 130 obs. of 3 variables:
## $ Body Temp : num 96.3 96.7 96.9 97 97.1 97.1 97.1 97.2 97.3 97.4 ...
## $ Gender : num 2 2 2 2 2 2 2 2 2 2 ...
## $ Heart Rate: num 70 71 74 80 73 75 82 64 69 70 ...
## Classes 'tbl_df', 'tbl' and 'data.frame': 130 obs. of 3 variables:
## $ Gender : num 2 2 2 2 2 2 2 2 2 2 ...
## $ BodyTemp : num 96.3 96.7 96.9 97 97.1 97.1 97.1 97.2 97.3 97.4 ...
## $ HeartRate: num 70 71 74 80 73 75 82 64 69 70 ...
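One way to strip spaces from column names is sketched below. The small data frame `demo` is made up for the example, and this is not necessarily how the renaming above was done.

```r
# Made-up two-row data frame with spaces in the column names
demo <- data.frame(c(96.3, 96.7), c(2, 2), c(70, 71))
names(demo) <- c("Body Temp", "Gender", "Heart Rate")

# Remove every space from every column name
names(demo) <- gsub(" ", "", names(demo), fixed = TRUE)
str(demo)
```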
Calculate Pearson's correlation between body temperature and heart rate
## [1] 0.2536564
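As a sketch of what this number measures, the block below computes Pearson's r with `cor()` and again from its definition. The five observations are made up, so the value will differ from the one above.

```r
# Made-up observations (not the exercise data)
demo <- data.frame(BodyTemp  = c(96.3, 97.1, 98.0, 98.6, 99.2),
                   HeartRate = c(70, 73, 71, 78, 80))

# Built-in Pearson correlation (the default method of cor())
r_builtin <- with(demo, cor(BodyTemp, HeartRate))

# Same quantity from the definition: covariation over the product of spreads
dx <- demo$BodyTemp  - mean(demo$BodyTemp)
dy <- demo$HeartRate - mean(demo$HeartRate)
r_manual <- sum(dx * dy) / sqrt(sum(dx^2) * sum(dy^2))

print(c(r_builtin, r_manual))
```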
Examine the gender effect on body temperature
##
## Welch Two Sample t-test
##
## data: BodyTemp by Gender
## t = 2.2854, df = 127.51, p-value = 0.02394
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
## 0.03881298 0.53964856
## sample estimates:
## mean in group 1 mean in group 2
## 98.39385 98.10462
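A call of the following form produces output in this layout. The sketch uses made-up values and assumes the grouping variable is coded 1 and 2, as in the data above. By default `t.test()` does not assume equal variances, which is why the result is labelled a Welch test.

```r
# Made-up data with two groups coded 1 and 2 (not the exercise data)
demo <- data.frame(
  BodyTemp = c(98.4, 98.6, 98.2, 98.8, 98.0, 97.9, 98.3, 98.1),
  Gender   = c(1, 1, 1, 1, 2, 2, 2, 2)
)

# Formula interface: response ~ two-level grouping variable
fit <- t.test(BodyTemp ~ Gender, data = demo)
print(fit)
```

Passing `var.equal = TRUE` would give the classical pooled-variance test instead.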
Exercise 5: AAUP2 data
Load data file
## V1
## 1 1061 Alaska Pacific University AK IIB 454 382 362 382 567 485 471 487 6 11 9 4 32
## 2 1063 Univ.Alaska-Fairbanks AK I 686 560 432 508 914 753 572 677 74 125 118 40 404
## 3 1065 Univ.Alaska-Southeast AK IIA 533 494 329 415 716 663 442 559 9 26 20 9 70
## 4 11462 Univ.Alaska-Anchorage AK IIA 612 507 414 498 825 681 557 670 115 124 101 21 392
## 5 1002 Alabama Agri.&Mech. Univ. AL IIA 442 369 310 350 530 444 376 423 59 77 102 24 262
## 6 1004 University of Montevallo AL IIA 441 385 310 388 542 473 383 477 57 33 35 2 127
## $begin
## [1] 0 6 40 45 49 53 57 61 66 70 74 79 83 87 92 95
##
## $end
## [1] 5 39 43 48 52 56 60 65 69 73 78 82 86 90 94 NA
dta <- readr::read_fwf("/Users/haolunfu/Documents/資料管理/week4/aaup2.dat.txt",
readr::fwf_cols(V1=5, V2=32, V3=2, V4=4, V5=5, V6=4, V7=4, V8=4,
V9=5, V10=4, V11=4, V12=5, V13=4, V14=4, V15=3))
## Parsed with column specification:
## cols(
## V1 = col_double(),
## V2 = col_character(),
## V3 = col_character(),
## V4 = col_character(),
## V5 = col_character(),
## V6 = col_character(),
## V7 = col_character(),
## V8 = col_double(),
## V9 = col_character(),
## V10 = col_character(),
## V11 = col_character(),
## V12 = col_double(),
## V13 = col_double(),
## V14 = col_double(),
## V15 = col_double()
## )
## # A tibble: 6 x 15
## V1 V2 V3 V4 V5 V6 V7 V8 V9 V10 V11 V12 V13
## <dbl> <chr> <chr> <chr> <chr> <chr> <chr> <dbl> <chr> <chr> <chr> <dbl> <dbl>
## 1 1061 Alask… AK IIB 454 382 362 382 567 485 471 487 6
## 2 1063 Univ.… AK I 686 560 432 508 914 753 572 677 74
## 3 1065 Univ.… AK IIA 533 494 329 415 716 663 442 559 9
## 4 11462 Univ.… AK IIA 612 507 414 498 825 681 557 670 115
## 5 1002 Alaba… AL IIA 442 369 310 350 530 444 376 423 59
## 6 1004 Unive… AL IIA 441 385 310 388 542 473 383 477 57
## # … with 2 more variables: V14 <dbl>, V15 <dbl>
I tried two methods to load the data set, but neither loads it correctly. The first reads each record into a single column (V1), and the second misaligns the fixed-width fields, so several numeric columns are parsed as character.
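For fixed-width input in general, base R's `read.fwf()` is another option. The sketch below runs on two made-up lines with a made-up layout (5, 20, 2, and 3 characters, separated by single spaces), not on the real aaup2 file, whose exact column widths would have to be taken from its documentation.

```r
# Build two fixed-width lines with a known, made-up layout
raw <- c(
  sprintf("%5d %-20s %2s %3d", 1061L, "Alaska Pacific Univ.", "AK", 454L),
  sprintf("%5d %-20s %2s %3d", 1004L, "Univ. of Montevallo",  "AL", 441L)
)
tmp <- tempfile(fileext = ".txt")
writeLines(raw, tmp)

# Positive widths are fields to keep; negative widths are characters to skip
demo_fwf <- read.fwf(tmp,
                     widths = c(5, -1, 20, -1, 2, -1, 3),
                     col.names = c("id", "name", "state", "salary"),
                     strip.white = TRUE,        # trim padding around fields
                     stringsAsFactors = FALSE)  # keep text as character
str(demo_fwf)
```

Because the name field contains spaces, splitting on whitespace would break it apart; giving explicit widths keeps each name in one column.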