Exercise 1: Student ID

Load file

Convert the ID column from factor to character

Add the missing ID, then reorder the rows as required

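# Read the student list; the first line of the file holds the column name
dta <- read.table("/Users/haolunfu/Documents/資料管理/week4/student2017.txt", header = TRUE)
dtac <- dta
# Store the IDs as character so that a new value can be assigned
dtac$ID <- as.character(dtac$ID)
# Append the missing ID as row 17
dtac[17, 1] <- "U76067010"
# Move the new ID to position 15 and keep only the ID column
dtac <- dtac[c(1:14, 17, 15:16), 1]
dtac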

##  [1] "D84057058" "C44035023" "D84041162" "D84046081" "D84021057" "U36037025"
##  [7] "U36041074" "U36041090" "U36051087" "U36031118" "U36041082" "U76051019"
## [13] "U76054025" "U76064062" "U76067010" "U76041064" "U76041080"
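As an alternative to appending the ID and then reindexing, base R's `append()` can insert a value directly at a given position. The following is a minimal sketch on a short vector that reuses three IDs from the output above; the object names `ids`, `ids_chr`, and `ids_new` are only illustrative.

```r
# Three of the IDs shown above, stored as a factor to mimic the loaded column
ids <- factor(c("U76064062", "U76041064", "U76041080"))

# Convert to character before inserting a value that is not an existing level
ids_chr <- as.character(ids)

# Insert the missing ID right after the first element
ids_new <- append(ids_chr, "U76067010", after = 1)
ids_new
```

With the full data, `after = 14` would place the new ID in position 15, which is where the indexing approach above puts it.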

Exercise 2: Women's height and weight

Load data file

dta <- datasets::women
str(dta)

## 'data.frame':    15 obs. of  2 variables:
##  $ height: num  58 59 60 61 62 63 64 65 66 67 ...
##  $ weight: num  115 117 120 123 126 129 132 135 139 142 ...

Change the first woman's height to 50

dta[1,1] <- 50
head(dta)

##   height weight
## 1     50    115
## 2     59    117
## 3     60    120
## 4     61    123
## 5     62    126
## 6     63    129

Plot the data

plot(dta)
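Because the edit is made on the copy `dta`, the built-in `women` data set keeps its original values; plotting `women` therefore shows the unmodified data, while plotting `dta` shows the changed height. A small sketch of this, addressing the column by name rather than by position:

```r
# Work on a copy of the built-in data set
dta <- datasets::women

# Same edit as above, but selecting the column by name
dta[1, "height"] <- 50

# The built-in data set still holds the original first height
datasets::women$height[1]

# The copy holds the edited value
dta$height[1]
```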

Note: I could not run the data.entry() function on my laptop (macOS), even though I have installed the XQuartz application, so I could not answer this part of the question.

Exercise 3: Google scholar citation

This is Dr. Chun-Hao Wang's Google Scholar citation history.

pacman::p_load(scholar)
dta <- get_citation_history('uLI6UcEAAAAJ')
plot(dta, xlab="Year", ylab="Citations", type='h', lwd=2, xaxt="n", xlim = c(2010, 2020))
axis(side=1, at=seq(2010, 2020, by=1), cex.axis=0.7)
abline(h=seq(0, 200, by=100), lty=3, col="gray")

Exercise 4: Body temperature, gender, and heart rate

Load file

Because I prefer column names without spaces, I rename the variables.

pacman::p_load(readxl, httr)
dta <- read_excel("/Users/haolunfu/Documents/資料管理/week4/NORMTEMP.xls")
str(dta)

## Classes 'tbl_df', 'tbl' and 'data.frame':    130 obs. of  3 variables:
##  $ Body Temp : num  96.3 96.7 96.9 97 97.1 97.1 97.1 97.2 97.3 97.4 ...
##  $ Gender    : num  2 2 2 2 2 2 2 2 2 2 ...
##  $ Heart Rate: num  70 71 74 80 73 75 82 64 69 70 ...

dta$BodyTemp <- dta$`Body Temp`
dta$HeartRate <- dta$`Heart Rate`
dta <- dta[,c(-1,-3)]
str(dta)

## Classes 'tbl_df', 'tbl' and 'data.frame':    130 obs. of  3 variables:
##  $ Gender   : num  2 2 2 2 2 2 2 2 2 2 ...
##  $ BodyTemp : num  96.3 96.7 96.9 97 97.1 97.1 97.1 97.2 97.3 97.4 ...
##  $ HeartRate: num  70 71 74 80 73 75 82 64 69 70 ...
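The same renaming can also be done in one step by editing `names()` directly, which keeps the original column order. This is a minimal sketch on a three-row data frame built from the first values shown in the `str()` output above; it is not the full NORMTEMP file.

```r
# Small stand-in for the loaded data, using the first three values shown above
dta <- data.frame(`Body Temp` = c(96.3, 96.7, 96.9),
                  Gender = c(2, 2, 2),
                  `Heart Rate` = c(70, 71, 74),
                  check.names = FALSE)  # keep the spaces in the names

# Remove every space from the column names
names(dta) <- gsub(" ", "", names(dta), fixed = TRUE)
names(dta)
```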

Calculate Pearson's correlation between body temperature and heart rate

cor(x= dta$BodyTemp, y= dta$HeartRate, method = "pearson")

## [1] 0.2536564
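For reference, Pearson's correlation is the covariance of the two variables divided by the product of their standard deviations. The sketch below checks this against `cor()`; it uses the built-in `women` data set so that it runs without the NORMTEMP file, and the names `r_manual` and `r_builtin` are only illustrative.

```r
x <- datasets::women$height
y <- datasets::women$weight

# Pearson's r from its definition: cov(x, y) / (sd(x) * sd(y))
r_manual <- cov(x, y) / (sd(x) * sd(y))

# The same quantity from the built-in function
r_builtin <- cor(x, y, method = "pearson")

all.equal(r_manual, r_builtin)
```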

Examine the effect of gender on body temperature

t.test(BodyTemp ~ Gender, data= dta)

## 
##  Welch Two Sample t-test
## 
## data:  BodyTemp by Gender
## t = 2.2854, df = 127.51, p-value = 0.02394
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  0.03881298 0.53964856
## sample estimates:
## mean in group 1 mean in group 2 
##        98.39385        98.10462
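The object returned by `t.test()` is a list, so individual results can be extracted instead of being read off the printed output. The following sketch uses the built-in `sleep` data set as a stand-in so that it runs without the NORMTEMP file; the same components are available for the test above.

```r
# Welch two-sample t-test on a built-in data set (stand-in for the data above)
res <- t.test(extra ~ group, data = datasets::sleep)

res$method      # name of the test that was run
res$statistic   # t statistic
res$parameter   # degrees of freedom
res$p.value     # p-value
res$conf.int    # confidence interval for the difference in means
res$estimate    # the two group means
```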

Exercise 5: AAUP2 data

Load data file

dta <- read.table("/Users/haolunfu/Documents/資料管理/week4/aaup2.dat.txt", header = FALSE, sep = ",", na.strings = "*", fill = TRUE)
head(dta)

##                                                                                                    V1
## 1  1061 Alaska Pacific University      AK IIB  454 382 362 382  567 485 471  487   6  11   9   4   32
## 2  1063 Univ.Alaska-Fairbanks          AK I    686 560 432 508  914 753 572  677  74 125 118  40  404
## 3  1065 Univ.Alaska-Southeast          AK IIA  533 494 329 415  716 663 442  559   9  26  20   9   70
## 4 11462 Univ.Alaska-Anchorage          AK IIA  612 507 414 498  825 681 557  670 115 124 101  21  392
## 5  1002 Alabama Agri.&Mech. Univ.      AL IIA  442 369 310 350  530 444 376  423  59  77 102  24  262
## 6  1004 University of Montevallo       AL IIA  441 385 310 388  542 473 383  477  57  33  35   2  127

readr::fwf_empty("/Users/haolunfu/Documents/資料管理/week4/aaup2.dat.txt")[1:2]

## $begin
##  [1]  0  6 40 45 49 53 57 61 66 70 74 79 83 87 92 95
## 
## $end
##  [1]  5 39 43 48 52 56 60 65 69 73 78 82 86 90 94 NA

dta <- readr::read_fwf("/Users/haolunfu/Documents/資料管理/week4/aaup2.dat.txt",
                       readr::fwf_cols(V1=5, V2=32, V3=2, V4=4, V5=5, V6=4, V7=4, V8=4,
                                       V9=5, V10=4, V11=4, V12=5, V13=4, V14=4, V15=3))

## Parsed with column specification:
## cols(
##   V1 = col_double(),
##   V2 = col_character(),
##   V3 = col_character(),
##   V4 = col_character(),
##   V5 = col_character(),
##   V6 = col_character(),
##   V7 = col_character(),
##   V8 = col_double(),
##   V9 = col_character(),
##   V10 = col_character(),
##   V11 = col_character(),
##   V12 = col_double(),
##   V13 = col_double(),
##   V14 = col_double(),
##   V15 = col_double()
## )

head(dta)

## # A tibble: 6 x 15
##      V1 V2     V3    V4    V5    V6    V7       V8 V9    V10   V11     V12   V13
##   <dbl> <chr>  <chr> <chr> <chr> <chr> <chr> <dbl> <chr> <chr> <chr> <dbl> <dbl>
## 1  1061 Alask… AK    IIB   454   382   362     382 567   485   471     487     6
## 2  1063 Univ.… AK    I     686   560   432     508 914   753   572     677    74
## 3  1065 Univ.… AK    IIA   533   494   329     415 716   663   442     559     9
## 4 11462 Univ.… AK    IIA   612   507   414     498 825   681   557     670   115
## 5  1002 Alaba… AL    IIA   442   369   310     350 530   444   376     423    59
## 6  1004 Unive… AL    IIA   441   385   310     388 542   473   383     477    57
## # … with 2 more variables: V14 <dbl>, V15 <dbl>

I tried two methods to load the data set, read.table() and readr::read_fwf(), but neither loaded it correctly: read.table() read each record into a single column (V1), and read_fwf() parsed several numeric fields as character.
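One possible reason for numeric fields coming back as character is that `read_fwf()` treats only `""` and `"NA"` as missing by default, whereas this file marks missing values with `*` (as in the `na.strings = "*"` argument above). The sketch below shows the effect of passing `na = "*"` on a toy fixed-width file. The layout, the column names, and the third record are assumptions made for illustration and are not the real aaup2 layout.

```r
library(readr)

# Build a tiny fixed-width file. The first two records reuse the first five
# fields shown in the head() output above; the third is a hypothetical record
# whose last field is "*", the file's missing-value marker.
fmt <- "%5d %-31s%-3s%-4s%4s"
lines <- c(
  sprintf(fmt, 1061L, "Alaska Pacific University", "AK", "IIB", "454"),
  sprintf(fmt, 1063L, "Univ.Alaska-Fairbanks", "AK", "I", "686"),
  sprintf(fmt, 99999L, "Hypothetical College", "ZZ", "IIB", "*")
)
tmp <- tempfile(fileext = ".txt")
writeLines(lines, tmp)

# Assumed layout for this toy file only (widths 6, 31, 3, 4, 4)
layout <- fwf_widths(c(6, 31, 3, 4, 4),
                     col_names = c("id", "name", "state", "type", "salary"))

# na = "*" lets the last column parse as numeric, with NA where "*" appears
dat <- read_fwf(tmp, col_positions = layout, col_types = "icccd", na = "*")
dat
```

For the real file, the widths would have to be taken from its actual layout, for example from the `begin` and `end` positions reported by `fwf_empty()` above.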

Week4 In-class exercise

Hao-Lun Fu

2020-03-24
