4/27 Exercises 1-3

Exercise 1: Split the ChickWeight{datasets} data by individual chicks to extract separate slope estimates of regressing weight onto Time for each chick.

# load and draw
library(datasets)
dta <- datasets::ChickWeight
summary(dta)

##      weight           Time           Chick     Diet   
##  Min.   : 35.0   Min.   : 0.00   13     : 12   1:220  
##  1st Qu.: 63.0   1st Qu.: 4.00   9      : 12   2:120  
##  Median :103.0   Median :10.00   20     : 12   3:120  
##  Mean   :121.8   Mean   :10.72   10     : 12   4:118  
##  3rd Qu.:163.8   3rd Qu.:16.00   17     : 12          
##  Max.   :373.0   Max.   :21.00   19     : 12          
##                                  (Other):506

library(lattice)
stripplot(weight ~ Time | Chick, 
          data=dta,
          pch=16, 
          col="pink", 
          xlab="Time",
          ylab="weight",
          par.settings=standard.theme(color=FALSE))
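
The strip plot shows each chick's growth but does not extract the slopes the exercise asks for. A minimal sketch of that step, assuming an ordinary least-squares line per chick is what is wanted; the names by_chick and slopes are illustrative and not part of the original script:

```r
# Work on a plain data frame copy of the data
dta <- as.data.frame(datasets::ChickWeight)

# Split the rows into one data frame per chick
by_chick <- split(dta, dta$Chick)

# Fit weight ~ Time within each chick and keep only the Time coefficient
slopes <- sapply(by_chick, function(d) {
  coef(lm(weight ~ Time, data = d))[["Time"]]
})

# Named numeric vector: one slope per chick, named by chick ID
head(slopes)
```

Each slope is the estimated weight gain per unit of Time for that chick, so the vector can be summarised or plotted by Diet afterwards.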

Exercise 2: Explain what this statement does: lapply(lapply(search(), ls), length)

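# For each entry on the search path (the global environment and each attached package), show the number of objects (functions or datasets) it contains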
search()

##  [1] ".GlobalEnv"        "package:lattice"   "package:stats"    
##  [4] "package:graphics"  "package:grDevices" "package:utils"    
##  [7] "package:datasets"  "package:methods"   "Autoloads"        
## [10] "package:base"

summary(lapply(search(), ls))

##       Length Class  Mode     
##  [1,]    1   -none- character
##  [2,]  151   -none- character
##  [3,]  448   -none- character
##  [4,]   87   -none- character
##  [5,]  109   -none- character
##  [6,]  215   -none- character
##  [7,]  104   -none- character
##  [8,]  218   -none- character
##  [9,]    0   -none- character
## [10,] 1229   -none- character

lapply(lapply(search(), ls), length)

## [[1]]
## [1] 1
## 
## [[2]]
## [1] 151
## 
## [[3]]
## [1] 448
## 
## [[4]]
## [1] 87
## 
## [[5]]
## [1] 109
## 
## [[6]]
## [1] 215
## 
## [[7]]
## [1] 104
## 
## [[8]]
## [1] 218
## 
## [[9]]
## [1] 0
## 
## [[10]]
## [1] 1229
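
The nested lapply() returns an unnamed list, so each count has to be matched to search() by position. A possible variant that keeps the names attached is sketched here; paths and counts are illustrative names, and the actual numbers depend on which packages are attached in the session:

```r
# Entries on the search path, e.g. ".GlobalEnv", "package:stats", ...
paths <- search()

# ls() lists the objects in each entry; length() counts them.
# vapply() returns an integer vector named by the search path entries.
counts <- vapply(paths, function(p) length(ls(p)), integer(1))

counts
```

The inner lapply(search(), ls) corresponds to ls(p) here, and the outer lapply(..., length) corresponds to length().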

Exercise 3: The following R script uses Cushings{MASS} to demonstrate several ways to achieve the same objective in R. Explain the advantages or disadvantages of each method.

library(pacman)

pacman::p_load(MASS, tidyverse)

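# Method 1: use aggregate() to compute the mean of each variable by Type. aggregate() is powerful because the formula syntax is easy to understand and the output is returned as a data frame.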
aggregate( . ~ Type, data = Cushings, mean)

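# Method 2: use sapply() to compute the means. split() divides the data into a list by Type, and sapply() returns the output as a matrix. This approach is useful when the input is a list and the desired output is a vector or matrix.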
sapply(split(Cushings[,-3], Cushings$Type), function(x) apply(x, 2, mean))

##                            a    b     c        u
## Tetrahydrocortisone 2.966667 8.18 19.72 14.01667
## Pregnanetriol       2.440000 1.12  5.50  1.20000
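
Methods 1 and 2 compute the same group means but return them in different shapes (a data frame with one row per Type versus a matrix with one column per Type). A small sketch of how the two results could be compared, using only base R and MASS; m1, m2, and m1_mat are illustrative names, and colMeans() is swapped in for apply(x, 2, mean):

```r
library(MASS)  # provides the Cushings data

# Method 1: data frame with one row per Type
m1 <- aggregate(. ~ Type, data = Cushings, mean)

# Method 2: matrix with variables in rows and Types in columns
m2 <- sapply(split(Cushings[, -3], Cushings$Type), colMeans)

# Reshape the Method 1 result to the layout of Method 2
m1_mat <- t(as.matrix(m1[, -1]))
colnames(m1_mat) <- as.character(m1$Type)

# Compare the numeric values only, ignoring row and column names
all.equal(m1_mat, m2, check.attributes = FALSE)
```

The reshaping step illustrates the trade-off: the data frame from aggregate() is convenient for further modelling or plotting, while the matrix from sapply() is more compact to print.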

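## Method 3: define an anonymous function that drops the Type column and computes the column means, apply it to each group with by(), convert the result to a list, and bind the pieces together row by row. function() is powerful because we can define a function for any task, but in this case the syntax is harder to read than in the other methods.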
do.call("rbind", as.list(
  by(Cushings, list(Cushings$Type), function(x) {
    y <- subset(x, select =  -Type)
    apply(y, 2, mean)
  }
)))

##   Tetrahydrocortisone Pregnanetriol
## a            2.966667          2.44
## b            8.180000          1.12
## c           19.720000          5.50
## u           14.016667          1.20

## Method 4: group_by() and summarize() let us set the variable names in the output, which makes the result easier to read.
Cushings %>%
  group_by(Type) %>%
  summarize(t_m = mean(Tetrahydrocortisone), p_m = mean(Pregnanetriol))

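## Method 5: nest() creates a list-column of data frames holding the nested variables for each Type. mutate() then creates three new variables, but the printed result does not show the values of "avg", because it is a list-column.
Cushings %>%
  nest(data = c(Tetrahydrocortisone, Pregnanetriol)) %>%  # name the nested column; a bare nest(-Type) triggers the warning "All elements of `...` must be named"
  mutate(avg = map(data, ~ apply(., 2, mean)),
         res_1 = map_dbl(avg, "Tetrahydrocortisone"),
         res_2 = map_dbl(avg, "Pregnanetriol"))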