The data set concerns species and weight of animals caught in plots in a study area in Arizona over time.
Each row holds information for a single animal, and the columns represent:
- record_id: Unique id for the observation
- month: month of observation
- day: day of observation
- year: year of observation
- plot_id: ID of a particular plot
- species_id: 2-letter code
- sex: sex of animal (“M”, “F”)
- hindfoot_length: length of the hindfoot in mm
- weight: weight of the animal in grams
- genus: genus of animal
- species: species of animal
- taxa: e.g. Rodent, Reptile, Bird, Rabbit
- plot_type: type of plot
Load the packages with pacman.
Read the CSV data.
## Parsed with column specification:
## cols(
## record_id = col_double(),
## month = col_double(),
## day = col_double(),
## year = col_double(),
## plot_id = col_double(),
## species_id = col_character(),
## sex = col_character(),
## hindfoot_length = col_double(),
## weight = col_double(),
## genus = col_character(),
## species = col_character(),
## taxa = col_character(),
## plot_type = col_character()
## )
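The loading and reading steps above can be sketched as follows. This is a minimal illustration, not the original chunk: the real file path is not shown in this document, so the sketch writes a tiny stand-in CSV (hypothetical rows, fewer columns) to a temporary file and reads that instead.

```r
# With pacman installed, pacman::p_load(readr) would install-if-needed and
# load the package in one call; library() is used here to keep the sketch minimal.
library(readr)

# Hypothetical stand-in file so the example runs on its own.
csv_path <- tempfile(fileext = ".csv")
writeLines(c(
  "record_id,month,day,year,plot_id,species_id,sex,weight",
  "1,7,16,1977,2,NL,M,",
  "588,2,18,1978,2,NL,M,218"
), csv_path)

# read_csv() guesses column types and reports the column specification;
# empty fields are read as NA.
dta <- read_csv(csv_path)
dta
```

In the real analysis, `csv_path` would point at the survey CSV instead of a temporary file.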
Inspect the data (similar to str).
## Observations: 34,786
## Variables: 13
## $ record_id <dbl> 1, 72, 224, 266, 349, 363, 435, 506, 588, 661, 748,...
## $ month <dbl> 7, 8, 9, 10, 11, 11, 12, 1, 2, 3, 4, 5, 6, 8, 9, 10...
## $ day <dbl> 16, 19, 13, 16, 12, 12, 10, 8, 18, 11, 8, 6, 9, 5, ...
## $ year <dbl> 1977, 1977, 1977, 1977, 1977, 1977, 1977, 1978, 197...
## $ plot_id <dbl> 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, ...
## $ species_id <chr> "NL", "NL", "NL", "NL", "NL", "NL", "NL", "NL", "NL...
## $ sex <chr> "M", "M", NA, NA, NA, NA, NA, NA, "M", NA, NA, "M",...
## $ hindfoot_length <dbl> 32, 31, NA, NA, NA, NA, NA, NA, NA, NA, NA, 32, NA,...
## $ weight <dbl> NA, NA, NA, NA, NA, NA, NA, NA, 218, NA, NA, 204, 2...
## $ genus <chr> "Neotoma", "Neotoma", "Neotoma", "Neotoma", "Neotom...
## $ species <chr> "albigula", "albigula", "albigula", "albigula", "al...
## $ taxa <chr> "Rodent", "Rodent", "Rodent", "Rodent", "Rodent", "...
## $ plot_type <chr> "Control", "Control", "Control", "Control", "Contro...
Preview the variables plot_id, species_id, and weight.
## # A tibble: 6 x 3
## plot_id species_id weight
## <dbl> <chr> <dbl>
## 1 2 NL NA
## 2 2 NL NA
## 3 2 NL NA
## 4 2 NL NA
## 5 2 NL NA
## 6 2 NL NA
Preview every column except record_id and species_id.
## # A tibble: 6 x 11
## month day year plot_id sex hindfoot_length weight genus species taxa
## <dbl> <dbl> <dbl> <dbl> <chr> <dbl> <dbl> <chr> <chr> <chr>
## 1 7 16 1977 2 M 32 NA Neot~ albigu~ Rode~
## 2 8 19 1977 2 M 31 NA Neot~ albigu~ Rode~
## 3 9 13 1977 2 <NA> NA NA Neot~ albigu~ Rode~
## 4 10 16 1977 2 <NA> NA NA Neot~ albigu~ Rode~
## 5 11 12 1977 2 <NA> NA NA Neot~ albigu~ Rode~
## 6 11 12 1977 2 <NA> NA NA Neot~ albigu~ Rode~
## # ... with 1 more variable: plot_type <chr>
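The two previews above (keeping named columns, then dropping named columns) can be sketched with `dplyr::select()`. The toy tibble below is a hypothetical stand-in with only a few of the 13 columns; the original chunks are not shown in this document, so this is one plausible way to get output of the same shape.

```r
library(dplyr)

# Hypothetical toy rows shaped like the survey table (not the real data).
dta <- tibble(
  record_id  = c(1, 72, 224),
  plot_id    = c(2, 2, 2),
  species_id = c("NL", "NL", "NL"),
  sex        = c("M", "M", NA),
  weight     = c(NA, NA, 218)
)

# Keep only the named columns.
kept <- dta %>% select(plot_id, species_id, weight)
head(kept)

# Drop the named columns; everything else is kept in its original order.
dropped <- dta %>% select(-record_id, -species_id)
head(dropped)
```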
Preview the rows where year is 1995.
## # A tibble: 6 x 13
## record_id month day year plot_id species_id sex hindfoot_length weight
## <dbl> <dbl> <dbl> <dbl> <dbl> <chr> <chr> <dbl> <dbl>
## 1 22314 6 7 1995 2 NL M 34 NA
## 2 22728 9 23 1995 2 NL F 32 165
## 3 22899 10 28 1995 2 NL F 32 171
## 4 23032 12 2 1995 2 NL F 33 NA
## 5 22003 1 11 1995 2 DM M 37 41
## 6 22042 2 4 1995 2 DM F 36 45
## # ... with 4 more variables: genus <chr>, species <chr>, taxa <chr>,
## # plot_type <chr>
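A row filter like the one above can be sketched with `dplyr::filter()`. The tibble is a hypothetical stand-in and the name `dta_1995` is introduced only for illustration.

```r
library(dplyr)

# Hypothetical toy rows (not the real survey data).
dta <- tibble(
  record_id = c(1, 22314, 22728, 30000),
  year      = c(1977, 1995, 1995, 2000),
  weight    = c(NA, NA, 165, 40)
)

# Keep only the rows whose year equals 1995, then show the first few.
dta_1995 <- dta %>% filter(year == 1995)
head(dta_1995)
```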
Keep the rows with a weight of 5 g or less, then preview species_id, sex, and weight.
## # A tibble: 6 x 3
## species_id sex weight
## <chr> <chr> <dbl>
## 1 PF M 5
## 2 PF F 5
## 3 PF F 5
## 4 PF F 4
## 5 PF F 5
## 6 PF F 4
The same selection (weight of 5 g or less; species_id, sex, and weight), written as separate statements instead of one pipeline.
## # A tibble: 6 x 3
## species_id sex weight
## <chr> <chr> <dbl>
## 1 PF M 5
## 2 PF F 5
## 3 PF F 5
## 4 PF F 4
## 5 PF F 5
## 6 PF F 4
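The two identical previews above come from writing the same operation in two styles. A minimal sketch of both, on hypothetical toy rows: the threshold `weight <= 5` is an assumption inferred from the output (which includes weights of exactly 5), and the intermediate name `light` is made up for illustration.

```r
library(dplyr)

# Hypothetical toy rows (not the real survey data).
dta <- tibble(
  species_id = c("PF", "PF", "NL", "PF"),
  sex        = c("M", "F", "M", "F"),
  weight     = c(5, 4, 218, NA)
)

# Style 1: one pipeline. filter() also drops rows where the test is NA.
piped <- dta %>%
  filter(weight <= 5) %>%
  select(species_id, sex, weight)

# Style 2: separate statements with an intermediate object.
light    <- filter(dta, weight <= 5)
stepwise <- select(light, species_id, sex, weight)

# Both styles should give the same table.
all.equal(piped, stepwise)
```

The pipeline avoids naming throwaway objects; the stepwise form makes each intermediate result available for inspection.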
Add new variables: weight_kg and weight_lb.
## # A tibble: 6 x 15
## record_id month day year plot_id species_id sex hindfoot_length weight
## <dbl> <dbl> <dbl> <dbl> <dbl> <chr> <chr> <dbl> <dbl>
## 1 1 7 16 1977 2 NL M 32 NA
## 2 72 8 19 1977 2 NL M 31 NA
## 3 224 9 13 1977 2 NL <NA> NA NA
## 4 266 10 16 1977 2 NL <NA> NA NA
## 5 349 11 12 1977 2 NL <NA> NA NA
## 6 363 11 12 1977 2 NL <NA> NA NA
## # ... with 6 more variables: genus <chr>, species <chr>, taxa <chr>,
## # plot_type <chr>, weight_kg <dbl>, weight_lb <dbl>
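Adding derived columns like these is typically done with `dplyr::mutate()`. The sketch below is an assumption about how the two columns were computed: grams divided by 1000 for kilograms, and kilograms times 2.20462 for pounds. The exact factors used in the original are not shown, and `dta2` is a name made up for illustration.

```r
library(dplyr)

# Hypothetical toy rows (not the real survey data).
dta <- tibble(
  record_id = c(1, 588, 845),
  weight    = c(NA, 218, 204)   # grams
)

# A later column in the same mutate() call can use an earlier one.
dta2 <- dta %>%
  mutate(
    weight_kg = weight / 1000,        # assumed conversion: g -> kg
    weight_lb = weight_kg * 2.20462   # assumed conversion: kg -> lb
  )
head(dta2)
```

Rows with a missing weight stay in the table, and their derived columns are NA as well.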
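Remove the rows with a missing weight, then list the mean weight by sex and species.
dta %>%
  filter(!is.na(weight)) %>%             # drop rows with missing weight
  group_by(sex, species_id) %>%          # one group per sex/species pair
  summarize(mean_weight = mean(weight)) %>%
  arrange(desc(mean_weight)) %>%         # heaviest groups first
  head()
## # A tibble: 6 x 3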
## # Groups: sex [3]
## sex species_id mean_weight
## <chr> <chr> <dbl>
## 1 <NA> NL 168.
## 2 M NL 166.
## 3 F NL 154.
## 4 M SS 130
## 5 <NA> SH 130
## 6 M DS 122.
Count the number of records by sex.
## # A tibble: 3 x 2
## sex n
## <chr> <int>
## 1 F 15690
## 2 M 17348
## 3 <NA> 1748
Tally by sex.
## # A tibble: 3 x 2
## sex n
## <chr> <int>
## 1 F 15690
## 2 M 17348
## 3 <NA> 1748
Summarize the totals by sex; n() works much like tally().
## # A tibble: 3 x 2
## sex count
## <chr> <int>
## 1 F 15690
## 2 M 17348
## 3 <NA> 1748
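The three counting outputs above are identical because dplyr offers several equivalent idioms. A minimal sketch on hypothetical toy data; which idiom produced which output in the original is an assumption, since the chunks are not shown.

```r
library(dplyr)

# Hypothetical toy column (not the real survey data).
dta <- tibble(sex = c("F", "M", "M", NA, "F", "M"))

# 1. count() groups and counts in one step.
by_count <- dta %>% count(sex)

# 2. group_by() followed by tally().
by_tally <- dta %>% group_by(sex) %>% tally()

# 3. group_by() followed by summarize() with n(), which lets you name the column.
by_n <- dta %>% group_by(sex) %>% summarize(count = n())

by_count
by_tally
by_n
```

In all three, NA is treated as its own group and sorted last.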
Totals with missing values removed.
## # A tibble: 3 x 2
## sex count
## <chr> <int>
## 1 F 15690
## 2 M 17348
## 3 <NA> 1748
Remove the missing values, compute mean_weight for each genus and plot_id, and store the result as dta_gw.
Inspect dta_gw.
## Observations: 196
## Variables: 3
## Groups: genus [10]
## $ genus <chr> "Baiomys", "Baiomys", "Baiomys", "Baiomys", "Baiomys", ...
## $ plot_id <dbl> 1, 2, 3, 5, 18, 19, 20, 21, 1, 2, 3, 4, 5, 6, 7, 8, 9, ...
## $ mean_weight <dbl> 7.000000, 6.000000, 8.611111, 7.750000, 9.500000, 9.533...
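A summary table of this shape (one row per genus and plot, grouped by genus) can be sketched as below. The toy rows are hypothetical, and the pipeline is an assumption modeled on the sex/species summary shown earlier in this document.

```r
library(dplyr)

# Hypothetical toy rows (not the real survey data).
dta <- tibble(
  genus   = c("Baiomys", "Baiomys", "Neotoma", "Neotoma", "Neotoma"),
  plot_id = c(1, 1, 1, 2, 2),
  weight  = c(6, 8, 150, NA, 170)
)

dta_gw <- dta %>%
  filter(!is.na(weight)) %>%              # drop rows with missing weight
  group_by(genus, plot_id) %>%            # one group per genus/plot pair
  summarize(mean_weight = mean(weight))   # result stays grouped by genus

glimpse(dta_gw)
```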
Reshape the data from long to wide, with one mean_weight column per genus, and store the result as dta_w.
Inspect the data.
## Observations: 24
## Variables: 11
## $ plot_id <dbl> 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, ...
## $ Baiomys <dbl> 7.000000, 6.000000, 8.611111, NA, 7.750000, NA, NA,...
## $ Chaetodipus <dbl> 22.19939, 25.11014, 24.63636, 23.02381, 17.98276, 2...
## $ Dipodomys <dbl> 60.23214, 55.68259, 52.04688, 57.52454, 51.11356, 5...
## $ Neotoma <dbl> 156.2222, 169.1436, 158.2414, 164.1667, 190.0370, 1...
## $ Onychomys <dbl> 27.67550, 26.87302, 26.03241, 28.09375, 27.01695, 2...
## $ Perognathus <dbl> 9.625000, 6.947368, 7.507812, 7.824427, 8.658537, 7...
## $ Peromyscus <dbl> 22.22222, 22.26966, 21.37037, 22.60000, 21.23171, 2...
## $ Reithrodontomys <dbl> 11.375000, 10.680556, 10.516588, 10.263158, 11.1545...
## $ Sigmodon <dbl> NA, 70.85714, 65.61404, 82.00000, 82.66667, 68.7777...
## $ Spermophilus <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,...
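Reshape from long to wide again, with genus as the columns and mean_weight as the values; missing genus/plot combinations now appear as 0 instead of NA.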
## # A tibble: 6 x 11
## plot_id Baiomys Chaetodipus Dipodomys Neotoma Onychomys Perognathus Peromyscus
## <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 1 7 22.2 60.2 156. 27.7 9.62 22.2
## 2 2 6 25.1 55.7 169. 26.9 6.95 22.3
## 3 3 8.61 24.6 52.0 158. 26.0 7.51 21.4
## 4 4 0 23.0 57.5 164. 28.1 7.82 22.6
## 5 5 7.75 18.0 51.1 190. 27.0 8.66 21.2
## 6 6 0 24.9 58.6 180. 25.9 7.81 21.8
## # ... with 3 more variables: Reithrodontomys <dbl>, Sigmodon <dbl>,
## # Spermophilus <dbl>
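The two wide tables above differ only in how missing genus/plot combinations are shown (NA versus 0). A minimal sketch with `tidyr::spread()`, assuming that is the function used (it is superseded by `pivot_wider()` in current tidyr but still available). The toy `dta_gw` and the name `dta_w0` are hypothetical.

```r
library(dplyr)
library(tidyr)

# Hypothetical toy summary (not the real dta_gw).
dta_gw <- tibble(
  genus       = c("Baiomys", "Neotoma", "Neotoma"),
  plot_id     = c(1, 1, 2),
  mean_weight = c(7, 150, 170)
)

# Long to wide: one column per genus; absent combinations become NA.
dta_w <- dta_gw %>% spread(key = genus, value = mean_weight)
dta_w

# Same reshape, but absent combinations are filled with 0.
dta_w0 <- dta_gw %>% spread(key = genus, value = mean_weight, fill = 0)
dta_w0
```

Filling with 0 is convenient for display, but it makes "no animals weighed" look like a measured mean of zero, so the NA version is safer for further computation.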
Reshape from wide back to long, with the columns plot_id, genus, and mean_weight.
Inspect the data.
## Observations: 240
## Variables: 3
## $ plot_id <dbl> 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, ...
## $ genus <chr> "Baiomys", "Baiomys", "Baiomys", "Baiomys", "Baiomys", ...
## $ mean_weight <dbl> 7.000000, 6.000000, 8.611111, NA, 7.750000, NA, NA, NA,...
Show mean_weight for the genus columns Baiomys through Spermophilus.
## # A tibble: 6 x 3
## plot_id genus mean_weight
## <dbl> <chr> <dbl>
## 1 1 Baiomys 7
## 2 2 Baiomys 6
## 3 3 Baiomys 8.61
## 4 4 Baiomys NA
## 5 5 Baiomys 7.75
## 6 6 Baiomys NA
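The wide-to-long step can be sketched with `tidyr::gather()`, assuming that is the function used (it is superseded by `pivot_longer()` but still available). The toy `dta_w` has only two genus columns, so the column range is `Baiomys:Neotoma` rather than Baiomys through Spermophilus; `dta_l` is a name made up for illustration.

```r
library(dplyr)
library(tidyr)

# Hypothetical toy wide table (not the real dta_w).
dta_w <- tibble(
  plot_id = c(1, 2),
  Baiomys = c(7, NA),
  Neotoma = c(150, 170)
)

# Stack the genus columns: names go to `genus`, cell values to `mean_weight`.
# NA cells are kept as rows because na.rm defaults to FALSE.
dta_l <- dta_w %>%
  gather(key = "genus", value = "mean_weight", Baiomys:Neotoma)
dta_l
```

Because NA cells are kept, the long table has one row per plot for every genus, which is consistent with the 24 plots times 10 genera giving 240 rows above.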
Filter out the species with fewer than 50 records, then count the records for each remaining species.
## # A tibble: 14 x 2
## species_id n
## <chr> <int>
## 1 DM 9727
## 2 DO 2790
## 3 DS 2023
## 4 NL 1045
## 5 OL 905
## 6 OT 2081
## 7 PB 2803
## 8 PE 1198
## 9 PF 1469
## 10 PM 835
## 11 PP 2969
## 12 RF 73
## 13 RM 2417
## 14 SH 128
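One way to get a table like the one above is to count first and then filter on the count. This is a sketch under assumptions: the original chunk is not shown, the toy data is hypothetical, and the threshold is lowered to 2 (held in the made-up variable `min_n`) so the small example has something to drop.

```r
library(dplyr)

# Hypothetical toy column (not the real survey data).
dta <- tibble(species_id = c("DM", "DM", "DM", "NL", "NL", "RF"))

min_n <- 2  # the real analysis would use 50

# Count records per species, then keep only species with enough records.
common <- dta %>%
  count(species_id) %>%
  filter(n >= min_n)
common
```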