I. Learning data.table

These notes work through R's data.table package, using the following references.

base vs tidyverse vs data.table

data.table in R – The Complete Beginners Guide

data.table

II. Syntax

The general syntax of a data.table query is as follows.

dt[rows, columns, by]
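As a rough illustration of how the three slots work together, here is a minimal sketch. The table `toy` and its columns `group` and `score` are hypothetical and are not part of the exam data used below.

```r
library(data.table)

# Hypothetical toy data, invented for illustration only
toy <- data.table(
  group = c("a", "a", "b", "b"),
  score = c(10, 20, 30, 50)
)

# rows: keep score > 10; columns: compute a mean; by: one result per group
res <- toy[score > 10, .(mean_score = mean(score)), by = group]
print(res)
```

The row filter is applied first, so the score of 10 is dropped before the group means are computed.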

III. Extracting rows

1. Loading the data

Read the sample exam results (CSV) into a data frame, then convert it to a data.table.

# import a library
library(data.table)
# load the data
df <- read.csv("https://pastebin.com/raw/nWkAe1qR")
# convert the data frame to a data.table
dt <- as.data.table(df)
dt
##       id english japanese nationality department year gender
##   1:   1    17.8     75.6    japanese literature    1   male
##   2:   2    64.4     53.3       nepal literature    1   male
##   3:   3    86.7     31.1       nepal literature    1   male
##   4:   4    60.0     62.2   indonesia literature    1   male
##   5:   5    42.2     80.0    japanese literature    1   male
##  ---                                                        
## 196: 196    66.7     55.6       nepal  economics    1   male
## 197: 197    44.4     80.0       china  economics    1   male
## 198: 198    57.8     48.9     vietnam  economics    1 female
## 199: 199    86.7     26.7     vietnam  economics    1   male
## 200: 200    24.4     68.9    japanese  economics    1   male
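As an alternative to read.csv() followed by as.data.table(), data.table ships its own reader, fread(), which returns a data.table directly. The sketch below uses a hypothetical three-row CSV string (the values are copied from the first rows printed above) so that it runs without network access. fread() can also be given a file path or URL, though that was not tried against the pastebin link here.

```r
library(data.table)

# Hypothetical inline CSV with a subset of the columns shown above
csv_text <- "id,english,japanese,nationality
1,17.8,75.6,japanese
2,64.4,53.3,nepal
3,86.7,31.1,nepal"

# fread() parses the text and returns a data.table, so no conversion step is needed
small <- fread(text = csv_text)
print(small)
```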

2. Filtering by condition

The following script extracts the rows where english is greater than 60.

dt[english > 60, ]
##      id english japanese nationality department year gender
##  1:   2    64.4     53.3       nepal literature    1   male
##  2:   3    86.7     31.1       nepal literature    1   male
##  3:  14    62.2     86.7    japanese literature    1   male
##  4:  15    62.2     71.1    japanese literature    1   male
##  5:  16    62.2     86.7    japanese literature    1   male
##  6:  18    84.4     57.8       nepal literature    1   male
##  7:  19    73.3     84.4       china literature    1 female
##  8:  21    64.4     31.1       nepal literature    1   male
##  9:  22    71.1     75.6    japanese literature    1 female
## 10:  25    84.4     33.3       nepal literature    1   male
## 11:  26    77.8     77.8       nepal literature    1   male
## 12:  31    66.7     28.9       nepal literature    1   male
## 13:  35    62.2     82.2    japanese literature    1   male
## 14:  37    71.1     60.0     vietnam literature    1   male
## 15:  45    80.0     57.8       nepal literature    1   male
## 16:  46    93.3     55.6       nepal literature    1   male
## 17:  54    62.2     82.2    japanese literature    1   male
## 18:  55    64.4     82.2    japanese literature    1 female
## 19:  60    82.2     77.8       nepal literature    1   male
## 20:  67    80.0     37.8       nepal  economics    1   male
## 21:  89    64.4     86.7    japanese  economics    1 female
## 22:  94    88.9     33.3       nepal  economics    1   male
## 23: 103    62.2     55.6     vietnam  economics    1 female
## 24: 119    86.7     64.4       nepal  economics    1   male
## 25: 132    71.1     84.4    japanese  economics    1 female
## 26: 133    64.4     53.3       nepal  economics    1   male
## 27: 134    80.0     53.3       nepal  economics    1   male
## 28: 135    75.6     68.9       nepal  economics    1   male
## 29: 139    73.3     77.8     vietnam  economics    1 female
## 30: 141    86.7     80.0       china  economics    1   male
## 31: 144    62.2     55.6     vietnam  economics    1 female
## 32: 147    62.2     77.8    japanese  economics    1   male
## 33: 195    84.4     28.9       nepal  economics    1   male
## 34: 196    66.7     55.6       nepal  economics    1   male
## 35: 199    86.7     26.7     vietnam  economics    1   male
##      id english japanese nationality department year gender
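Conditions can also be combined with & (and) and | (or). The following is a minimal sketch on a hypothetical five-row table `toy`, whose values are copied from the first five rows of the sample data so the results are easy to check by eye.

```r
library(data.table)

# Hypothetical small table; values taken from the first five rows shown earlier
toy <- data.table(
  id       = 1:5,
  english  = c(17.8, 64.4, 86.7, 60.0, 42.2),
  japanese = c(75.6, 53.3, 31.1, 62.2, 80.0)
)

# Both conditions must hold
both <- toy[english > 60 & japanese > 50]

# At least one condition must hold
either <- toy[english > 60 | japanese > 70]

print(both)
print(either)
```

Note that 60.0 is not greater than 60, so id 4 does not pass the english filter.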

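The following script extracts the students whose nationality is nepal.

Use == to test for a specific value. Note that it is not =; a single = means assignment (naming), not comparison.

Column names do not need quotation marks. The value being compared, "nepal", is a string and does.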

dt[nationality == "nepal"]
##      id english japanese nationality department year gender
##  1:   2    64.4     53.3       nepal literature    1   male
##  2:   3    86.7     31.1       nepal literature    1   male
##  3:  18    84.4     57.8       nepal literature    1   male
##  4:  21    64.4     31.1       nepal literature    1   male
##  5:  25    84.4     33.3       nepal literature    1   male
##  6:  26    77.8     77.8       nepal literature    1   male
##  7:  31    66.7     28.9       nepal literature    1   male
##  8:  45    80.0     57.8       nepal literature    1   male
##  9:  46    93.3     55.6       nepal literature    1   male
## 10:  60    82.2     77.8       nepal literature    1   male
## 11:  67    80.0     37.8       nepal  economics    1   male
## 12:  68    35.6     35.6       nepal  economics    1   male
## 13:  94    88.9     33.3       nepal  economics    1   male
## 14:  98    48.9     40.0       nepal  economics    1   male
## 15: 119    86.7     64.4       nepal  economics    1   male
## 16: 120    42.2     24.4       nepal  economics    1   male
## 17: 133    64.4     53.3       nepal  economics    1   male
## 18: 134    80.0     53.3       nepal  economics    1   male
## 19: 135    75.6     68.9       nepal  economics    1   male
## 20: 161    57.8     60.0       nepal  economics    1   male
## 21: 162    48.9     60.0       nepal  economics    1   male
## 22: 172    44.4     20.0       nepal  economics    1   male
## 23: 173    60.0     22.2       nepal  economics    1   male
## 24: 185    53.3     28.9       nepal  economics    1   male
## 25: 195    84.4     28.9       nepal  economics    1   male
## 26: 196    66.7     55.6       nepal  economics    1   male
##      id english japanese nationality department year gender
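Two closely related filters are matching any of several values with %in% and excluding a value with !=. The sketch below uses a hypothetical five-row table `toy`; it is an illustration, not part of the original script.

```r
library(data.table)

# Hypothetical small table with the nationalities of the first five sample rows
toy <- data.table(
  id          = 1:5,
  nationality = c("japanese", "nepal", "nepal", "indonesia", "japanese")
)

# Match any value in a set
several <- toy[nationality %in% c("nepal", "indonesia")]

# Exclude a single value
not_japanese <- toy[nationality != "japanese"]

print(several)
print(not_japanese)
```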

IV. Extracting columns

The following script extracts the english, japanese, and nationality columns.

dt[, .(english, japanese, nationality)]
##      english japanese nationality
##   1:    17.8     75.6    japanese
##   2:    64.4     53.3       nepal
##   3:    86.7     31.1       nepal
##   4:    60.0     62.2   indonesia
##   5:    42.2     80.0    japanese
##  ---                             
## 196:    66.7     55.6       nepal
## 197:    44.4     80.0       china
## 198:    57.8     48.9     vietnam
## 199:    86.7     26.7     vietnam
## 200:    24.4     68.9    japanese
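The .() wrapper matters: without it, a single column comes back as a plain vector rather than a data.table. The sketch below shows the difference on a hypothetical table `toy`, and also shows one way to select columns whose names are stored in a character vector (here the hypothetical variable `cols`), using the `..` prefix.

```r
library(data.table)

# Hypothetical small table; values taken from the first three sample rows
toy <- data.table(
  english     = c(17.8, 64.4, 86.7),
  japanese    = c(75.6, 53.3, 31.1),
  nationality = c("japanese", "nepal", "nepal")
)

# A bare column name returns a vector
as_vector <- toy[, english]

# Wrapping it in .() returns a one-column data.table
as_table <- toy[, .(english)]

# Column names held in a variable are looked up with the .. prefix
cols <- c("english", "japanese")
by_name <- toy[, ..cols]

print(as_vector)
print(as_table)
print(by_name)
```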

Compute the mean of english.

dt[, .(mean = mean(english))]
##      mean
## 1: 42.931

Compute the mean of japanese.

dt[, .(mean = mean(japanese))]
##      mean
## 1: 63.454
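Both means can be computed in a single call, and the by slot from the syntax in section II gives one row of results per group. The following is a minimal sketch on a hypothetical five-row table `toy` (values copied from the first five sample rows); the output column names `mean_english` and `mean_japanese` are chosen here for illustration.

```r
library(data.table)

# Hypothetical small table; values taken from the first five sample rows
toy <- data.table(
  nationality = c("japanese", "nepal", "nepal", "indonesia", "japanese"),
  english     = c(17.8, 64.4, 86.7, 60.0, 42.2),
  japanese    = c(75.6, 53.3, 31.1, 62.2, 80.0)
)

# Two summaries at once, computed separately for each nationality
res <- toy[, .(mean_english  = mean(english),
               mean_japanese = mean(japanese)),
           by = nationality]
print(res)
```

Groups appear in the order they are first encountered in the data, so japanese comes first here.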

To be continued.