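This week's class introduces reading data into R, exporting data, and basic commands. R can read external data from statistical software, spreadsheet software, text editors, the web, and other sources.
Before we start, we introduce the here package, which we use to specify file paths. Create a new folder named data under the directory of your R project and store your data files there. For example, if the file Mystata.dta is stored in <D:/MyR/data>, the command here::here("data","Mystata.dta") specifies the path to that file: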
## 'data.frame': 1244 obs. of 21 variables:
## $ Q1 : int 0 0 12 1 0 0 1 0 0 11 ...
## $ Q2 : int 1 1 2 2 2 2 2 2 2 1 ...
## $ Q3 : int 1 1 1 1 1 1 1 98 96 1 ...
## $ Q4 : int 2 2 1 3 1 4 4 98 3 3 ...
## $ Q5 : int 2 3 2 2 2 2 3 2 3 2 ...
## $ SEX : int 2 2 1 2 1 2 1 2 2 2 ...
## $ AGE : int 2 4 5 5 4 4 2 5 1 2 ...
## $ EDU : int 5 3 2 3 3 3 4 3 5 3 ...
## $ TOWNID : int 6305 6608 6302 911 904 6628 6304 6707 6303 6515 ...
## $ AREAR : int 1 4 1 5 5 4 1 5 1 2 ...
## $ SENGI : int 2 3 2 2 2 2 2 2 2 3 ...
## $ T_Cidentity: int 1 2 1 1 3 3 2 9 9 3 ...
## $ partyid : Factor w/ 8 levels "KMT","DPP","NP",..: 2 7 1 1 1 1 8 7 7 7 ...
## $ PARTY : int 5 99 2 3 2 2 99 99 99 99 ...
## $ tondu : int 3 4 2 4 4 2 3 9 3 4 ...
## $ tondu3 : int 2 2 1 2 2 1 2 9 2 2 ...
## $ peace : num 8 4 5 4 4 6 5 NA 7 8 ...
## $ visit : num 0 0 12 1 0 0 1 0 0 11 ...
## $ experience : num 1 1 0 0 0 0 0 0 0 1 ...
## $ tondu3_new : num 1 1 2 1 1 2 1 NA 1 1 ...
## $ consensus : num 2 4 4 3 4 3 2 NA 2 4 ...
## - attr(*, "datalabel")= chr ""
## - attr(*, "time.stamp")= chr " 7 Mar 2018 16:06"
## - attr(*, "formats")= chr "%10.0g" "%10.0g" "%10.0g" "%10.0g" ...
## - attr(*, "types")= int 65530 65530 65530 65530 65530 65530 65530 65530 65529 65530 ...
## - attr(*, "val.labels")= Named chr "" "" "" "" ...
## ..- attr(*, "names")= chr "" "" "" "" ...
## - attr(*, "var.labels")= chr "Q1" "Q2" "Q3" "Q4" ...
## - attr(*, "version")= int 118
## - attr(*, "label.table")=List of 4
## ..$ cat : Named int 1 2 3 4 5 6 7 9
## .. ..- attr(*, "names")= chr "KMT" "DPP" "NP" "PFP" ...
## ..$ yesno : Named int 0 1
## .. ..- attr(*, "names")= chr "No" "Yes"
## ..$ tondu3_new: Named int 1 2 3
## .. ..- attr(*, "names")= chr "Status quo" "Unification" "Independence"
## ..$ support : Named int 1 2 3 4
## .. ..- attr(*, "names")= chr "Strongly dissupport" "Dissupport" "Support" "Strongly support"
## - attr(*, "expansion.fields")= list()
## - attr(*, "byteorder")= chr "LSF"
## - attr(*, "orig.dim")= int 1244 21
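The idea behind here::here() can be sketched in base R, as a rough illustration only. The example below assumes a hypothetical project root (a temporary directory stands in for it) and builds the same kind of path with file.path(); the here package itself locates the project root automatically.

```r
# Hypothetical project root; here::here() would detect the real one automatically
project_root <- tempdir()

# Create the data folder under the project root if it does not exist yet
data_dir <- file.path(project_root, "data")
dir.create(data_dir, showWarnings = FALSE)

# Build the path to a file inside data/, analogous to here::here("data", "Mystata.dta")
stata_path <- file.path(data_dir, "Mystata.dta")
basename(stata_path)
```

Because the path is built from the project root rather than typed in full, the same script keeps working when the project folder is moved to another computer.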
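The scan() function is similar to read.table(). It reads external data into a vector but cannot read tables, so it is a command for simple data. We start with numeric data: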
file<-here::here("data","voteshare")
scan(file, comment.char = '#', dec='.')
[1] 55.6 66.1 36.8 65.1 50.9 44.9 48.7 52.4 48.5 53.0 51.9
R treats anything that follows '#' as a comment rather than data.
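The effect of comment.char can be tried without any file on disk. The sketch below is an illustration under the assumption that inline text is an acceptable stand-in for the voteshare file; it uses the text argument of scan(), and the numbers are just the first three values shown above.

```r
# Inline text stands in for a file; the first line is a comment
raw_text <- "# vote share of the winner\n55.6 66.1 36.8"

# comment.char = "#" tells scan() to skip everything after '#'
shares <- scan(text = raw_text, comment.char = "#", quiet = TRUE)
shares
```

Setting quiet = TRUE only suppresses the "Read 3 items" message; it does not change the result.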
lattice<-here::here('data','latticegraph')
scan(lattice, what=(""), comment.char = '#', sep=',')
[1] "barchart" "bwplot" "cloud" "contourplot" "densityplot"
[6] "dotplot" "histogram" "levelplot" "parallel" "splom"
[11] "stripplot" "xyplot" "wireframe" "1" "1"
[16] "3" "3" "2" "1" "1"
[21] "2" "4" "4" "2" "2"
[26] "3"
The setDT() function from data.table converts the object into a data.table, which is also a data frame.
data.table::setDT(tmp)
mutate converts it directly into a numeric variable.
'data.frame': 10 obs. of 2 variables:
 $ company    : chr "Apple" "Alphabet" "Microsoft" "Amazon" ...
 $ marketvalue: num 851 719 703 701 496 492 470 464 375 344
scan() is fairly limited and can only read data into vectors, whereas read.csv() and read.table(), introduced below, are more complete.
CSV<-here::here("data","councilor.csv")
csv1<-read.csv(CSV,
header=TRUE, sep=",", fileEncoding = 'BIG5')
head(csv1)
## Year budget unit contracter open
## 1 2015 676 水利處 台球 Yes
## 2 2016 673 新建工程處 茂盛 Yes
## 3 2016 270 新建工程處 冠君 Yes
## 4 2016 255 新建工程處 金煌 Yes
## 5 2016 235 新建工程處 聖鋒 Yes
## 6 2016 190 新建工程處 福呈 No
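header=TRUE means the first row is treated as variable names, sep specifies the field separator, and fileEncoding='BIG5' declares that the file is encoded in BIG5 so that the Chinese text is read and displayed correctly.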
R lets users control whether strings in the data are treated as factors through the stringsAsFactors argument:
csv2<-read.csv(CSV,
header=TRUE, sep=",",
fileEncoding = 'BIG5',
stringsAsFactors = F)
class(csv1$unit); table(csv1$unit)
[1] "factor"
公園處 水利處 新建工程處
1 1 8
class(csv2$unit)
[1] "character"
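The difference between the two settings can be reproduced with a few inline rows. This is a minimal sketch: the CSV text and the unit names in it are made up for illustration and are not the councilor data.

```r
# Made-up CSV content passed through the text argument instead of a file
csv_text <- "Year,unit\n2015,Water\n2016,Works\n2016,Works"

# Strings become factors only when stringsAsFactors = TRUE
as_factor <- read.csv(text = csv_text, header = TRUE, stringsAsFactors = TRUE)
as_char   <- read.csv(text = csv_text, header = TRUE, stringsAsFactors = FALSE)

class(as_factor$unit)   # factor
class(as_char$unit)     # character
```

Since R 4.0.0 the default is stringsAsFactors = FALSE, so on a recent R version a result like the one shown for csv1 requires setting stringsAsFactors = TRUE explicitly.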
Figure 2.1: Font test
The plotting functions of ggplot2:
Figure 2.2: A ggplot2 example
read.table() reads tabular data stored in txt format whose fields are separated by whitespace, for example:
file<-here::here("data","Studentsfull.txt")
students<-read.table(file, header=TRUE, sep="")
head(students)
        ID     Name Department Score Gender
1 10322011    Ariel  Aerospace    78      F
2 10325023    Becky    Physics    86      F
3 10430101     Carl Journalism    69      M
4 10401032  Dimitri    English    83      M
5 10307120  Enrique  Chemistry    80      M
6 10207005 Fernando  Chemistry    66      M
☛ Use read.table() to read the Taipei city councilor data shown above.
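Besides its own dta format, Stata can also save data as csv or other formats, and R has packages that read dta files directly. Data saved by Stata 12 or earlier can be read with read.dta() from the foreign package. Data saved by Stata 13 or later require the readstata13 package: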
library(readstata13)
my<-here::here("data","Mystata.dta")
udata<-read.dta13(my)
str(udata)
## 'data.frame': 1244 obs. of 21 variables:
## $ Q1 : int 0 0 12 1 0 0 1 0 0 11 ...
## $ Q2 : int 1 1 2 2 2 2 2 2 2 1 ...
## $ Q3 : int 1 1 1 1 1 1 1 98 96 1 ...
## $ Q4 : int 2 2 1 3 1 4 4 98 3 3 ...
## $ Q5 : int 2 3 2 2 2 2 3 2 3 2 ...
## $ SEX : int 2 2 1 2 1 2 1 2 2 2 ...
## $ AGE : int 2 4 5 5 4 4 2 5 1 2 ...
## $ EDU : int 5 3 2 3 3 3 4 3 5 3 ...
## $ TOWNID : int 6305 6608 6302 911 904 6628 6304 6707 6303 6515 ...
## $ AREAR : int 1 4 1 5 5 4 1 5 1 2 ...
## $ SENGI : int 2 3 2 2 2 2 2 2 2 3 ...
## $ T_Cidentity: int 1 2 1 1 3 3 2 9 9 3 ...
## $ partyid : Factor w/ 8 levels "KMT","DPP","NP",..: 2 7 1 1 1 1 8 7 7 7 ...
## $ PARTY : int 5 99 2 3 2 2 99 99 99 99 ...
## $ tondu : int 3 4 2 4 4 2 3 9 3 4 ...
## $ tondu3 : int 2 2 1 2 2 1 2 9 2 2 ...
## $ peace : num 8 4 5 4 4 6 5 NA 7 8 ...
## $ visit : num 0 0 12 1 0 0 1 0 0 11 ...
## $ experience : num 1 1 0 0 0 0 0 0 0 1 ...
## $ tondu3_new : num 1 1 2 1 1 2 1 NA 1 1 ...
## $ consensus : num 2 4 4 3 4 3 2 NA 2 4 ...
## - attr(*, "datalabel")= chr ""
## - attr(*, "time.stamp")= chr " 7 Mar 2018 16:06"
## - attr(*, "formats")= chr "%10.0g" "%10.0g" "%10.0g" "%10.0g" ...
## - attr(*, "types")= int 65530 65530 65530 65530 65530 65530 65530 65530 65529 65530 ...
## - attr(*, "val.labels")= Named chr "" "" "" "" ...
## ..- attr(*, "names")= chr "" "" "" "" ...
## - attr(*, "var.labels")= chr "Q1" "Q2" "Q3" "Q4" ...
## - attr(*, "version")= int 118
## - attr(*, "label.table")=List of 4
## ..$ cat : Named int 1 2 3 4 5 6 7 9
## .. ..- attr(*, "names")= chr "KMT" "DPP" "NP" "PFP" ...
## ..$ yesno : Named int 0 1
## .. ..- attr(*, "names")= chr "No" "Yes"
## ..$ tondu3_new: Named int 1 2 3
## .. ..- attr(*, "names")= chr "Status quo" "Unification" "Independence"
## ..$ support : Named int 1 2 3 4
## .. ..- attr(*, "names")= chr "Strongly dissupport" "Dissupport" "Support" "Strongly support"
## - attr(*, "expansion.fields")= list()
## - attr(*, "byteorder")= chr "LSF"
## - attr(*, "orig.dim")= int 1244 21
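The convert.factors argument controls whether labelled values are converted to factors. If they are not converted, the variables remain integer or numeric.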
udata2<-read.dta13(my, convert.factors=F)
class(udata$partyid); class(udata2$partyid)
[1] "factor"
[1] "integer"
table(udata$partyid)
        KMT         DPP          NP         PFP         TSU         NPP
        287         246           4          21           2          54
Independent          DK
        557          73
The foreign package can also read SPSS data with read.spss():
library(foreign)
pp0797b2<-here::here('data','PP0797B2.sav')
dv<-read.spss(pp0797b2,
use.value.labels=F, to.data.frame=TRUE)
table(dv$Q1)
##
## 1 2 3 4 95 96 97 98
## 617 684 443 91 10 57 52 104
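Both settings above differ from the defaults of read.spss(), which are use.value.labels=TRUE and to.data.frame=FALSE.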
dv$Q1n <-c()
dv$Q1n[dv$Q1==1]<-'非常不同意'
dv$Q1n[dv$Q1==2]<-'不同意'
dv$Q1n[dv$Q1==3]<-'同意'
dv$Q1n[dv$Q1==4]<-'非常同意'
dv$Q1n=factor(dv$Q1n, levels=c('非常不同意','不同意','同意','非常同意'))
par(bg='lightblue', family='HanWangWCL07')
barplot(table(dv$Q1n), col='white')
Figure 2.3: Bar plot of the recoded labels
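The recoding above assigns one label per line. As a more compact alternative, offered here as a sketch rather than the method used in the course, factor() can map codes to labels in a single call. The vector q1 and the English labels below are stand-ins for dv$Q1 and its Chinese labels.

```r
# Toy codes: 1-4 are substantive answers, 96 and 98 are nonresponse codes
q1 <- c(1, 2, 3, 4, 96, 98, 2)

# Codes listed in levels get the matching label; all other codes become NA
q1_label <- factor(q1,
                   levels = 1:4,
                   labels = c("strongly disagree", "disagree",
                              "agree", "strongly agree"))

table(q1_label, useNA = "ifany")
```

Codes that are not listed in levels, such as 95 to 98, turn into NA automatically, which matches the result of leaving them unassigned in the line-by-line version.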
udata1<-haven::read_sav(pp0797b2, encoding = 'UTF-8')
udata1[1:4, 1:3]
  Q1              Q2            Q3
  <dbl+lbl>       <dbl+lbl>     <dbl+lbl>
1 96 [很難說]      3 [同意]       2 [不同意]
2  1 [非常不同意]   4 [非常同意]   2 [不同意]
3  1 [非常不同意]   4 [非常同意]   1 [非常不同意]
4  3 [同意]        3 [同意]       2 [不同意]
pie(table(udata1$Q1))
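Figure 3.1: Pie chart of the data read with the haven package
A third way to read SPSS data is to install the sjlabelled package and read the data with its read_spss() function. This method reads the Chinese value labels, but every variable is numeric. For more on sjlabelled, see the website of the package author, Daniel Lüdecke.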
## [1] "非常不同意" "不同意" "既不同意也不反對" "同意"
## [5] "非常同意" "拒答" "看情形" "無意見"
## [9] "不知道"
#set_labels(udata4$Q7, labels='總統滿意度')
# set_labels(udata4$Q8, labels='政治興趣')
par(bg='#0022FF33')
barplot(table(sjlabelled::as_label(udata4$Q8)),
col='white', family='YouYuan', cex.names=0.8)
Figure 3.2: Bar plot of the data read with the sjlabelled package
 | 非常不滿意 | 不太滿意 | 有點滿意 | 非常滿意 | 拒答 | 看情形 | 無意見 | 不知道 |
---|---|---|---|---|---|---|---|---|
完全沒興趣 | 97 | 92 | 71 | 7 | 7 | 5 | 27 | 29 |
不太有興趣 | 95 | 150 | 122 | 14 | 5 | 11 | 38 | 25 |
還算有興趣 | 55 | 69 | 84 | 16 | 2 | 2 | 10 | 4 |
非常有興趣 | 12 | 10 | 14 | 9 | 0 | 1 | 1 | 2 |
拒答 | 0 | 0 | 0 | 0 | 1 | 0 | 1 | 0 |
看情形 | 9 | 3 | 1 | 2 | 2 | 0 | 1 | 0 |
無意見 | 0 | 0 | 3 | 1 | 0 | 0 | 2 | 1 |
不知道 | 0 | 1 | 0 | 0 | 0 | 0 | 1 | 1 |
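The rio package provides import(), which reads data directly from a web link so that users can download and analyze it easily. GitHub hosts many data sets, for example, but note that the link must point to the raw file. Table 3.2 shows the advertising data we read this way: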
library(rio)
qurl = 'https://raw.githubusercontent.com/TsaiChiahung/SocialStat2018/master/Data/Advertising.csv'
# download
tmp<-rio::import(qurl)
tmp_html<-knitr::kable(tmp, caption="廣告公司資料", format = 'html')
kableExtra::kable_styling(tmp_html,'striped',font_size = 20)
V1 | TV | radio | newspaper | sales |
---|---|---|---|---|
1 | 230.1 | 37.8 | 69.2 | 22.1 |
2 | 44.5 | 39.3 | 45.1 | 10.4 |
3 | 17.2 | 45.9 | 69.3 | 9.3 |
4 | 151.5 | 41.3 | 58.5 | 18.5 |
5 | 180.8 | 10.8 | 58.4 | 12.9 |
6 | 8.7 | 48.9 | 75.0 | 7.2 |
7 | 57.5 | 32.8 | 23.5 | 11.8 |
8 | 120.2 | 19.6 | 11.6 | 13.2 |
9 | 8.6 | 2.1 | 1.0 | 4.8 |
10 | 199.8 | 2.6 | 21.2 | 10.6 |
11 | 66.1 | 5.8 | 24.2 | 8.6 |
12 | 214.7 | 24.0 | 4.0 | 17.4 |
13 | 23.8 | 35.1 | 65.9 | 9.2 |
14 | 97.5 | 7.6 | 7.2 | 9.7 |
15 | 204.1 | 32.9 | 46.0 | 19.0 |
16 | 195.4 | 47.7 | 52.9 | 22.4 |
17 | 67.8 | 36.6 | 114.0 | 12.5 |
18 | 281.4 | 39.6 | 55.8 | 24.4 |
19 | 69.2 | 20.5 | 18.3 | 11.3 |
20 | 147.3 | 23.9 | 19.1 | 14.6 |
21 | 218.4 | 27.7 | 53.4 | 18.0 |
22 | 237.4 | 5.1 | 23.5 | 12.5 |
23 | 13.2 | 15.9 | 49.6 | 5.6 |
24 | 228.3 | 16.9 | 26.2 | 15.5 |
25 | 62.3 | 12.6 | 18.3 | 9.7 |
26 | 262.9 | 3.5 | 19.5 | 12.0 |
27 | 142.9 | 29.3 | 12.6 | 15.0 |
28 | 240.1 | 16.7 | 22.9 | 15.9 |
29 | 248.8 | 27.1 | 22.9 | 18.9 |
30 | 70.6 | 16.0 | 40.8 | 10.5 |
31 | 292.9 | 28.3 | 43.2 | 21.4 |
32 | 112.9 | 17.4 | 38.6 | 11.9 |
33 | 97.2 | 1.5 | 30.0 | 9.6 |
34 | 265.6 | 20.0 | 0.3 | 17.4 |
35 | 95.7 | 1.4 | 7.4 | 9.5 |
36 | 290.7 | 4.1 | 8.5 | 12.8 |
37 | 266.9 | 43.8 | 5.0 | 25.4 |
38 | 74.7 | 49.4 | 45.7 | 14.7 |
39 | 43.1 | 26.7 | 35.1 | 10.1 |
40 | 228.0 | 37.7 | 32.0 | 21.5 |
41 | 202.5 | 22.3 | 31.6 | 16.6 |
42 | 177.0 | 33.4 | 38.7 | 17.1 |
43 | 293.6 | 27.7 | 1.8 | 20.7 |
44 | 206.9 | 8.4 | 26.4 | 12.9 |
45 | 25.1 | 25.7 | 43.3 | 8.5 |
46 | 175.1 | 22.5 | 31.5 | 14.9 |
47 | 89.7 | 9.9 | 35.7 | 10.6 |
48 | 239.9 | 41.5 | 18.5 | 23.2 |
49 | 227.2 | 15.8 | 49.9 | 14.8 |
50 | 66.9 | 11.7 | 36.8 | 9.7 |
51 | 199.8 | 3.1 | 34.6 | 11.4 |
52 | 100.4 | 9.6 | 3.6 | 10.7 |
53 | 216.4 | 41.7 | 39.6 | 22.6 |
54 | 182.6 | 46.2 | 58.7 | 21.2 |
55 | 262.7 | 28.8 | 15.9 | 20.2 |
56 | 198.9 | 49.4 | 60.0 | 23.7 |
57 | 7.3 | 28.1 | 41.4 | 5.5 |
58 | 136.2 | 19.2 | 16.6 | 13.2 |
59 | 210.8 | 49.6 | 37.7 | 23.8 |
60 | 210.7 | 29.5 | 9.3 | 18.4 |
61 | 53.5 | 2.0 | 21.4 | 8.1 |
62 | 261.3 | 42.7 | 54.7 | 24.2 |
63 | 239.3 | 15.5 | 27.3 | 15.7 |
64 | 102.7 | 29.6 | 8.4 | 14.0 |
65 | 131.1 | 42.8 | 28.9 | 18.0 |
66 | 69.0 | 9.3 | 0.9 | 9.3 |
67 | 31.5 | 24.6 | 2.2 | 9.5 |
68 | 139.3 | 14.5 | 10.2 | 13.4 |
69 | 237.4 | 27.5 | 11.0 | 18.9 |
70 | 216.8 | 43.9 | 27.2 | 22.3 |
71 | 199.1 | 30.6 | 38.7 | 18.3 |
72 | 109.8 | 14.3 | 31.7 | 12.4 |
73 | 26.8 | 33.0 | 19.3 | 8.8 |
74 | 129.4 | 5.7 | 31.3 | 11.0 |
75 | 213.4 | 24.6 | 13.1 | 17.0 |
76 | 16.9 | 43.7 | 89.4 | 8.7 |
77 | 27.5 | 1.6 | 20.7 | 6.9 |
78 | 120.5 | 28.5 | 14.2 | 14.2 |
79 | 5.4 | 29.9 | 9.4 | 5.3 |
80 | 116.0 | 7.7 | 23.1 | 11.0 |
81 | 76.4 | 26.7 | 22.3 | 11.8 |
82 | 239.8 | 4.1 | 36.9 | 12.3 |
83 | 75.3 | 20.3 | 32.5 | 11.3 |
84 | 68.4 | 44.5 | 35.6 | 13.6 |
85 | 213.5 | 43.0 | 33.8 | 21.7 |
86 | 193.2 | 18.4 | 65.7 | 15.2 |
87 | 76.3 | 27.5 | 16.0 | 12.0 |
88 | 110.7 | 40.6 | 63.2 | 16.0 |
89 | 88.3 | 25.5 | 73.4 | 12.9 |
90 | 109.8 | 47.8 | 51.4 | 16.7 |
91 | 134.3 | 4.9 | 9.3 | 11.2 |
92 | 28.6 | 1.5 | 33.0 | 7.3 |
93 | 217.7 | 33.5 | 59.0 | 19.4 |
94 | 250.9 | 36.5 | 72.3 | 22.2 |
95 | 107.4 | 14.0 | 10.9 | 11.5 |
96 | 163.3 | 31.6 | 52.9 | 16.9 |
97 | 197.6 | 3.5 | 5.9 | 11.7 |
98 | 184.9 | 21.0 | 22.0 | 15.5 |
99 | 289.7 | 42.3 | 51.2 | 25.4 |
100 | 135.2 | 41.7 | 45.9 | 17.2 |
101 | 222.4 | 4.3 | 49.8 | 11.7 |
102 | 296.4 | 36.3 | 100.9 | 23.8 |
103 | 280.2 | 10.1 | 21.4 | 14.8 |
104 | 187.9 | 17.2 | 17.9 | 14.7 |
105 | 238.2 | 34.3 | 5.3 | 20.7 |
106 | 137.9 | 46.4 | 59.0 | 19.2 |
107 | 25.0 | 11.0 | 29.7 | 7.2 |
108 | 90.4 | 0.3 | 23.2 | 8.7 |
109 | 13.1 | 0.4 | 25.6 | 5.3 |
110 | 255.4 | 26.9 | 5.5 | 19.8 |
111 | 225.8 | 8.2 | 56.5 | 13.4 |
112 | 241.7 | 38.0 | 23.2 | 21.8 |
113 | 175.7 | 15.4 | 2.4 | 14.1 |
114 | 209.6 | 20.6 | 10.7 | 15.9 |
115 | 78.2 | 46.8 | 34.5 | 14.6 |
116 | 75.1 | 35.0 | 52.7 | 12.6 |
117 | 139.2 | 14.3 | 25.6 | 12.2 |
118 | 76.4 | 0.8 | 14.8 | 9.4 |
119 | 125.7 | 36.9 | 79.2 | 15.9 |
120 | 19.4 | 16.0 | 22.3 | 6.6 |
121 | 141.3 | 26.8 | 46.2 | 15.5 |
122 | 18.8 | 21.7 | 50.4 | 7.0 |
123 | 224.0 | 2.4 | 15.6 | 11.6 |
124 | 123.1 | 34.6 | 12.4 | 15.2 |
125 | 229.5 | 32.3 | 74.2 | 19.7 |
126 | 87.2 | 11.8 | 25.9 | 10.6 |
127 | 7.8 | 38.9 | 50.6 | 6.6 |
128 | 80.2 | 0.0 | 9.2 | 8.8 |
129 | 220.3 | 49.0 | 3.2 | 24.7 |
130 | 59.6 | 12.0 | 43.1 | 9.7 |
131 | 0.7 | 39.6 | 8.7 | 1.6 |
132 | 265.2 | 2.9 | 43.0 | 12.7 |
133 | 8.4 | 27.2 | 2.1 | 5.7 |
134 | 219.8 | 33.5 | 45.1 | 19.6 |
135 | 36.9 | 38.6 | 65.6 | 10.8 |
136 | 48.3 | 47.0 | 8.5 | 11.6 |
137 | 25.6 | 39.0 | 9.3 | 9.5 |
138 | 273.7 | 28.9 | 59.7 | 20.8 |
139 | 43.0 | 25.9 | 20.5 | 9.6 |
140 | 184.9 | 43.9 | 1.7 | 20.7 |
141 | 73.4 | 17.0 | 12.9 | 10.9 |
142 | 193.7 | 35.4 | 75.6 | 19.2 |
143 | 220.5 | 33.2 | 37.9 | 20.1 |
144 | 104.6 | 5.7 | 34.4 | 10.4 |
145 | 96.2 | 14.8 | 38.9 | 11.4 |
146 | 140.3 | 1.9 | 9.0 | 10.3 |
147 | 240.1 | 7.3 | 8.7 | 13.2 |
148 | 243.2 | 49.0 | 44.3 | 25.4 |
149 | 38.0 | 40.3 | 11.9 | 10.9 |
150 | 44.7 | 25.8 | 20.6 | 10.1 |
151 | 280.7 | 13.9 | 37.0 | 16.1 |
152 | 121.0 | 8.4 | 48.7 | 11.6 |
153 | 197.6 | 23.3 | 14.2 | 16.6 |
154 | 171.3 | 39.7 | 37.7 | 19.0 |
155 | 187.8 | 21.1 | 9.5 | 15.6 |
156 | 4.1 | 11.6 | 5.7 | 3.2 |
157 | 93.9 | 43.5 | 50.5 | 15.3 |
158 | 149.8 | 1.3 | 24.3 | 10.1 |
159 | 11.7 | 36.9 | 45.2 | 7.3 |
160 | 131.7 | 18.4 | 34.6 | 12.9 |
161 | 172.5 | 18.1 | 30.7 | 14.4 |
162 | 85.7 | 35.8 | 49.3 | 13.3 |
163 | 188.4 | 18.1 | 25.6 | 14.9 |
164 | 163.5 | 36.8 | 7.4 | 18.0 |
165 | 117.2 | 14.7 | 5.4 | 11.9 |
166 | 234.5 | 3.4 | 84.8 | 11.9 |
167 | 17.9 | 37.6 | 21.6 | 8.0 |
168 | 206.8 | 5.2 | 19.4 | 12.2 |
169 | 215.4 | 23.6 | 57.6 | 17.1 |
170 | 284.3 | 10.6 | 6.4 | 15.0 |
171 | 50.0 | 11.6 | 18.4 | 8.4 |
172 | 164.5 | 20.9 | 47.4 | 14.5 |
173 | 19.6 | 20.1 | 17.0 | 7.6 |
174 | 168.4 | 7.1 | 12.8 | 11.7 |
175 | 222.4 | 3.4 | 13.1 | 11.5 |
176 | 276.9 | 48.9 | 41.8 | 27.0 |
177 | 248.4 | 30.2 | 20.3 | 20.2 |
178 | 170.2 | 7.8 | 35.2 | 11.7 |
179 | 276.7 | 2.3 | 23.7 | 11.8 |
180 | 165.6 | 10.0 | 17.6 | 12.6 |
181 | 156.6 | 2.6 | 8.3 | 10.5 |
182 | 218.5 | 5.4 | 27.4 | 12.2 |
183 | 56.2 | 5.7 | 29.7 | 8.7 |
184 | 287.6 | 43.0 | 71.8 | 26.2 |
185 | 253.8 | 21.3 | 30.0 | 17.6 |
186 | 205.0 | 45.1 | 19.6 | 22.6 |
187 | 139.5 | 2.1 | 26.6 | 10.3 |
188 | 191.1 | 28.7 | 18.2 | 17.3 |
189 | 286.0 | 13.9 | 3.7 | 15.9 |
190 | 18.7 | 12.1 | 23.4 | 6.7 |
191 | 39.5 | 41.1 | 5.8 | 10.8 |
192 | 75.5 | 10.8 | 6.0 | 9.9 |
193 | 17.2 | 4.1 | 31.6 | 5.9 |
194 | 166.8 | 42.0 | 3.6 | 19.6 |
195 | 149.7 | 35.6 | 6.0 | 17.3 |
196 | 38.2 | 3.7 | 13.8 | 7.6 |
197 | 94.2 | 4.9 | 8.1 | 9.7 |
198 | 177.0 | 9.3 | 6.4 | 12.8 |
199 | 283.6 | 42.0 | 66.2 | 25.5 |
200 | 232.1 | 8.6 | 8.7 | 13.4 |
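After processing data, R lets users export it so that others can use it on other platforms. write.table() exports data in txt or csv format to a specified directory. For example, first load an existing file: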
vshare<-here::here("data","voteshare")
vs<-scan(vshare, comment.char = '#', dec='.')
vs
## [1] 55.6 66.1 36.8 65.1 50.9 44.9 48.7 52.4 48.5 53.0 51.9
scan(vshare, comment.char = '#', dec='.')
## [1] 55.6 66.1 36.8 65.1 50.9 44.9 48.7 52.4 48.5 53.0 51.9
vsnew<-c(vs, 61.9, 31.8, 44.5)
vsnew
## [1] 55.6 66.1 36.8 65.1 50.9 44.9 48.7 52.4 48.5 53.0 51.9 61.9 31.8 44.5
write.table(vsnew,'vsnew.txt')
read.table('vsnew.txt')
## x
## 1 55.6
## 2 66.1
## 3 36.8
## 4 65.1
## 5 50.9
## 6 44.9
## 7 48.7
## 8 52.4
## 9 48.5
## 10 53.0
## 11 51.9
## 12 61.9
## 13 31.8
## 14 44.5
de<-data.frame(name=state.abb, region=state.region, area=state.area)
region.a<-substr(state.region, 1,1)
region.a
## [1] "S" "W" "W" "S" "W" "W" "N" "S" "S" "S" "W" "W" "N" "N" "N" "N" "S" "S" "N"
## [20] "S" "N" "N" "N" "S" "N" "W" "N" "W" "N" "N" "W" "N" "S" "N" "N" "S" "W" "N"
## [39] "N" "S" "N" "S" "S" "W" "N" "S" "W" "S" "N" "W"
de <- data.frame(de, region.short=as.factor(region.a))
head(de)
## name region area region.short
## 1 AL South 51609 S
## 2 AK West 589757 W
## 3 AZ West 113909 W
## 4 AR South 53104 S
## 5 CA West 158693 W
## 6 CO West 104247 W
write.csv(de, 'state.csv', row.names = F)
state<-read.csv('state.csv', header=TRUE)
head(state)
## name region area region.short
## 1 AL South 51609 S
## 2 AK West 589757 W
## 3 AZ West 113909 W
## 4 AR South 53104 S
## 5 CA West 158693 W
## 6 CO West 104247 W
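The role of row.names in write.csv() is easiest to see in a round trip. The sketch below writes a small made-up data frame to temporary files, so nothing is left in the working directory; the object names are hypothetical.

```r
# A tiny data frame and two temporary output files
small <- data.frame(name = c("AL", "AK"), area = c(51609, 589757))
no_rownames   <- tempfile(fileext = ".csv")
with_rownames <- tempfile(fileext = ".csv")

# Without row names the file contains only the two variables
write.csv(small, no_rownames, row.names = FALSE)
back <- read.csv(no_rownames, header = TRUE)
names(back)

# With the default row.names = TRUE an extra first column appears on re-import
write.csv(small, with_rownames)
names(read.csv(with_rownames, header = TRUE))
```

The extra column is named X when the file is read back, which is why row.names = FALSE is usually the safer choice for data that will be re-imported.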
R stores every variable created at the command line in the global environment. globalenv() returns this environment, and ls() lists the objects in it:
globalenv()
## <environment: R_GlobalEnv>
ls(envir = globalenv(),10)
## [1] "crx" "CSV" "csv1" "csv2" "csvdata" "data"
## [7] "de" "df" "df1" "dt" "Dta" "dv"
## [13] "file" "lattice" "my" "ndt" "p" "pp0797b2"
## [19] "PP1697C1" "qurl" "region.a" "state" "students" "ten"
## [25] "tmp" "tmp_html" "udata" "udata1" "udata2" "udata4"
## [31] "vs" "vshare" "vsnew"
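After attach(), the variables of a data frame are directly visible to R. However, attach() does not save modified data, so remember to export the data or to record the changes in a script. For example: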
head(csv2)
## Year budget unit contracter open
## 1 2015 676 水利處 台球 Yes
## 2 2016 673 新建工程處 茂盛 Yes
## 3 2016 270 新建工程處 冠君 Yes
## 4 2016 255 新建工程處 金煌 Yes
## 5 2016 235 新建工程處 聖鋒 Yes
## 6 2016 190 新建工程處 福呈 No
attach(csv2)
contracter
## [1] "台球" "茂盛" "冠君" "金煌" "聖鋒" "福呈" "盛吉" "茂盛"
## [9] "冠君" "未發包"
contracter[1]<-"未發包"
csv2$contracter[10]<-"台球"
csv2
## Year budget unit contracter open
## 1 2015 676 水利處 台球 Yes
## 2 2016 673 新建工程處 茂盛 Yes
## 3 2016 270 新建工程處 冠君 Yes
## 4 2016 255 新建工程處 金煌 Yes
## 5 2016 235 新建工程處 聖鋒 Yes
## 6 2016 190 新建工程處 福呈 No
## 7 2015 155 公園處 盛吉 Yes
## 8 2016 154 新建工程處 茂盛 Yes
## 9 2016 142 新建工程處 冠君 Yes
## 10 2016 123 新建工程處 台球 Yes
detach(csv2)
csv2
## Year budget unit contracter open
## 1 2015 676 水利處 台球 Yes
## 2 2016 673 新建工程處 茂盛 Yes
## 3 2016 270 新建工程處 冠君 Yes
## 4 2016 255 新建工程處 金煌 Yes
## 5 2016 235 新建工程處 聖鋒 Yes
## 6 2016 190 新建工程處 福呈 No
## 7 2015 155 公園處 盛吉 Yes
## 8 2016 154 新建工程處 茂盛 Yes
## 9 2016 142 新建工程處 冠君 Yes
## 10 2016 123 新建工程處 台球 Yes
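The example above shows that changing an element of the attached vector alone, rather than an element of the vector inside the data frame, does not actually change the data frame. Once the data frame itself has been changed, however, the change persists even after the data set is detached with detach().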
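The same behavior can be reproduced with a three-row data frame. This is a minimal sketch with made-up data; df and x are hypothetical names.

```r
df <- data.frame(x = c(1L, 2L, 3L))

attach(df)
x[1] <- 99L          # creates a modified copy named x in the workspace, not in df
detach(df)

unchanged <- df$x    # the data frame still holds 1, 2, 3
df$x[1] <- 99L       # assigning through df$ changes the data frame itself
df$x
```

The assignment to x after attach() leaves a separate vector x in the workspace, which is one reason many users prefer with() or explicit df$x references over attach().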
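detach(): removes an attached data frame or vector from the search path to avoid confusion.
rm(list=ls()): removes all vectors, lists, data frames, and other objects from the workspace.
rm(): removes specific vectors, lists, data frames, and other objects.
save.image(): saves all data and results in the workspace.
load(): loads saved data and results back into the workspace.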
rm(list=ls()) #remove all data
data(mtcars) #suppose we analyze mtcars
m1<-lm(mpg ~ cyl, data=mtcars) #regression
summary(m1) #results
##
## Call:
## lm(formula = mpg ~ cyl, data = mtcars)
##
## Residuals:
## Min 1Q Median 3Q Max
## -4.981 -2.119 0.222 1.072 7.519
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 37.885 2.074 18.27 < 2e-16 ***
## cyl -2.876 0.322 -8.92 6.1e-10 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 3.21 on 30 degrees of freedom
## Multiple R-squared: 0.726, Adjusted R-squared: 0.717
## F-statistic: 79.6 on 1 and 30 DF, p-value: 6.11e-10
mydata<-data.frame(date=as.Date(c("2018-03-13",
"2018-03-14","2018-03-15"),
format='%Y-%m-%d'),
workinghours=c(4, 3, 4)) #create your own data
save.image("test.Rdata") #save all results to Rdata
rm(list=ls()) #remove all data
load("test.Rdata") #load Rdata
ls(envir = globalenv(),10) #display objects in this environment
## [1] "m1" "mtcars" "mydata"
mydata #display your data
## date workinghours
## 1 2018-03-13 4
## 2 2018-03-14 3
## 3 2018-03-15 4
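saveRDS(): saves a single object as an RDS file. Because the file does not store the original object name, saveRDS() is worth considering when you want to read the object back under a different name.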
vshare<-here::here('data','voteshare')
vs<-scan(vshare, comment.char = '#', dec='.')
vs
[1] 55.6 66.1 36.8 65.1 50.9 44.9 48.7 52.4 48.5 53.0 51.9
vs2<-vs/100
saveRDS(vs, "vs.rds")
saveRDS(vs2, 'vs2.rds')
rm(vs); rm(vs2)
vs<-readRDS('vs.rds')
vs2<-readRDS('vs2.rds')
vs; vs2
[1] 55.6 66.1 36.8 65.1 50.9 44.9 48.7 52.4 48.5 53.0 51.9
[1] 0.556 0.661 0.368 0.651 0.509 0.449 0.487 0.524 0.485 0.530 0.519
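The advantage of saveRDS() is that, although it saves only one object at a time, the saved copy keeps a new object from overwriting the old one, so old and new objects can coexist.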
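A short sketch of this point, using a temporary file and hypothetical object names: the saved version is read back under a new name after the original object has been overwritten.

```r
rds_file <- tempfile(fileext = ".rds")

vs_old <- c(55.6, 66.1, 36.8)
saveRDS(vs_old, rds_file)          # keep a copy of the original values on disk

vs_old <- vs_old / 100             # overwrite the object in the workspace
vs_restored <- readRDS(rds_file)   # read the saved copy back under a new name

vs_old
vs_restored
```

Because readRDS() returns the object instead of restoring it under its old name, the caller decides what the restored object is called.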
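print(): displays data frames, vectors, lists, and so on, but cannot attach extra text.
source(): R can read a file of existing commands and run many lines of code without opening the script, which saves space and time. For example, if a user-defined function has long code, we can save it in a script file and run it directly later.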
sink("twohistograms.R") #define a new script file
cat("set.seed(02138)") #input a function that sets starting number for random number
cat("\n") #end of line
cat("#write R script to a file without opening a document")
cat("\n") #end of line
cat("fnorm<-function(mu){ #create a function with a parameter: mu
sample.o<-rnorm(20,mu,1/sqrt(mu)) #define the 1st vector that generates random numbers
sample.i<-sample.o+runif(1,0,10) #define the 2nd vector that generates random numbers
par(mfrow=c(1,2)) #set parameter of graphic for 1*2 graphics
hist(sample.o, col=1, main='', #histogram with Basic R
xlab='Original sample')
hist(sample.i, col=4, main='', #another histogram
xlab='Original sample + random number')
}")
cat("\n") #end of function
sink() #save the script in the specified file
file.show("twohistograms.R") #Opening an editor to show the script
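We create the function fnorm(), save it in a script file ("twohistograms.R"), and display the file with file.show(). From then on the script can be run at any time.
Running the script file "twohistograms.R" with source() creates the user-defined function, and supplying a parameter then displays the result. After running the commands above, enter the following two lines yourself: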
source("twohistograms.R")
fnorm(1)
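If the code runs successfully, a figure with two histograms side by side appears.
Check that the working directory now contains the script file "twohistograms.R".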
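The sink() and cat() approach above writes the script line by line. As an alternative sketch, writeLines() can write a small script in one call. The file is temporary and the function square_sum() is a hypothetical helper, not part of the course material.

```r
# Write a two-line script to a temporary file
script_file <- tempfile(fileext = ".R")
writeLines(c(
  "# a hypothetical helper defined in a script file",
  "square_sum <- function(x) sum(x^2)"
), script_file)

# source() runs the file, which defines square_sum() in the workspace
source(script_file)
square_sum(c(1, 2, 3))
```

After source() has run, the function behaves exactly as if its definition had been typed at the console.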
with(): when the workspace contains more than one data frame, use this command to state which one an analysis refers to and avoid confusion:
par(mfrow=c(1,2))
library(car)
with(Duncan, hist(income, col=2))
with(Salaries, hist(salary, col=6))
Figure 5.1: Histograms of two similarly named variables
names(): displays the variable names of a data frame, for example:
names(mtcars)
[1] "mpg" "cyl" "disp" "hp" "drat" "wt" "qsec" "vs" "am" "gear"
[11] "carb"
Note that this command does not work on matrices such as state.x77.
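For a matrix, colnames() does the job that names() does for a data frame. The short sketch below uses the built-in state.x77 matrix to show the difference.

```r
is.matrix(state.x77)       # TRUE: state.x77 is a matrix, not a data frame

names(state.x77)           # NULL, because a matrix has no names attribute here
colnames(state.x77)        # the column names of the matrix

# Converting to a data frame makes names() work again
state_df <- as.data.frame(state.x77)
names(state_df)
```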
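which(): returns the positions of the elements that satisfy a condition. For example, which trees have a circumference that meets the condition: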
which(Orange$circumference>100)
[1] 4 5 6 7 10 11 12 13 14 18 19 20 21 24 25 26 27 28 32 33 34 35
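Quite a few trees appear to have a circumference over 100 millimeters (10 centimeters), but which trees exactly meet this condition? The which() function can be used to select them: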
oc<-which(Orange$circumference>100) #create a vector
#of data that meets a condition
oc
## [1] 4 5 6 7 10 11 12 13 14 18 19 20 21 24 25 26 27 28 32 33 34 35
Orange[oc,] #match data with the vector
## Tree age circumference
## 4 1 1004 115
## 5 1 1231 120
## 6 1 1372 142
## 7 1 1582 145
## 10 2 664 111
## 11 2 1004 156
## 12 2 1231 172
## 13 2 1372 203
## 14 2 1582 203
## 18 3 1004 108
## 19 3 1231 115
## 20 3 1372 139
## 21 3 1582 140
## 24 4 664 112
## 25 4 1004 167
## 26 4 1231 179
## 27 4 1372 209
## 28 4 1582 214
## 32 5 1004 125
## 33 5 1231 142
## 34 5 1372 174
## 35 5 1582 177
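oc holds the positions of the observations whose circumference exceeds 100 millimeters. Indexing the data frame with these positions keeps only the matching rows.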
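Rows can also be selected with the logical condition itself, without calling which() first. The sketch below compares the two approaches on the built-in Orange data; by_index and by_logical are hypothetical names.

```r
# Positions of the rows that meet the condition
oc <- which(Orange$circumference > 100)
by_index <- Orange[oc, ]

# The logical vector can be used as the row index directly
by_logical <- Orange[Orange$circumference > 100, ]

identical(by_index, by_logical)
nrow(by_index)
```

The two results agree here because circumference has no missing values. When the condition evaluates to NA for some rows, which() drops those rows while logical indexing returns rows filled with NA.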
rep(A, n): repeats the value or string A n times:
rep(3, 5)
## [1] 3 3 3 3 3
c(rep("大", 3), rep("中", 1), rep("小",2))
## [1] "大" "大" "大" "中" "小" "小"
seq(i,j): returns the consecutive numbers from i to j:
seq(1,10)
## [1] 1 2 3 4 5 6 7 8 9 10
seq(100,110, by=2)
## [1] 100 102 104 106 108 110
seq(i:j): returns the positions 1, 2, ... of the elements of i:j, not the values themselves:
seq(5:10)
## [1] 1 2 3 4 5 6
seq(100:110)
## [1] 1 2 3 4 5 6 7 8 9 10 11
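The difference between the two forms is easy to overlook, so here is a small side-by-side sketch. seq_along() is the base R function that states the "positions" intent explicitly.

```r
values <- 5:10

seq(values)         # positions of the elements: 1 to 6
seq_along(values)   # the same positions, with the intent spelled out
seq(5, 10)          # the values themselves: 5 to 10
```

Note that seq() called with a single number, as in seq(3), returns 1 2 3, so seq_along() is the safer choice when the argument may have length one.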
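grep(): returns the matching elements of a character vector, or the positions (rows) where they occur. For example, we have a character vector of Latvian city names and want to know which names contain the letters pils:
latvija<-c("Daugavpils","Jēkabpils","Jelgava",
"Liepāja","Rēzekne","Rīga","Valmiera",
"Ventspils")
grep("pils", latvija)
## [1] 1 2 8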
latvija[grep("pils", latvija)]
## [1] "Daugavpils" "Jēkabpils" "Ventspils"
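grep() has two close relatives that are often handy. The sketch below uses a shortened, ASCII-only list of city names as a stand-in for the vector above and shows the value argument and grepl().

```r
cities <- c("Daugavpils", "Jelgava", "Riga", "Ventspils")

grep("pils", cities)                 # positions of the matching elements
grep("pils", cities, value = TRUE)   # the matching strings themselves
grepl("pils", cities)                # one TRUE or FALSE per element
```

grepl() is convenient inside a row index or a condition because it always returns a logical vector of the same length as its input.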
Remember the government open data we used earlier? Suppose we want to select the rows for districts (區):
open<-here::here('data','opendata106N0101.csv')
dat<-read.csv(open, header=T)
district<-dat[grep("區", dat$code), ]
head(dat, n=3)
          code 年底人口數 土地面積 人口密度
1 新北市板橋區     551480    23.14    23835
2 新北市三重區     387484    16.32    23747
3 新北市中和區     413590    20.14    20532
grep() can also be applied to lists. Suppose we have data on the attributes of television channels:
L <- list(a=c('lecture', 'movie'), b=c('Movie channel'), c=c(1:10),
d=c('movie','food', "news",'car','music'))
match.s<-grep('movie', L) ; match.s
[1] 1 4
L[grep('movie', L)]
$a
[1] "lecture" "movie"

$d
[1] "movie" "food" "news" "car" "music"
gsub(): replaces the strings that match a pattern. Continuing the example above, suppose we want to change every 「臺」 to 「台」:
library(tidyverse)
#dat2 <-dat[grep("臺", dat$code), ]
dat2 <- dat%>% mutate(code=gsub("臺", "台", dat$code))
dat2[grep('台北市', dat2$code), ]
## code 年底人口數 土地面積 人口密度
## 30 台北市松山區 206988 9.288 22286
## 31 台北市信義區 225753 11.208 20143
## 32 台北市大安區 309969 11.361 27283
## 33 台北市中山區 230710 13.682 16862
## 34 台北市中正區 159608 7.607 20981
## 35 台北市大同區 129278 5.681 22754
## 36 台北市萬華區 191850 8.852 21673
## 37 台北市文山區 274424 31.509 8709
## 38 台北市南港區 122155 21.842 5593
## 39 台北市內湖區 287771 31.579 9113
## 40 台北市士林區 288295 62.368 4622
## 41 台北市北投區 256456 56.822 4513
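substr(): extracts the part of a string between a start position and a stop position. For example, in the data above we want to create a categorical variable for the city or county: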
dat<-read.csv(open, header=T, stringsAsFactors = F)
dat2 <- dat%>% dplyr::mutate(city=substr(dat$code, 1,3))
head(dat2, n=3)
## code 年底人口數 土地面積 人口密度 city
## 1 新北市板橋區 551480 23.14 23835 新北市
## 2 新北市三重區 387484 16.32 23747 新北市
## 3 新北市中和區 413590 20.14 20532 新北市
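The following exercise extracts the city or county of each township or district, drops the Dongsha and Nansha Islands, sorts the result, and plots the cities and counties in order of land area, as in Figure 5.2:
Figure 5.2: Land area of each city and county
☛ Practice drawing a plot of the population of each city and county. (Hint: convert the year-end population (年底人口數) from a string to a number.)
sub() or gsub(): replaces a specified string, for example: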
country<-c( "United States", "Republic of Kenya", "Republic of Korea")
sub('Republic of', '', country)
[1] "United States" " Kenya" " Korea"
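sub() replaces only the first match in each string, whereas gsub() replaces every match, which often makes gsub() more useful. For example: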
U<-matrix(c('文殊蘭花與蝴蝶蘭花','茶花','杜鵑花',
'玫瑰花','菊花','蘭花'), nrow=3, ncol=2)
U
     [,1]                 [,2]
[1,] "文殊蘭花與蝴蝶蘭花" "玫瑰花"
[2,] "茶花"               "菊花"
[3,] "杜鵑花"             "蘭花"
sub('蘭花','蘭', U)
     [,1]               [,2]
[1,] "文殊蘭與蝴蝶蘭花" "玫瑰花"
[2,] "茶花"             "菊花"
[3,] "杜鵑花"           "蘭"
gsub('蘭花','蘭', U)
     [,1]             [,2]
[1,] "文殊蘭與蝴蝶蘭" "玫瑰花"
[2,] "茶花"           "菊花"
[3,] "杜鵑花"         "蘭"
zodiac<-c( "(mouse)", "(ox)", "(tiger)", "(rabbit)", "(dragon)")
zodiac<-sub("\\(","", zodiac)
sub("\\)","", zodiac)
[1] "mouse" "ox" "tiger" "rabbit" "dragon"
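The two sub() calls above remove the opening and the closing parenthesis one after the other. As a sketch of an alternative, a bracket expression lets gsub() remove both in one call, and fixed = TRUE treats the pattern as plain text so that no escaping is needed. The vector is a shortened copy of zodiac.

```r
zodiac <- c("(mouse)", "(ox)", "(tiger)")

# "[()]" matches either parenthesis; gsub() removes every occurrence
gsub("[()]", "", zodiac)

# fixed = TRUE matches the literal character "(" without a backslash
gsub("(", "", zodiac, fixed = TRUE)
```

Inside square brackets the parentheses lose their special meaning, which is why the first pattern needs no backslashes.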
Back to the country-name example:
country<-c( "United States", "Republic of Kenya", "Republic of Korea")
country<-c("People's Republic of China",
"Democratic Republic of Congo",
"United States",
"Republic of Kenya", "Republic of Korea",
"Democratic People's Republic of Korea")
country[grep('^Republic of', country)]
[1] "Republic of Kenya" "Republic of Korea"
To remove "Republic of", we can do this:
gsub("^Republic of", "", country)
[1] "People's Republic of China"
[2] "Democratic Republic of Congo"
[3] "United States"
[4] " Kenya"
[5] " Korea"
[6] "Democratic People's Republic of Korea"
This notation is called a regular expression; the ^ anchors the match to the start of the string. Students interested in other settings can refer to Larry Lu's web page.
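The result above still carries a leading space in " Kenya" and " Korea". Two ways to avoid it are sketched below on a shortened copy of the vector: include the space in the pattern, or clean up afterwards with trimws().

```r
country <- c("United States", "Republic of Kenya",
             "Democratic Republic of Congo")

# Include the trailing space in the pattern
sub("^Republic of ", "", country)

# Or strip leading and trailing whitespace after the replacement
trimws(sub("^Republic of", "", country))
```

Because of the ^ anchor, "Democratic Republic of Congo" is left untouched in both versions.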
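strsplit() splits a piece of text into a vector. For example, to find out which words the United States Declaration of Independence uses most often, we can split the whole text into individual strings and then tabulate them: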
document <- c("When in the Course of human events, it becomes necessary for one people to dissolve the political bands which have connected them with another, and to assume among the powers of the earth, the separate and equal station to which the Laws of Nature and of Nature's God entitle them, a decent respect to the opinions of mankind requires that they should declare the causes which impel them to the separation. We hold these truths to be self-evident, that all men are created equal, that they are endowed by their Creator with certain unalienable Rights, that among these are Life, Liberty and the pursuit of Happiness.--That to secure these rights, Governments are instituted among Men, deriving their just powers from the consent of the governed, --That whenever any Form of Government becomes destructive of these ends, it is the Right of the People to alter or to abolish it, and to institute new Government, laying its foundation on such principles and organizing its powers in such form, as to them shall seem most likely to effect their Safety and Happiness. Prudence, indeed, will dictate that Governments long established should not be changed for light and transient causes; and accordingly all experience hath shewn, that mankind are more disposed to suffer, while evils are sufferable, than to right themselves by abolishing the forms to which they are accustomed. But when a long train of abuses and usurpations, pursuing invariably the same Object evinces a design to reduce them under absolute Despotism, it is their right, it is their duty, to throw off such Government, and to provide new Guards for their future security.--Such has been the patient sufferance of these Colonies; and such is now the necessity which constrains them to alter their former Systems of Government. The history of the present King of Great Britain is a history of repeated injuries and usurpations, all having in direct object the establishment of an absolute Tyranny over these States. 
To prove this, let Facts be submitted to a candid world. He has refused his Assent to Laws, the most wholesome and necessary for the public good. He has forbidden his Governors to pass Laws of immediate and pressing importance, unless suspended in their operation till his Assent should be obtained; and when so suspended, he has utterly neglected to attend to them. He has refused to pass other Laws for the accommodation of large districts of people, unless those people would relinquish the right of Representation in the Legislature, a right inestimable to them and formidable to tyrants only. He has called together legislative bodies at places unusual, uncomfortable, and distant from the depository of their public Records, for the sole purpose of fatiguing them into compliance with his measures. He has dissolved Representative Houses repeatedly, for opposing with manly firmness his invasions on the rights of the people. He has refused for a long time, after such dissolutions, to cause others to be elected; whereby the Legislative powers, incapable of Annihilation, have returned to the People at large for their exercise; the State remaining in the mean time exposed to all the dangers of invasion from without, and convulsions within. He has endeavoured to prevent the population of these States; for that purpose obstructing the Laws for Naturalization of Foreigners; refusing to pass others to encourage their migrations hither, and raising the conditions of new Appropriations of Lands. He has obstructed the Administration of Justice, by refusing his Assent to Laws for establishing Judiciary powers. He has made Judges dependent on his Will alone, for the tenure of their offices, and the amount and payment of their salaries. He has erected a multitude of New Offices, and sent hither swarms of Officers to harrass our people, and eat out their substance. He has kept among us, in times of peace, Standing Armies without the Consent of our legislatures. 
He has affected to render the Military independent of and superior to the Civil power. He has combined with others to subject us to a jurisdiction foreign to our constitution, and unacknowledged by our laws; giving his Assent to their Acts of pretended Legislation: For Quartering large bodies of armed troops among us: For protecting them, by a mock Trial, from punishment for any Murders which they should commit on the Inhabitants of these States: For cutting off our Trade with all parts of the world: For imposing Taxes on us without our Consent: For depriving us in many cases, of the benefits of Trial by Jury: For transporting us beyond Seas to be tried for pretended offences. For abolishing the free System of English Laws in a neighbouring Province, establishing therein an Arbitrary government, and enlarging its Boundaries so as to render it at once an example and fit instrument for introducing the same absolute rule into these Colonies: For taking away our Charters, abolishing our most valuable Laws, and altering fundamentally the Forms of our Governments: For suspending our own Legislatures, and declaring themselves invested with power to legislate for us in all cases whatsoever. He has abdicated Government here, by declaring us out of his Protection and waging War against us. He has plundered our seas, ravaged our Coasts, burnt our towns, and destroyed the lives of our people. He is at this time transporting large Armies of foreign Mercenaries to compleat the works of death, desolation and tyranny, already begun with circumstances of Cruelty & perfidy scarcely paralleled in the most barbarous ages, and totally unworthy the Head of a civilized nation. He has constrained our fellow Citizens taken Captive on the high Seas to bear Arms against their Country, to become the executioners of their friends and Brethren, or to fall themselves by their Hands. 
He has excited domestic insurrections amongst us, and has endeavoured to bring on the inhabitants of our frontiers, the merciless Indian Savages, whose known rule of warfare, is an undistinguished destruction of all ages, sexes and conditions. In every stage of these Oppressions We have Petitioned for Redress in the most humble terms: Our repeated Petitions have been answered only by repeated injury. A Prince whose character is thus marked by every act which may define a Tyrant, is unfit to be the ruler of a free people. Nor have We been wanting in attentions to our Brittish brethren. We have warned them from time to time of attempts by their legislature to extend an unwarrantable jurisdiction over us. We have reminded them of the circumstances of our emigration and settlement here. We have appealed to their native justice and magnanimity, and we have conjured them by the ties of our common kindred to disavow these usurpations, which, would inevitably interrupt our connections and correspondence. They too have been deaf to the voice of justice and of consanguinity. We must, therefore, acquiesce in the necessity, which denounces our Separation, and hold them, as we hold the rest of mankind, Enemies in War, in Peace Friends. We, therefore, the Representatives of the united States of America, in General Congress, Assembled, appealing to the Supreme Judge of the world for the rectitude of our intentions, do, in the Name, and by Authority of the good People of these Colonies, solemnly publish and declare, That these United Colonies are, and of Right ought to be Free and Independent States; that they are Absolved from all Allegiance to the British Crown,
")
doc<-c("and that all political connection between them and the State of Great Britain, is and ought to be totally dissolved; and that as Free and Independent States, they have full Power to levy War, conclude Peace, contract Alliances, establish Commerce, and to do all other Acts and Things which Independent States may of right do. And for the support of this Declaration, with a firm reliance on the protection of divine Providence, we mutually pledge to each other our Lives, our Fortunes and our sacred Honor.")
tmp<-strsplit(c(document,doc), split=" ")
Once the text has been split into individual strings, we can count how often each word of interest appears.
## # A tibble: 598 x 2
## W Count
## <fct> <int>
## 1 the 73
## 2 of 72
## 3 to 60
## 4 and 47
## 5 our 22
## 6 has 20
## 7 their 20
## 8 for 19
## 9 he 19
## 10 in 18
## # … with 588 more rows
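The code that produced the table above is not shown, so the following is only a base R sketch of one way to count words, not the author's method. It works on a single sentence from the text; passage, words, and counts are hypothetical names.

```r
passage <- "We hold these truths to be self-evident, that all men are created equal, that they are endowed"

# Split on spaces; strsplit() returns a list, so take its first element
words <- strsplit(passage, split = " ")[[1]]

# Lower-case the words and drop punctuation so that "equal," counts as "equal"
words <- tolower(gsub("[[:punct:]]", "", words))

# Tabulate and sort from most to least frequent
counts <- sort(table(words), decreasing = TRUE)
head(counts, 3)
```

Removing punctuation and case differences before tabulating matters because strsplit() on a space leaves commas and periods attached to the words.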
cat(): displays vectors and computed results, can add text, and starts a new line with "\n":
x<-c(2,4,6)
cat(x, "\n");
2 4 6
cat("summation:", sum(x), "\n", "average:", mean(x))
summation: 12
 average: 4
read.csv()
。
db <- data.table::data.table(salary=c('42,000','55,000','45,000','66,000', '65,000'),
years=c(3,4,3,5,5), bonus=c(5000,4000,5000,6000,5000))
db
## salary years bonus
## 1: 42,000 3 5000
## 2: 55,000 4 4000
## 3: 45,000 3 5000
## 4: 66,000 5 6000
## 5: 65,000 5 5000
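The salary column above is stored as text because of the thousands separators. The source does not show the next step, so the following is only a sketch of one plausible way to turn such strings into numbers; the vector is a shortened stand-in for db$salary.

```r
salary <- c("42,000", "55,000", "45,000")

# Remove the commas first, then convert the cleaned strings to numbers
salary_num <- as.numeric(gsub(",", "", salary))

salary_num
mean(salary_num)
```

Calling as.numeric() directly on strings such as "42,000" would return NA with a warning, which is why the commas are removed first.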
Last updated: 02/22/2021