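This week's class introduces R's data types and data structures, that is, the objects R can read and compute on. For example, c() represents a collection of numbers or strings: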
A<-c("台北市","新北市", "桃園市", "台中市","台南市","高雄市")
print(A)
[1] "台北市" "新北市" "桃園市" "台中市" "台南市" "高雄市"
B<-c(0,1,2,3,4,5,6,7,8,9)
print(B)
[1] 0 1 2 3 4 5 6 7 8 9
A is a collection of strings and B is a collection of numbers. Other commonly used data types include logical values and dates.
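R's data structures can be one-dimensional, two-dimensional, or multi-dimensional.
We start from here to learn about R's data types and operations.
Before turning to R's data types, let us first get to know R's functions.
R is an object-oriented language. Object-oriented means that the user gives the system a name for a function, vector, matrix, loop, and so on, or bundles code and data into one object; the system stores the object in the working environment, where the user can call, run, or modify it. Users can then stack objects like building blocks to reach the result they want.
R's environment holds all of these objects, and the function ls() lists them. For example: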
[1] "a" "A" "b" "B" "f"
[1] 4
[1] 25
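The code behind the output above is not shown. As a minimal sketch, assuming hypothetical objects a, A, b, B and a squaring function f (only the names come from the listing; their contents are guesses), the following stores five objects in the environment and then calls one of them:

```r
# Hypothetical objects; only the names are taken from the listing above.
a <- 1
A <- "one"
b <- 2
B <- "two"
f <- function(x) x^2   # assumed definition: squares its argument

ls()    # lists the names of the objects in the current environment
f(2)    # 4
f(5)    # 25
```

Because every object has a name, it can be called again, overwritten, or passed to another function later in the session.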
R's function for the mean is:
?mean
mean(x, trim = 0, na.rm = FALSE)
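x is the variable, and trim sets the fraction of observations to drop from each end of the sorted data before the mean is computed. For example, take the variable: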
x<-c(100000, 10000000, c(1:10))
sort(x)
[1] 1e+00 2e+00 3e+00 4e+00 5e+00 6e+00 7e+00 8e+00 9e+00 1e+01 1e+05 1e+07
mean(x)
[1] 841671
mean(x, trim=0.1)
[1] 10005
mean(sort(x)[2:11])
[1] 10005
mean(x, trim=0.2)
[1] 6.5
mean(sort(x)[3:10])
[1] 6.5
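The comparisons above suggest how trim works: sort the data, drop the same number of values from each end, and average the rest. A minimal sketch of that logic, assuming a hypothetical helper named trimmed_mean (not part of base R) and a trim value below 0.5:

```r
# Hand-rolled trimmed mean, for illustration only (hypothetical helper).
# Assumes 0 <= trim < 0.5 and no missing values in x.
trimmed_mean <- function(x, trim = 0) {
  n <- length(x)
  k <- floor(n * trim)             # number of values dropped from each end
  mean(sort(x)[(k + 1):(n - k)])   # average of what remains
}

x <- c(100000, 10000000, 1:10)
trimmed_mean(x, 0.1)   # same value as mean(x, trim = 0.1)
trimmed_mean(x, 0.2)   # same value as mean(x, trim = 0.2)
```

With 12 observations, trim = 0.1 drops floor(1.2) = 1 value per side and trim = 0.2 drops floor(2.4) = 2, which matches the index ranges 2:11 and 3:10 used above.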
X<-c(2, 4, 6, 8); X
[1] 2 4 6 8
class(X)
[1] "numeric"
str(X)
num [1:4] 2 4 6 8
y=c(1.1e+06); y
[1] 1100000
class(y)
[1] "numeric"
u<-as.integer(c(4)); class(u)
[1] "integer"
a<-c(7, 8.5, 9); class(a)
[1] "numeric"
b<-as.integer(a); b
[1] 7 8 9
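R reports the variable's type along with the matching result; when no element satisfies the condition, it returns an empty vector such as integer(0):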
h<-c(100, 200, 500)
ok<-h>300
b[ok]
[1] 9
ok<-h>1000
b[ok]
integer(0)
state.abb
class(state.abb)
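Figure 2.1: Dot plot of U.S. state populations
Figure 2.2: Dot plot of state populations after sorting
state.x77 is a matrix, so we extract the population column from it, combine it with state.abb into a data frame, and then draw a dot plot with dotplot or dotchart.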
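The plotting code is not shown in this section. A minimal sketch of the steps just described, using base R's dotchart; the object names pop and pop_sorted and the axis label are made up for this example:

```r
# state.x77 and state.abb ship with base R (package datasets).
pop <- data.frame(state = state.abb,
                  population = state.x77[, "Population"],
                  stringsAsFactors = FALSE)

# Dot plot in the original (alphabetical) order
dotchart(pop$population, labels = pop$state, cex = 0.6,
         xlab = "Population (thousands)")

# Sort by population first, then plot again
pop_sorted <- pop[order(pop$population), ]
dotchart(pop_sorted$population, labels = pop_sorted$state, cex = 0.6,
         xlab = "Population (thousands)")
```

Sorting the data frame before plotting is what turns the first figure into the second; lattice's dotplot would need a formula interface instead.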
char1<-c("1","2","3","4","5"); char1
[1] "1" "2" "3" "4" "5"
char2<-c(1, 2, "文字"); char2
[1] "1" "2" "文字"
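A string such as "文字" cannot be converted to a number with as.numeric (the result is NA), but categories can be recoded as numbers with code (see the section on factors).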
library(foreign); library(tidyverse)
file<-here::here('data','opendata106N0101.csv')
opendf<-read.csv(file, header=T,
sep=',')
str(opendf)
## 'data.frame': 375 obs. of 4 variables:
## $ code : chr "新北市板橋區" "新北市三重區" "新北市中和區" "新北市永和區" ...
## $ 年底人口數: chr "551480" "387484" "413590" "222585" ...
## $ 土地面積 : num 23.14 16.32 20.14 5.71 19.74 ...
## $ 人口密度 : chr "23835" "23747" "20532" "38956" ...
file<-here::here('data','opendata106N0101.csv')
dat<-read.csv(file, header=T, stringsAsFactors = F)
nrow(dat) #check how many rows; n=375
[1] 375
dat <- dat[-c(369:375),] #delete the rows of small islands and notes
head(dat, n=3)
code 年底人口數 土地面積 人口密度
1 新北市板橋區 551480 23.14 23835
2 新北市三重區 387484 16.32 23747
3 新北市中和區 413590 20.14 20532
dat <- dat %>% mutate(popu=as.numeric(年底人口數))
str(dat)
## 'data.frame': 368 obs. of 5 variables:
## $ code : chr "新北市板橋區" "新北市三重區" "新北市中和區" "新北市永和區" ...
## $ 年底人口數: chr "551480" "387484" "413590" "222585" ...
## $ 土地面積 : num 23.14 16.32 20.14 5.71 19.74 ...
## $ 人口密度 : chr "23835" "23747" "20532" "38956" ...
## $ popu : num 551480 387484 413590 222585 416524 ...
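Columns of digits are often read in as character when the file contains note rows or symbols. As a minimal sketch on a toy data frame (the column names district and pop_chr and the "-" placeholder are assumptions, not taken from the CSV file), as.numeric converts the clean entries and turns anything else into NA:

```r
# Toy data frame; names and values are made up for this example.
df <- data.frame(district = c("A", "B", "C"),
                 pop_chr  = c("551480", "387484", "-"),
                 stringsAsFactors = FALSE)

class(df$pop_chr)    # "character"

# as.numeric() warns about entries it cannot parse; they become NA
df$popu <- suppressWarnings(as.numeric(df$pop_chr))

class(df$popu)               # "numeric"
is.na(df$popu)               # FALSE FALSE TRUE
sum(df$popu, na.rm = TRUE)   # arithmetic now works on the converted column
```

Checking is.na() after the conversion is a quick way to find the rows that still need cleaning.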
x=c("Yes","No","No","Yes","Yes"); x
[1] "Yes" "No"  "No"  "Yes" "Yes"
factor(x)
[1] Yes No  No  Yes Yes
Levels: No Yes
table(x)
x
No Yes
2 3
The factor() function converts x into a factor with two categories, No and Yes.
library(car)
kableExtra::kable_styling(knitr::kable(table(Chile$sex, Chile$vote)))
 | A | N | U | Y |
---|---|---|---|---|
F | 104 | 363 | 362 | 480 |
M | 83 | 526 | 226 | 388 |
Chile$ncode<-as.numeric(Chile$region)
kableExtra::kable_styling(knitr::kable(table(Chile$ncode, Chile$vote)))
 | A | N | U | Y |
---|---|---|---|---|
1 | 44 | 210 | 141 | 174 |
2 | 2 | 18 | 23 | 38 |
3 | 30 | 102 | 46 | 135 |
4 | 42 | 214 | 148 | 275 |
5 | 69 | 345 | 230 | 246 |
library(lattice)
plot(Chile$sex, Chile$vote, xlab="Sex", ylab="Vote")
Figure 2.3: Sex and vote (1)
gender<-as.numeric(Chile$sex)
kableExtra::kable_styling(knitr::kable(table(gender)))
gender | Freq |
---|---|
1 | 1379 |
2 | 1321 |
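R converts categories to numbers according to the alphabetical order of the category labels. Recoding those numbers further is then easy: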
sex <- c()
sex[gender==2]<-0
sex[gender==1]<-1
kableExtra::kable_styling(knitr::kable(table(sex)))
sex | Freq |
---|---|
0 | 1321 |
1 | 1379 |
ngender<-c()
ngender[Chile$sex=='F']<-1
ngender[Chile$sex=='M']<-0
kableExtra::kable_styling(knitr::kable(table(ngender)))
ngender | Freq |
---|---|
0 | 1321 |
1 | 1379 |
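Both recodings above arrive at the same 0/1 variable. A more compact sketch of the same idea, on a made-up factor rather than the Chile data (sex_f and female are hypothetical names), relies on the fact that a logical comparison converts directly to 0 and 1:

```r
sex_f <- factor(c("F", "M", "M", "F", "F"))   # made-up values

levels(sex_f)        # "F" "M": levels are sorted alphabetically
as.numeric(sex_f)    # 1 2 2 1 1: the position of each value among the levels

# TRUE/FALSE becomes 1/0, so one comparison does the recoding
female <- as.numeric(sex_f == "F")
female               # 1 0 0 1 1

table(sex_f, female) # cross-check the old and new coding
```

Cross-tabulating the original and the recoded variable, as in the last line, is a cheap way to confirm that no category was swapped.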
Chile$gender[Chile$sex=="F"]<-"Female"
Chile$gender[Chile$sex=="M"]<-"Male"
class(Chile$gender)
[1] "character"
kableExtra::kable_styling(knitr::kable(table(Chile$gender)))
Var1 | Freq |
---|---|
Female | 1379 |
Male | 1321 |
 | A | N | U | Y |
---|---|---|---|---|
Female | 104 | 363 | 362 | 480 |
Male | 83 | 526 | 226 | 388 |
Figure 2.4: Sex and vote (2)
R cannot tell which number each string should be given.
as.numeric(Chile$gender)
☛ Try converting categorical variables in AMSsurvey, such as citizen.
Var1 | Freq |
---|---|
82 | 3 |
84 | 4 |
86 | 3 |
88 | 4 |
90 | 3 |
92 | 4 |
93 | 1 |
x<-c("花蓮縣","臺北市","屏東縣","臺南市","高雄市");x
table(x)
Use the levels argument to set the order of the categories:
xf<-factor(x, levels=c("臺北市", "臺南市","高雄市","屏東縣","花蓮縣"))
xf | Freq |
---|---|
臺北市 | 1 |
臺南市 | 1 |
高雄市 | 1 |
屏東縣 | 1 |
花蓮縣 | 1 |
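The levels argument controls both the order of the categories and which values count as valid. A minimal sketch, assuming the same five city names (lv is a name introduced here); note that a string has to match a level exactly, so a value written as 台北市 would not match the level 臺北市 and would become NA, leaving a count of 0:

```r
x  <- c("花蓮縣", "臺北市", "屏東縣", "臺南市", "高雄市")
lv <- c("臺北市", "臺南市", "高雄市", "屏東縣", "花蓮縣")

xf <- factor(x, levels = lv)
levels(xf)        # follows lv, not the sort order of x
table(xf)         # one observation per city, listed in the order of lv
as.integer(xf)    # 5 1 4 2 3: position of each value in lv

# A value that is not among the levels becomes NA
factor(c("臺北市", "基隆市"), levels = lv)
```

If a frequency table built this way shows zeros where counts were expected, comparing unique(x) with the levels vector usually reveals the mismatch.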
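factor() has a logical argument, ordered. As long as levels is specified, setting ordered to TRUE or not does not change the order of the categories, although it does change the object's class. The ordered() function returns an ordered factor directly, for example: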
od<-ordered(1:20); class(od)
[1] "ordered" "factor"
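What an ordered factor adds is the ability to compare categories. A minimal sketch with made-up education levels (edu, f1, and f2 are hypothetical names):

```r
edu <- c("high", "low", "mid", "low")   # made-up values

f1 <- factor(edu, levels = c("low", "mid", "high"))
f2 <- factor(edu, levels = c("low", "mid", "high"), ordered = TRUE)

identical(levels(f1), levels(f2))   # TRUE: the level order is the same
class(f1)                           # "factor"
class(f2)                           # "ordered" "factor"

f2 < "high"    # FALSE TRUE TRUE TRUE: comparisons respect the level order
max(f2)        # high
```

The same comparison on f1 would only produce NA with a warning, because an unordered factor has no notion of "less than".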
a<-c(0:9); a
[1] 0 1 2 3 4 5 6 7 8 9
ok<-a>5; ok
[1] FALSE FALSE FALSE FALSE FALSE FALSE TRUE TRUE TRUE TRUE
a[ok]
[1] 6 7 8 9
☛ Run the following code. How many observations remain after filtering?
head(Duncan)
ok<-Duncan$income>50
Duncan$income[ok]
library(data.table)
H<-data.table(Age=c(NA,"30-39","40-49","20-29",
"20-29","60-69","60-69","30-39","60-69",
"30-39","20-29","20-29","30-39","40-49",
"40-49", "40-49","50-59","50-59","20-29",NA),
Vote=c("Ding", NA, "Ko", "Ko", "Ko",
"Ding","Ding",NA,NA, "Ko","Yao","Yao", "Yao","Ding","Ko","Ko","Yao","Yao","Ding","Ko"),
pride=c(3,NA, 7, 3, NA, 5, 5, 4,
NA,2,5, 1,6,8, 7, 6, 1, 3,3,5))
ok | Freq |
---|---|
FALSE | 3 |
TRUE | 17 |
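[1] NA
[1] 4.353
The example above deals with respondents who did not answer pride. If we want to show the relationship between age and vote in the H data, does it matter whether we use a logical condition to drop the missing values?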
 | Ding | Ko | Yao |
---|---|---|---|
20-29 | 1 | 2 | 2 |
30-39 | 0 | 1 | 1 |
40-49 | 1 | 3 | 0 |
50-59 | 0 | 0 | 2 |
60-69 | 2 | 0 | 0 |
 | Ding | Ko | Yao |
---|---|---|---|
20-29 | 1 | 2 | 2 |
30-39 | 0 | 1 | 1 |
40-49 | 1 | 3 | 0 |
50-59 | 0 | 0 | 2 |
60-69 | 2 | 0 | 0 |
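R automatically drops missing values that cannot be tabulated. Once we cross-tabulate with other variables, however, differences appear.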
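The code for these steps is not shown. A minimal sketch of how missing values behave, using only the first five rows of H as plain vectors (so the numbers differ from the full-data output above):

```r
age   <- c(NA, "30-39", "40-49", "20-29", "20-29")
vote  <- c("Ding", NA, "Ko", "Ko", "Ko")
pride <- c(3, NA, 7, 3, NA)

ok <- !is.na(pride)          # TRUE where pride was answered
mean(pride)                  # NA: one missing value spoils the mean
mean(pride[ok])              # mean of the answered cases only
mean(pride, na.rm = TRUE)    # same result without building ok

table(age, vote)                    # pairs containing NA are dropped silently
table(age, vote, useNA = "ifany")   # keeps NA as its own row and column
```

Comparing the two tables shows how many cases the default silently leaves out, which is worth checking before interpreting a cross-tabulation.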
as.Date() converts strings into dates.
v<-c("2/27/2018", "6/26/2018", "12/31/2018"); class(v)
[1] "character"
v.date<-as.Date(v, format='%m/%d/%Y'); class(v.date)
[1] "Date"
or
v<-c("", "6/26/2018", "12/31/2018")
as.Date(v, format='%m/%d/%Y')
[1] NA           "2018-06-26" "2018-12-31"
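As the second example shows, a string that cannot be parsed becomes NA rather than raising an error. A minimal sketch with made-up input, including one impossible date:

```r
v <- c("2/27/2018", "", "13/45/2018")   # the last entry is not a real date
d <- as.Date(v, format = "%m/%d/%Y")

class(d)     # "Date"
is.na(d)     # FALSE TRUE TRUE: empty and invalid strings both become NA

# Numeric format codes print the same way in every locale
format(d[1], "%Y/%m/%d")
```

Counting the NA values after parsing is a simple check that the format string really matches the data.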
format() in turn converts data of class Date into other display formats, for example:
today <- Sys.Date()
format(today, format='%m/%d/%Y')
[1] "03/19/2021"
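Symbol | Meaning | Example |
---|---|---|
%d | Day of the month | 01-31 |
%a | Abbreviated weekday name | Mon |
%A | Full weekday name | Monday |
%m | Month (number) | 01-12 |
%b | Abbreviated month name | Jan |
%B | Full month name | January |
%y | Two-digit year | 18 |
%Y | Four-digit year | 2018 |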
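The format() function works on data that are already dates. What %b and %a print depends on the system locale; the output below was produced under a Chinese locale, which is why the month appears as a number. For example:
Today<-Sys.Date(); Today
[1] "2021-03-19"
today_format1<-format(Today, format='%Y-%b-%d'); today_format1
[1] "2021- 3-19"
today_format2<-format(Today, format='%b/%d/%y'); today_format2
[1] " 3/19/21"
today_format3<-format(Today, format='%Y年%b月%d日(%a)'); today_format3
[1] "2021年 3月19日(五)"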
xi<-"1953-06-15" #Xi's birthday
tsai<-"1956-08-31" #Tsai's birthday
as.Date(c(xi,tsai), origin="1904-01-01")
[1] "1953-06-15" "1956-08-31"
difftime(tsai, xi)
Time difference of 1173 days
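The origin argument may be set or omitted here. When converting a number into the date it represents, however, a starting date is needed (R 4.3.0 and later fall back to 1970-01-01 when origin is omitted):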
as.Date(1100, origin="2018-08-01")
[1] "2021-08-05"
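Dates are stored as day counts, so they support ordinary arithmetic. A minimal sketch that redoes the two calculations above with Date objects instead of strings:

```r
xi   <- as.Date("1953-06-15")
tsai <- as.Date("1956-08-31")

tsai - xi               # Time difference of 1173 days
as.numeric(tsai - xi)   # 1173 as a plain number
xi + 1173 == tsai       # TRUE: adding days to a date gives a date

as.Date(1100, origin = "2018-08-01")   # 1100 days after the origin
as.Date("2018-08-01") + 1100           # the same date, written as addition
```

Writing the second calculation as an addition makes the role of origin explicit: it is simply the date that day 0 refers to.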
With data types covered, we now turn to data structures.
example<-c(0,1,2,3,4)
print(example)
[1] 0 1 2 3 4
or
c(2,4,6,8)->A
c(letters)
[1] "a" "b" "c" "d" "e" "f" "g" "h" "i" "j" "k" "l" "m" "n" "o" "p" "q" "r" "s"
[20] "t" "u" "v" "w" "x" "y" "z"
c(LETTERS)
[1] "A" "B" "C" "D" "E" "F" "G" "H" "I" "J" "K" "L" "M" "N" "O" "P" "Q" "R" "S"
[20] "T" "U" "V" "W" "X" "Y" "Z"
shares <- c(150, 40, 65)
names(shares) <- c('Finance','Technology','Cash')
shares
   Finance Technology       Cash
       150         40         65
class(shares)
[1] "numeric"
cash<-c(100, 120, 80, 65)
names(cash) <- c(2016, 2017, 2018, 2019)
par(mfrow=c(1,2), bg='lightgreen',mai=c(0.4,0.3,0.1,0.3))
pie(shares); barplot(cash, cex.axis = 0.8)
Figure 3.1: Allocation of funds
j<-c(2*2, 2*9, 10-2, 3^3); j
[1] 4 18 8 27
R<-c(100, 200, 300); R/5; sqrt(R)
[1] 20 40 60
[1] 10.00 14.14 17.32
c(j, R)
[1] 4 18 8 27 100 200 300
Y<-c(j, c(9:5), R[c(1,2)]); Y
[1] 4 18 8 27 9 8 7 6 5 100 200
R vectors can be concatenated, so we can keep adding data. A negative index does the opposite and drops elements:
Y[-c(8:12)]
[1] 4 18 8 27 9 8 7
One of R's data structures is the matrix. For example, VADeaths is a data set stored as a matrix:
data("VADeaths"); VADeaths
## Rural Male Rural Female Urban Male Urban Female
## 50-54 11.7 8.7 15.4 8.4
## 55-59 18.1 11.7 24.3 13.6
## 60-64 26.9 20.3 37.0 19.3
## 65-69 41.0 30.9 54.6 35.1
## 70-74 66.0 54.3 71.1 50.0
class(VADeaths)
## [1] "matrix" "array"
R's matrices resemble mathematical matrices. A matrix is read by row first, then column. For example, a \(3\times 3\) matrix can be written as:
m<-matrix(c(1:9), nrow=3, ncol=3); m
## [,1] [,2] [,3]
## [1,] 1 4 7
## [2,] 2 5 8
## [3,] 3 6 9
n<-matrix(c(1:6), nrow=3, ncol=2);n
## [,1] [,2]
## [1,] 1 4
## [2,] 2 5
## [3,] 3 6
m<-matrix(c(1:9), nrow=3, ncol=3); n<-matrix(c(1:6), nrow=3, ncol=2); m%*%n
## [,1] [,2]
## [1,] 30 66
## [2,] 36 81
## [3,] 42 96
diag(m)
## [1] 1 5 9
t(m)
## [,1] [,2] [,3]
## [1,] 1 2 3
## [2,] 4 5 6
## [3,] 7 8 9
m[2,3]; m[2,3]<-0
## [1] 8
n<-matrix(c(1:6), nrow=3, ncol=2, byrow=T)
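By default matrix() fills column by column; byrow changes that. A minimal sketch comparing the two fills (by_col and by_row are names introduced here):

```r
by_col <- matrix(1:6, nrow = 3, ncol = 2)                 # default: fill down each column
by_row <- matrix(1:6, nrow = 3, ncol = 2, byrow = TRUE)   # fill across each row

by_col[1, ]   # 1 4
by_row[1, ]   # 1 2

# Filling by row equals transposing a column-filled matrix of swapped shape
identical(by_row, t(matrix(1:6, nrow = 2, ncol = 3)))     # TRUE
```

Filling by row is convenient when the data are typed in the way they would be read on paper, one row at a time.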
The dimnames argument assigns names to the rows and the columns respectively, for example:
n<-matrix(c(1:6), nrow=3, ncol=2, dimnames = list(c("a","b","c"),c("A","B"))); n
## A B
## a 1 4
## b 2 5
## c 3 6
R1<-c(170, 175, 166, 172, 165, 157, 167, 167,
156, 160)
R2<-c("F","M","M","M","F","F","F","F","M","F")
R3<-R1/10 + 42
R123<-data.frame(height=R1,gender=R2,weight=R3,
stringsAsFactors = FALSE)
R123
## height gender weight
## 1 170 F 59.0
## 2 175 M 59.5
## 3 166 M 58.6
## 4 172 M 59.2
## 5 165 F 58.5
## 6 157 F 57.7
## 7 167 F 58.7
## 8 167 F 58.7
## 9 156 M 57.6
## 10 160 F 58.0
When vectors are combined with cbind(), R treats the result as a matrix. For example:
H<-cbind(LETTERS[1:6], seq(10,60, 10))
H
## [,1] [,2]
## [1,] "A" "10"
## [2,] "B" "20"
## [3,] "C" "30"
## [4,] "D" "40"
## [5,] "E" "50"
## [6,] "F" "60"
class(H)
## [1] "matrix" "array"
To pull a column out of a matrix, give the system its position, as in matrix[,1]; matrix$a does not work.
## [1] "matrix" "array"
## a b
## [1,] "Monday" "52.5"
## [2,] "Tuesday" "48.4"
## [3,] "Wednesday" "57.1"
## [4,] "Thursday" "60.1"
## [5,] "Friday" "71.1"
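The code that built this matrix is not shown. As an assumed reconstruction from the printed values (the vectors passed to cbind are a guess), the sketch below shows why the numbers appear in quotes and how column access differs between a matrix and a data frame:

```r
# Assumed reconstruction of the matrix printed above.
dftest <- cbind(a = c("Monday", "Tuesday", "Wednesday", "Thursday", "Friday"),
                b = c(52.5, 48.4, 57.1, 60.1, 71.1))

is.matrix(dftest)   # TRUE
dftest[, "b"]       # character: cbind() coerced the numbers to strings

# $ is not defined for a matrix, so this raises an error
res <- try(dftest$b, silent = TRUE)
inherits(res, "try-error")   # TRUE

# A data frame allows $ and can hold a numeric column next to a text column
dt <- data.frame(dftest, stringsAsFactors = FALSE)
dt$b <- as.numeric(dt$b)
mean(dt$b)
```

A matrix holds a single type, so mixing text and numbers forces everything to character; a data frame keeps one type per column.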
tm<-cbind(carData::Chile$vote, carData::Chile$sex)
class(tm)
[1] "matrix" "array"
plot(table(tm[,2], tm[,1]))
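dt <- as.data.frame(dftest)
dt$a<-factor(dt$a, levels=c("Monday", "Tuesday",
"Wednesday", "Thursday", "Friday"))
dt$b<-as.numeric(as.character(dt$b)) #the matrix stored b as text; convert it before plotting
ggplot(data=dt, aes(x=a, y=b, fill=a)) +
geom_bar(stat = 'identity')
Figure 3.2: Happiness from Monday to Friday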
Several commands apply to both data frames and matrices.
For example, nrow() returns how many observations AMSsurvey has:
nrow(AMSsurvey)
[1] 24
colnames(R123)<-c("v1","v2","v3"); R123
## v1 v2 v3
## 1 170 F 59.0
## 2 175 M 59.5
## 3 166 M 58.6
## 4 172 M 59.2
## 5 165 F 58.5
## 6 157 F 57.7
## 7 167 F 58.7
## 8 167 F 58.7
## 9 156 M 57.6
## 10 160 F 58.0
library(ISLR)
head(College, n=3)
## Private Apps Accept Enroll Top10perc Top25perc
## Abilene Christian University Yes 1660 1232 721 23 52
## Adelphi University Yes 2186 1924 512 16 29
## Adrian College Yes 1428 1097 336 22 50
## F.Undergrad P.Undergrad Outstate Room.Board Books
## Abilene Christian University 2885 537 7440 3300 450
## Adelphi University 2683 1227 12280 6450 750
## Adrian College 1036 99 11250 3750 400
## Personal PhD Terminal S.F.Ratio perc.alumni Expend
## Abilene Christian University 2200 70 78 18.1 12 7041
## Adelphi University 1500 29 30 12.2 16 10527
## Adrian College 1165 53 66 12.9 30 8735
## Grad.Rate
## Abilene Christian University 60
## Adelphi University 56
## Adrian College 54
head(state.x77, n=5)
## Population Income Illiteracy Life Exp Murder HS Grad Frost Area
## Alabama 3615 3624 2.1 69.05 15.1 41.3 20 50708
## Alaska 365 6315 1.5 69.31 11.3 66.7 152 566432
## Arizona 2212 4530 1.8 70.55 7.8 58.1 15 113417
## Arkansas 2110 3378 1.9 70.66 10.1 39.9 65 51945
## California 21198 5114 1.1 71.71 10.3 62.6 20 156361
names.to.delete<-c('Alabama', 'Alaska', 'Arkansas')
which(rownames(data) %in% vector) returns the positions of the rows whose names appear in the vector:
rows.to.delete<-which(rownames(state.x77) %in% names.to.delete)
newstate <- state.x77[-c(rows.to.delete),]
head(newstate, n=5)
## Population Income Illiteracy Life Exp Murder HS Grad Frost Area
## Arizona 2212 4530 1.8 70.55 7.8 58.1 15 113417
## California 21198 5114 1.1 71.71 10.3 62.6 20 156361
## Colorado 2541 4884 0.7 72.06 6.8 63.9 166 103766
## Connecticut 3100 5348 1.1 72.48 3.1 56.0 139 4862
## Delaware 579 4809 0.9 70.06 6.2 54.6 103 1982
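One caveat with the which() approach: if no row name matches, which() returns integer(0), and a negative empty index selects nothing instead of everything. A minimal sketch of a logical-index alternative that avoids this (keep and none are names introduced here):

```r
names.to.delete <- c("Alabama", "Alaska", "Arkansas")

# Keep the rows whose names are NOT in the list
keep <- !(rownames(state.x77) %in% names.to.delete)
newstate <- state.x77[keep, ]
nrow(newstate)   # 47

# With no match, the which() version drops every row
none <- which(rownames(state.x77) %in% "Taiwan")
nrow(state.x77[-none, ])                                  # 0
nrow(state.x77[!(rownames(state.x77) %in% "Taiwan"), ])   # 50
```

Both approaches give the same result whenever at least one name matches; the logical version is simply safer inside functions and loops.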
Array1 <- array(1:12, dim = c(2, 6, 1)); Array1
## , , 1
##
## [,1] [,2] [,3] [,4] [,5] [,6]
## [1,] 1 3 5 7 9 11
## [2,] 2 4 6 8 10 12
Array2 <- array(1:12, dim = c(2, 3, 2)); Array2
## , , 1
##
## [,1] [,2] [,3]
## [1,] 1 3 5
## [2,] 2 4 6
##
## , , 2
##
## [,1] [,2] [,3]
## [1,] 7 9 11
## [2,] 8 10 12
A12<-Array2[,,2]; A12
## [,1] [,2] [,3]
## [1,] 7 9 11
## [2,] 8 10 12
listA<-list(R123, H, c(xi,tsai)); listA
## [[1]]
## v1 v2 v3
## 1 170 F 59.0
## 2 175 M 59.5
## 3 166 M 58.6
## 4 172 M 59.2
## 5 165 F 58.5
## 6 157 F 57.7
## 7 167 F 58.7
## 8 167 F 58.7
## 9 156 M 57.6
## 10 160 F 58.0
##
## [[2]]
## [,1] [,2]
## [1,] "A" "10"
## [2,] "B" "20"
## [3,] "C" "30"
## [4,] "D" "40"
## [5,] "E" "50"
## [6,] "F" "60"
##
## [[3]]
## [1] "1953-06-15" "1956-08-31"
list(A=data.frame(x=c(1:5),y=c(101:105)),
B=data.frame(v1=rep(NA,6)))
## $A
## x y
## 1 1 101
## 2 2 102
## 3 3 103
## 4 4 104
## 5 5 105
##
## $B
## v1
## 1 NA
## 2 NA
## 3 NA
## 4 NA
## 5 NA
## 6 NA
listA[[3]]
## [1] "1953-06-15" "1956-08-31"
listB<-list(data=R123, vec=m, char=c(tsai, xi));
listB[["data"]]
## v1 v2 v3
## 1 170 F 59.0
## 2 175 M 59.5
## 3 166 M 58.6
## 4 172 M 59.2
## 5 165 F 58.5
## 6 157 F 57.7
## 7 167 F 58.7
## 8 167 F 58.7
## 9 156 M 57.6
## 10 160 F 58.0
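The examples above use double brackets to reach into a list. A minimal sketch, on a small made-up list named lst, of how single brackets, double brackets, and $ differ:

```r
lst <- list(data = data.frame(v1 = 1:3),
            vec  = c(10, 20),
            char = c("x", "y"))   # made-up contents

class(lst["vec"])     # "list": single brackets return a shorter list
class(lst[["vec"]])   # "numeric": double brackets return the element itself
lst$vec               # same as lst[["vec"]] for a named element
lst[["vec"]][2]       # 20: index into the extracted element
lst[[1]]$v1           # elements can also be reached by position
```

Single brackets are the right tool for keeping several elements as a list; double brackets or $ are needed before computing on one of them.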
X = list(1:5, letters[1:5], c('Y','Y','N','Y','N'),
c("2/27/2018", "6/26/2018", "12/31/2018","1/20/2019","4/8/2019")); X
## [[1]]
## [1] 1 2 3 4 5
##
## [[2]]
## [1] "a" "b" "c" "d" "e"
##
## [[3]]
## [1] "Y" "Y" "N" "Y" "N"
##
## [[4]]
## [1] "2/27/2018" "6/26/2018" "12/31/2018" "1/20/2019" "4/8/2019"
X.dt<-setDT(X); X.dt
## V1 V2 V3 V4
## 1: 1 a Y 2/27/2018
## 2: 2 b Y 6/26/2018
## 3: 3 c N 12/31/2018
## 4: 4 d Y 1/20/2019
## 5: 5 e N 4/8/2019
Try combining c('a','b','c'), c(1,2,3,4), and c('2018-01-01', '2018-04-04', '2018-04-05', '2018-06-18', '2018-10-10') into a single list.
class(Titanic); Titanic
## [1] "table"
## , , Age = Child, Survived = No
##
## Sex
## Class Male Female
## 1st 0 0
## 2nd 0 0
## 3rd 35 17
## Crew 0 0
##
## , , Age = Adult, Survived = No
##
## Sex
## Class Male Female
## 1st 118 4
## 2nd 154 13
## 3rd 387 89
## Crew 670 3
##
## , , Age = Child, Survived = Yes
##
## Sex
## Class Male Female
## 1st 5 1
## 2nd 11 13
## 3rd 13 14
## Crew 0 0
##
## , , Age = Adult, Survived = Yes
##
## Sex
## Class Male Female
## 1st 57 140
## 2nd 14 80
## 3rd 75 76
## Crew 192 20
T1<-Titanic[, , 1, 1]
class(T1); T1
## [1] "table"
## Sex
## Class Male Female
## 1st 0 0
## 2nd 0 0
## 3rd 35 17
## Crew 0 0
Titanic[, 1, 1,]
## Survived
## Class No Yes
## 1st 0 5
## 2nd 0 11
## 3rd 35 13
## Crew 0 0
Titanic[, 1, 2,]
## Survived
## Class No Yes
## 1st 118 57
## 2nd 154 14
## 3rd 387 75
## Crew 670 192
Titanic[, 2, 1,]
## Survived
## Class No Yes
## 1st 0 1
## 2nd 0 13
## 3rd 17 14
## Crew 0 0
Titanic[, 2, 2,]
## Survived
## Class No Yes
## 1st 4 140
## 2nd 13 80
## 3rd 89 76
## Crew 3 20
library(data.table)
DT = data.table(a = 1:3, b = c(10,20,30))
DT
## a b
## 1: 1 10
## 2: 2 20
## 3: 3 30
DT[1:3, sum(a)]
## [1] 6
DT[1:2, mean(b)]
## [1] 15
DT <-data.table(x=rnorm(1000, 0, 1))
DT[, plot(density(x), type='l', xlab='x', ylab='',
lwd=3, col='lightblue')]
Figure 3.3: Normal probability density
NULL
g<-Titanic[ , , 2, 2]; class(g)
## [1] "table"
Entering g$Class at this point raises an error and displays nothing. Convert g to a data frame:
g<-data.frame(g)
Entering g$Class now displays the variable.
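A table stores its categories as dimension names, not as columns, which is why $ fails on it. A minimal sketch of the conversion (res and gd are names introduced here; as.table() is added only to make the input type explicit):

```r
g <- Titanic[, , 2, 2]   # adults who survived, Class by Sex

# $ is not defined for a table, so this raises an error
res <- try(g$Class, silent = TRUE)
inherits(res, "try-error")   # TRUE

# Converting gives long format: one row per Class-by-Sex cell
gd <- as.data.frame(as.table(g))
names(gd)          # "Class" "Sex" "Freq"
levels(gd$Class)   # "1st" "2nd" "3rd" "Crew"
sum(gd$Freq)       # total number of adult survivors
```

After the conversion the cell counts sit in the Freq column, so the usual data frame tools (subsetting, sorting, plotting) apply.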
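Because a vector is ordered, its numbers have a fixed sequence; when it is added to, subtracted from, multiplied by, or divided by another vector of the same length, the operation is carried out element by element in that order. We start with a scalar, Sca, as an example: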
X<-c(10,20,30,40,50,60); Sca<-10
X+Sca
[1] 20 30 40 50 60 70
X/Sca
[1] 1 2 3 4 5 6
Y<-c(5,10,6,8,25,6)
X/Y; X*Y
[1] 2 2 5 5 2 10
[1] 50 200 180 320 1250 360
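The scalar case is an instance of R's recycling rule: the shorter operand is repeated until it matches the longer one. A minimal sketch of the three cases:

```r
X <- c(10, 20, 30, 40, 50, 60)

X + 10           # the scalar is reused for every element
X + c(1, 2)      # a length-2 vector is recycled: 11 22 31 42 51 62
X * c(0, 1, 2, 3, 4, 5)   # equal lengths: paired element by element
```

Recycling is silent when the longer length is a multiple of the shorter one; otherwise R still computes a result but issues a warning, which is usually a sign of a mistake.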
a<-c(2,3,4); b<-a^2; print(b)
[1] 4 9 16
c<-sqrt(b); print(c)
[1] 2 3 4
a1<-c(2.54, 3.111, 10.999)
round(a1, digits=2)
## [1] 2.54 3.11 11.00
floor(a1)
## [1] 2 3 10
ceiling(a1)
## [1] 3 4 11
## [1] 2.54 3.11 11.00
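To align printed numbers, R may print more digits after the decimal point than needed. The sprintf function can control this, but it returns strings rather than numbers: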
## [1] 6.00 0.91 3.29
## [1] "6.00" "0.91" "3.29"
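The code for the two lines of output above is not shown. A minimal sketch with made-up values (v and s are hypothetical names) that contrasts rounding, which keeps numbers, with sprintf, which returns text:

```r
v <- c(6, 0.912, 3.2857)   # made-up values

r <- round(v, 2)           # still numeric
class(r)                   # "numeric"

s <- sprintf("%.2f", v)    # formatted text with exactly two decimals
s                          # "6.00" "0.91" "3.29"
class(s)                   # "character"

as.numeric(s) + 1          # convert back before doing arithmetic
```

A reasonable habit is to keep the numeric version for calculations and apply sprintf only at the point where results are printed or written to a table.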
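Use weekdays() to show that today's class date is a Tuesday.
In the mtcars data, how many observations have wt greater than or equal to 2?
Use the data.table package to sort the data frame above.
Last updated 03/19/2021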