data_sort_order_arrange

sort order rank

sort(x) sorts the vector x and returns the sorted values, in ascending order by default.

x <- c(97, 93, 85, 74, 32, 100, 99, 67)
sort(x)

## [1]  32  67  74  85  93  97  99 100
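Two arguments of sort() are worth knowing when working with scores like these. The following is a minimal sketch; `scores_with_na` is a hypothetical vector introduced only for illustration.

```r
x <- c(97, 93, 85, 74, 32, 100, 99, 67)

# Descending order instead of the default ascending order
desc_sorted <- sort(x, decreasing = TRUE)
desc_sorted

# sort() silently drops NA by default; na.last = TRUE keeps it at the end
scores_with_na <- c(x, NA)   # hypothetical: one missing score
dropped_na <- sort(scores_with_na)
kept_na <- sort(scores_with_na, na.last = TRUE)
length(dropped_na)
length(kept_na)
```

The default behaviour means length(sort(v)) can be smaller than length(v) whenever v contains missing values.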

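rank() returns the rank of each element: for a set of student scores, each student's position when the scores are ordered from lowest (rank 1) to highest.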

rank(x)

## [1] 6 5 4 3 1 8 7 2

# With rank(), pay attention to ties.method; the input here is random, so the result varies between runs
rank(sample(3, 7, replace = TRUE), ties.method = 'first')

## [1] 5 1 3 2 4 6 7
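Because the example above uses random input, the effect of ties.method is easier to see on a fixed vector. This is a small sketch; `scores` is a hypothetical vector in which two students tie at 85.

```r
scores <- c(85, 93, 85, 74)  # hypothetical scores with one tie

r_avg   <- rank(scores)                         # default "average": ties share the mean rank
r_min   <- rank(scores, ties.method = "min")    # ties all get the lowest rank in the group
r_first <- rank(scores, ties.method = "first")  # ties are broken by position in the vector

r_avg    # 2.5 4.0 2.5 1.0
r_min    # 2 4 2 1
r_first  # 2 4 3 1
```

Only "first" guarantees a permutation of 1..n, which matters if the ranks are later used as indices.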

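order() returns, for each rank in turn, the position in the original vector of the score holding that rank. Because the result is an index vector, it can be used to reorder other data, which is what makes it useful for data processing.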

order(x, decreasing = FALSE)

## [1] 5 8 4 3 2 1 7 6

x[order(x)] == sort(x)  # identity: indexing x by order(x) always gives sort(x)

## [1] TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE
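The relationship between order() and rank() can be checked directly. The sketch below relies on x having no tied values; with ties, rank() would return averaged ranks by default and the first check would not hold in general.

```r
x <- c(97, 93, 85, 74, 32, 100, 99, 67)

o <- order(x)  # positions of the smallest, second smallest, ... value
r <- rank(x)   # rank of each value, in its original position

ranks_match <- all(order(o) == r)   # with no ties, ordering the order gives the ranks
restores_x  <- all(x[o][r] == x)    # indexing the sorted vector by rank restores x
ranks_match
restores_x

# Descending order: the largest score comes first
x_desc <- x[order(x, decreasing = TRUE)]
x_desc
```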

# order() is very useful for data frames
data("iris")
# For a single column, both approaches give the same result
head(sort(iris$Sepal.Length),1) == head(iris$Sepal.Length[order(iris$Sepal.Length)],1)

## [1] TRUE

# Sorting by multiple columns
x <- c(3,5,4,6,3,2,1,4,3,2)  
y <- c('c','c','d','b','a','b','d','e','e','d')  
z <- c(1,2,3,4,5,6,7,8,9,10)  
testData <- data.frame(x=x,y=y,z=z) 
# order() accepts several keys: the first is the primary sort column, the second breaks ties within it
o <- order(testData[,"x"],testData[,"y"])  
testData[o,]

##    x y  z
## 7  1 d  7
## 6  2 b  6
## 10 2 d 10
## 5  3 a  5
## 1  3 c  1
## 9  3 e  9
## 3  4 d  3
## 8  4 e  8
## 2  5 c  2
## 4  6 b  4
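The call above sorts both keys in ascending order. To mix directions, one option is a per-key `decreasing` vector, which base R supports only with method = "radix"; for numeric keys, negating the column is a common alternative. The sketch below rebuilds the same testData so it can run on its own (z is written as 1:10 for brevity).

```r
testData <- data.frame(
  x = c(3, 5, 4, 6, 3, 2, 1, 4, 3, 2),
  y = c('c', 'c', 'd', 'b', 'a', 'b', 'd', 'e', 'e', 'd'),
  z = 1:10
)

# x ascending, then y descending; a vector for `decreasing` needs method = "radix"
o_mixed <- order(testData$x, testData$y,
                 decreasing = c(FALSE, TRUE), method = "radix")
testData[o_mixed, ]

# x ascending, then z descending, by negating the numeric key
o_neg <- order(testData$x, -testData$z)
testData[o_neg, ]
```

Negation does not work on character columns, which is why the radix form is used for y here. Note that the radix method compares strings in the C locale, which is harmless for these single lowercase letters.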

dplyr-arrange

library("dplyr")

## 
## Attaching package: 'dplyr'

## The following objects are masked from 'package:stats':
## 
##     filter, lag

## The following objects are masked from 'package:base':
## 
##     intersect, setdiff, setequal, union

data <- data.frame(stateName=LETTERS,
  lifeExpectancy=rbinom(26,90,0.8),literacy=runif(26,0.9,0.99))

data %>%
  arrange(desc(lifeExpectancy),desc(literacy)) %>%
  select(stateName,literacy) %>%
  head(n=5)

##   stateName  literacy
## 1         K 0.9608098
## 2         Q 0.9506752
## 3         H 0.9010453
## 4         E 0.9718997
## 5         N 0.9580449

head(arrange(data, lifeExpectancy, desc(literacy))) # returns a data.frame directly

##   stateName lifeExpectancy  literacy
## 1         F             64 0.9769230
## 2         L             65 0.9504690
## 3         A             65 0.9243058
## 4         D             67 0.9831950
## 5         M             69 0.9872442
## 6         Y             70 0.9893188
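For comparison, the first arrange() call can be written with base order(). This is a sketch under two assumptions that are not in the original notes: a seed (42) is set so the random data is reproducible, and the dplyr cross-check runs only if the package is installed.

```r
set.seed(42)  # assumed seed, only to make the random data reproducible
data <- data.frame(stateName = LETTERS,
                   lifeExpectancy = rbinom(26, 90, 0.8),
                   literacy = runif(26, 0.9, 0.99))

# Base R counterpart of arrange(desc(lifeExpectancy), desc(literacy))
o <- order(-data$lifeExpectancy, -data$literacy)
head(data[o, c("stateName", "literacy")], n = 5)

# Optional cross-check against dplyr, skipped when dplyr is not available
if (requireNamespace("dplyr", quietly = TRUE)) {
  via_dplyr <- dplyr::arrange(data, dplyr::desc(lifeExpectancy),
                              dplyr::desc(literacy))
  print(all(via_dplyr$stateName == data$stateName[o]))
}
```

One visible difference: base indexing keeps the original row names, while arrange() renumbers the rows from 1.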

data_sort_order_arrange_rank

Xshi0001

April 8, 2018
