# 1 Goal

The goal of this tutorial is to sort a dataframe by the values of one particular column. This is useful when, for example, we want to rank products by sales volume or by profit.

# 2 Order dataframe

# First of all we load the data
# For this tutorial we are going to use the iris plant dataset
data(iris)
str(iris)
## 'data.frame':    150 obs. of  5 variables:
##  $ Sepal.Length: num  5.1 4.9 4.7 4.6 5 5.4 4.6 5 4.4 4.9 ...
##  $ Sepal.Width : num  3.5 3 3.2 3.1 3.6 3.9 3.4 3.4 2.9 3.1 ...
##  $ Petal.Length: num  1.4 1.4 1.3 1.5 1.4 1.7 1.4 1.5 1.4 1.5 ...
##  $ Petal.Width : num  0.2 0.2 0.2 0.2 0.2 0.4 0.3 0.2 0.2 0.1 ...
##  $ Species     : Factor w/ 3 levels "setosa","versicolor",..: 1 1 1 1 1 1 1 1 1 1 ...

# We want to order our dataset from highest Sepal Length to lowest.
# We use the order function like this:
order(iris$Sepal.Length, decreasing = TRUE)
##   [1] 132 118 119 123 136 106 131 108 110 126 130 103  51  53 121 140 142
##  [18]  77 113 144  66  78  87 109 125 141 145 146  59  76  55 105 111 117
##  [35] 148  52  75 112 116 129 133 138  57  73  88 101 104 124 134 137 147
##  [52]  69  98 127 149  64  72  74  92 128 135  63  79  84  86 120 139  62
##  [69]  71 150  15  68  83  93 102 115 143  16  19  56  80  96  97 100 114
##  [86]  65  67  70  89  95 122  34  37  54  81  82  90  91   6  11  17  21
## [103]  32  85  49  28  29  33  60   1  18  20  22  24  40  45  47  99   5
## [120]   8  26  27  36  41  44  50  61  94   2  10  35  38  58 107  12  13
## [137]  25  31  46   3  30   4   7  23  48  42   9  39  43  14
# order() returns row indices, not values: the first number is the row of the plant
# with the longest sepal, the last number is the row of the plant with the shortest
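To see what these indices mean, it can help to try `order()` on a very small vector first. The sketch below uses a made-up vector `x` (an assumption for illustration, not part of the iris data): `order()` lists positions, and indexing `x` with those positions yields the sorted values.

```r
# Hypothetical toy vector, used only for illustration
x <- c(5.1, 7.9, 4.3, 6.0)

# order() returns positions, not values: the position of the largest
# element comes first because decreasing = TRUE
idx <- order(x, decreasing = TRUE)
print(idx)

# Indexing x with those positions gives the values in sorted order
sorted_x <- x[idx]
print(sorted_x)

# This matches what sort() returns directly
same_as_sort <- identical(sorted_x, sort(x, decreasing = TRUE))
print(same_as_sort)
```

The same idea carries over to a dataframe: the positions returned by `order()` are used as row indices instead of vector indices.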

# Now we can use these indices to reorder the rows of the dataset
iris_ordered <- iris[order(iris$Sepal.Length, decreasing = TRUE), ]
head(iris_ordered, 10)
##     Sepal.Length Sepal.Width Petal.Length Petal.Width   Species
## 132          7.9         3.8          6.4         2.0 virginica
## 118          7.7         3.8          6.7         2.2 virginica
## 119          7.7         2.6          6.9         2.3 virginica
## 123          7.7         2.8          6.7         2.0 virginica
## 136          7.7         3.0          6.1         2.3 virginica
## 106          7.6         3.0          6.6         2.1 virginica
## 131          7.4         2.8          6.1         1.9 virginica
## 108          7.3         2.9          6.3         1.8 virginica
## 110          7.2         3.6          6.1         2.5 virginica
## 126          7.2         3.2          6.0         1.8 virginica

# We can plot the variable to check that the whole dataset is properly ordered
plot(iris_ordered$Sepal.Length)
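Besides the plot, the ordering can also be checked programmatically. The following is a minimal sketch using base R only: in a column sorted from highest to lowest, no value is larger than the one before it, so every consecutive difference is zero or negative. The variable names `is_decreasing` and `same_size` are chosen here for illustration.

```r
data(iris)
iris_ordered <- iris[order(iris$Sepal.Length, decreasing = TRUE), ]

# diff() gives the difference between each value and the previous one;
# in a decreasing sequence none of these differences is positive
is_decreasing <- all(diff(iris_ordered$Sepal.Length) <= 0)
print(is_decreasing)

# Sorting reorders rows but must not add or drop any
same_size <- nrow(iris_ordered) == nrow(iris)
print(same_size)
```

A check like this is handy in scripts, where nobody is looking at a plot.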

# To make this the permanent order, we can reset the row names so that they are renumbered from 1 in the new order
rownames(iris_ordered) <- NULL
head(iris_ordered, 10)
##    Sepal.Length Sepal.Width Petal.Length Petal.Width   Species
## 1           7.9         3.8          6.4         2.0 virginica
## 2           7.7         3.8          6.7         2.2 virginica
## 3           7.7         2.6          6.9         2.3 virginica
## 4           7.7         2.8          6.7         2.0 virginica
## 5           7.7         3.0          6.1         2.3 virginica
## 6           7.6         3.0          6.6         2.1 virginica
## 7           7.4         2.8          6.1         1.9 virginica
## 8           7.3         2.9          6.3         1.8 virginica
## 9           7.2         3.6          6.1         2.5 virginica
## 10          7.2         3.2          6.0         1.8 virginica
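Resetting the row names discards the original row numbers. If those positions are needed later, one option is to copy them into a column before resetting. The sketch below does this with a column named `orig_row`, a name chosen here for illustration, and assumes the row names are still the default 1 to 150 of the iris data.

```r
data(iris)
iris_ordered <- iris[order(iris$Sepal.Length, decreasing = TRUE), ]

# Keep the original row numbers in a column (orig_row is a hypothetical name)
iris_ordered$orig_row <- as.integer(rownames(iris_ordered))

# Now the row names can be reset without losing that information
rownames(iris_ordered) <- NULL
head(iris_ordered, 3)

# The saved column lets us restore the original order if needed
iris_restored <- iris_ordered[order(iris_ordered$orig_row), ]
restored_ok <- identical(iris_restored$Sepal.Length, iris$Sepal.Length)
print(restored_ok)
```

This keeps the sort reversible, at the cost of one extra column.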

# 3 Conclusion

In this tutorial we have learnt how to order a dataframe by the values of a specific column, and then how to reset the row names so that they match the new order.
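As a recap, the two steps can be wrapped in a small helper. This is only a sketch: `sort_by_column` is a hypothetical function written for this summary, not part of base R, and it assumes the column name is passed as a string.

```r
# Hypothetical helper (not a base R function): sort a data frame by one
# column and renumber the rows afterwards
sort_by_column <- function(df, column, decreasing = TRUE) {
  stopifnot(is.data.frame(df), column %in% names(df))
  # Step 1: reorder the rows using the indices returned by order()
  sorted <- df[order(df[[column]], decreasing = decreasing), , drop = FALSE]
  # Step 2: reset the row names so they follow the new order
  rownames(sorted) <- NULL
  sorted
}

data(iris)
by_petal <- sort_by_column(iris, "Petal.Length")
head(by_petal, 3)
```

Passing `decreasing = FALSE` would sort from lowest to highest instead.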