Sequence Along - Run Lengths

1. Artificial Data Set

Create an artificial data set, `myProjects`, comprising project types and project IDs. The first six of its ten rows are shown below.

##               PROJECT  PROJ_ID
## 1             housing LY01-098
## 2             housing LY01-100
## 3             housing LY01-102
## 4 school construction LY10-002
## 5          road works LY03-128
## 6          road works LY03-145
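
The code that builds `myProjects` is not shown. A minimal sketch that would reproduce the ten rows appearing in the outputs further down is given here. It assumes a plain data frame with two character columns; the construction is an assumption, and only the values come from the printed output.

```r
# Hypothetical reconstruction of myProjects from the printed output;
# the original construction code is not shown in this document.
myProjects <- data.frame(
  PROJECT = c("housing", "housing", "housing",
              "school construction",
              "road works", "road works",
              "housing",
              "road works", "road works",
              "school construction"),
  PROJ_ID = c("LY01-098", "LY01-100", "LY01-102",
              "LY10-002",
              "LY03-128", "LY03-145",
              "LY05-082",
              "LY05-013", "LY06-028",
              "LY09-005"),
  stringsAsFactors = FALSE
)

# Print the first six rows, matching the listing above
head(myProjects)
```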

2.A `seq_along()`

Grouping by PROJECT pools all rows of each project type into one group, so the counter ignores gaps in the sequence: the housing row at position 7 is numbered 4, continuing from rows 1 to 3.

myProjects %>% group_by(PROJECT) %>% mutate(SeqID = seq_along(PROJ_ID))

## # A tibble: 10 x 3
## # Groups:   PROJECT [3]
##    PROJECT             PROJ_ID  SeqID
##    <chr>               <chr>    <int>
##  1 housing             LY01-098     1
##  2 housing             LY01-100     2
##  3 housing             LY01-102     3
##  4 school construction LY10-002     1
##  5 road works          LY03-128     1
##  6 road works          LY03-145     2
##  7 housing             LY05-082     4
##  8 road works          LY05-013     3
##  9 road works          LY06-028     4
## 10 school construction LY09-005     2
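
For comparison, the same within-group counter can be sketched in base R with `ave()`, which applies a function to each group of a vector. This is an alternative shown for illustration, not the approach used above. The `PROJECT` vector is retyped from the output so that the block runs on its own.

```r
# Project types in row order, copied from the output above
PROJECT <- c("housing", "housing", "housing",
             "school construction",
             "road works", "road works",
             "housing",
             "road works", "road works",
             "school construction")

# ave() splits the first argument by PROJECT, applies seq_along()
# to each piece, and puts the results back in the original row order
SeqID <- ave(seq_along(PROJECT), PROJECT, FUN = seq_along)
SeqID
```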

2.B `row_number()`

Called with no arguments, `row_number()` works the same way as `seq_along()` here: rows are numbered within each project type, disregarding the gaps in sequencing.

myProjects %>% group_by(PROJECT) %>% mutate(SeqID = row_number())

## # A tibble: 10 x 3
## # Groups:   PROJECT [3]
##    PROJECT             PROJ_ID  SeqID
##    <chr>               <chr>    <int>
##  1 housing             LY01-098     1
##  2 housing             LY01-100     2
##  3 housing             LY01-102     3
##  4 school construction LY10-002     1
##  5 road works          LY03-128     1
##  6 road works          LY03-145     2
##  7 housing             LY05-082     4
##  8 road works          LY05-013     3
##  9 road works          LY06-028     4
## 10 school construction LY09-005     2

3. `cur_group_id()`

myProjects %>% group_by(PROJECT) %>% dplyr::mutate(ID = cur_group_id(), .after = PROJECT)

## # A tibble: 10 x 3
## # Groups:   PROJECT [3]
##    PROJECT                ID PROJ_ID 
##    <chr>               <int> <chr>   
##  1 housing                 1 LY01-098
##  2 housing                 1 LY01-100
##  3 housing                 1 LY01-102
##  4 school construction     3 LY10-002
##  5 road works              2 LY03-128
##  6 road works              2 LY03-145
##  7 housing                 1 LY05-082
##  8 road works              2 LY05-013
##  9 road works              2 LY06-028
## 10 school construction     3 LY09-005
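
The IDs in this output are consistent with groups being numbered in sorted order of `PROJECT` rather than in order of first appearance: housing is 1, road works is 2 and school construction is 3, although school construction appears before road works in the data. A base R sketch that reproduces the same column follows, assuming that sorted-key numbering is the rule at work.

```r
# Project types in row order, copied from the output above
PROJECT <- c("housing", "housing", "housing",
             "school construction",
             "road works", "road works",
             "housing",
             "road works", "road works",
             "school construction")

# factor() sorts the unique values to form its levels, so the
# integer codes number the project types in sorted order
ID <- as.integer(factor(PROJECT))
ID
```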

4. Data.table - RLE ID

This assigns a new ID to each run of consecutive identical values, so a project type that reappears after a gap starts a new run. N.B. This command does not use the dplyr::group_by() function.

myProjects %>% mutate(RunID = data.table::rleid(PROJECT))

##                PROJECT  PROJ_ID RunID
## 1              housing LY01-098     1
## 2              housing LY01-100     1
## 3              housing LY01-102     1
## 4  school construction LY10-002     2
## 5           road works LY03-128     3
## 6           road works LY03-145     3
## 7              housing LY05-082     4
## 8           road works LY05-013     5
## 9           road works LY06-028     5
## 10 school construction LY09-005     6
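
To see what a run ID encodes, the same column can be sketched in base R with `rle()`, which returns the length of each run of consecutive identical values. This is an illustrative equivalent, not a description of how `data.table::rleid()` is implemented.

```r
# Project types in row order, copied from the output above
PROJECT <- c("housing", "housing", "housing",
             "school construction",
             "road works", "road works",
             "housing",
             "road works", "road works",
             "school construction")

# Run-length encoding: one entry per run of identical neighbours
runs <- rle(PROJECT)

# Number the runs 1, 2, 3, ... and repeat each number
# once for every row in that run
RunID <- rep(seq_along(runs$lengths), times = runs$lengths)
RunID
```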

5. Sequence Indices within Runs

myProjects %>%
  mutate(RunID = data.table::rleid(PROJECT)) %>% 
  group_by(RunID) %>% 
  dplyr::mutate(SeqID = row_number())

## # A tibble: 10 x 4
## # Groups:   RunID [6]
##    PROJECT             PROJ_ID  RunID SeqID
##    <chr>               <chr>    <int> <int>
##  1 housing             LY01-098     1     1
##  2 housing             LY01-100     1     2
##  3 housing             LY01-102     1     3
##  4 school construction LY10-002     2     1
##  5 road works          LY03-128     3     1
##  6 road works          LY03-145     3     2
##  7 housing             LY05-082     4     1
##  8 road works          LY05-013     5     1
##  9 road works          LY06-028     5     2
## 10 school construction LY09-005     6     1
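
This pipeline first labels each run and then numbers the rows inside it, so the counter restarts whenever the project type changes. A base R sketch of the same idea, offered as an alternative rather than the method used above, combines `rle()` with `sequence()`.

```r
# Project types in row order, copied from the output above
PROJECT <- c("housing", "housing", "housing",
             "school construction",
             "road works", "road works",
             "housing",
             "road works", "road works",
             "school construction")

# rle() gives the length of each run; sequence() expands each
# length n into 1, 2, ..., n and concatenates the results
SeqID <- sequence(rle(PROJECT)$lengths)
SeqID
```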

2022-11-09
