Installing bupaR

install.packages("bupaR")
install.packages("edeaR")
install.packages("eventdataR")
install.packages("processmapR")
install.packages("processmonitR")
install.packages("xesreadR")
install.packages("petrinetR")

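Creating an event log

In short, each row in the data should be one event, carrying at least six distinct pieces of required information:

  1. Timestamp
  2. Case identifier (case id)
  3. Activity label (activity id)
  4. Activity instance identifier (activity instance id)
  5. Transactional lifecycle stage (lifecycle id)
  6. Resource identifier (resource id)

In addition, any number of custom event attributes can be added, such as cost.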
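
As a rough illustration of the six required fields, the base R sketch below checks a data frame for them before it is turned into an event log. The helper check_event_data and the toy column names are hypothetical; they are not part of bupaR.

```r
# Hypothetical helper (not part of bupaR): verify that a data frame carries
# the six columns an event log needs. The column names are assumptions.
required_cols <- c("case", "activity_id", "activity_instance_id",
                   "lifecycle_id", "timestamp", "resource")

check_event_data <- function(df, required = required_cols) {
  absent <- setdiff(required, names(df))
  if (length(absent) > 0) {
    stop("Missing required columns: ", paste(absent, collapse = ", "))
  }
  invisible(TRUE)
}

# Toy data: one row per event, plus an optional custom attribute (cost).
events <- data.frame(
  case                 = c("A", "A", "B"),
  activity_id          = c("register", "triage", "register"),
  activity_instance_id = 1:3,
  lifecycle_id         = "complete",
  timestamp            = as.POSIXct("2022-03-24 08:00:00", tz = "UTC") +
                           c(0, 3600, 7200),
  resource             = c("r1", "r2", "r1"),
  cost                 = c(10, 25, 10)
)

check_event_data(events)  # passes silently; stops with an error otherwise
```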

Event log objects

Typically, a data frame is converted into an event log object.

library(bupaR)
## 
## Attaching package: 'bupaR'
## The following object is masked from 'package:stats':
## 
##     filter
## The following object is masked from 'package:utils':
## 
##     timestamp
library(lubridate)
## 
## Attaching package: 'lubridate'
## The following objects are masked from 'package:base':
## 
##     date, intersect, setdiff, union
data <- data.frame(case = rep("A",5),
activity_id = c("A","B","C","D","E"),
activity_instance_id = 1:5,
lifecycle_id = rep("complete",5),
timestamp = now()+ddays(5),
resource = rep("resource 1", 5))

Meventlog <- eventlog(data,case_id = "case",
activity_id = "activity_id",
activity_instance_id = "activity_instance_id",
lifecycle_id = "lifecycle_id",
timestamp = "timestamp",
resource_id = "resource")


Meventlog
## Log of 5 events consisting of:
## 1 trace 
## 1 case 
## 5 instances of 5 activities 
## 1 resource 
## Events occurred from 2022-03-24 02:19:45 until 2022-03-24 02:19:45 
##  
## Variables were mapped as follows:
## Case identifier:     case 
## Activity identifier:     activity_id 
## Resource identifier:     resource 
## Activity instance identifier:    activity_instance_id 
## Timestamp:           timestamp 
## Lifecycle transition:        lifecycle_id 
## 
## # A tibble: 5 × 7
##   case  activity_id activity_instance… lifecycle_id timestamp           resource
##   <chr> <fct>       <chr>              <fct>        <dttm>              <fct>   
## 1 A     A           1                  complete     2022-03-24 02:19:45 resourc…
## 2 A     B           2                  complete     2022-03-24 02:19:45 resourc…
## 3 A     C           3                  complete     2022-03-24 02:19:45 resourc…
## 4 A     D           4                  complete     2022-03-24 02:19:45 resourc…
## 5 A     E           5                  complete     2022-03-24 02:19:45 resourc…
## # … with 1 more variable: .order <int>

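Missing lifecycle transitions

In many cases the data is not recorded at the low-level transactional granularity; instead, only one timestamp is recorded per activity instance. In that case, one event is equivalent to one activity instance.

The lifecycle transition can then be added manually, together with an activity instance ID that is unique for each row.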

example_log_2 %>%
    mutate(status = "complete",
           activity_instance = 1:nrow(.)) %>%
    eventlog(
        case_id = "patient",
        activity_id = "activity",
        activity_instance_id = "activity_instance",
        lifecycle_id = "status",
        timestamp = "timestamp",
        resource_id = "resource"
    )

Missing resources

When the data contains no resource information, the simplest fix is to include an empty resource variable.

example_log_3 %>%
    mutate(resource = NA) %>%
        eventlog(
        case_id = "patient",
        activity_id = "activity",
        activity_instance_id = "activity_instance",
        lifecycle_id = "status",
        timestamp = "timestamp",
        resource_id = "resource"
    )
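
The two snippets above rely on example_log_2 and example_log_3, which are not defined in these notes. The following sketch combines both fixes on a small made-up data frame; the data and the name raw_events are assumptions made purely for illustration.

```r
library(bupaR)   # eventlog(), n_cases() and the %>% pipe
library(dplyr)   # mutate()

# Made-up raw data: one timestamp per activity instance,
# no lifecycle column and no resource column.
raw_events <- data.frame(
  patient   = c("p1", "p1", "p2"),
  activity  = c("check-in", "surgery", "check-in"),
  timestamp = as.POSIXct("2022-03-24 08:00:00", tz = "UTC") + c(0, 3600, 7200)
)

patched_log <- raw_events %>%
  mutate(status = "complete",                   # constant lifecycle stage
         activity_instance = seq_len(nrow(.)),  # one activity instance per row
         resource = NA) %>%                     # empty resource placeholder
  eventlog(
    case_id = "patient",
    activity_id = "activity",
    activity_instance_id = "activity_instance",
    lifecycle_id = "status",
    timestamp = "timestamp",
    resource_id = "resource"
  )
```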

Data processing

Data assessment functions

Package to install: daqapo

library(daqapo)
## 
## Attaching package: 'daqapo'
## The following object is masked from 'package:eventdataR':
## 
##     hospital
## The following object is masked from 'package:utils':
## 
##     fix

Subsetting event data

Event filters

edeaR package

  1. filter_activity
  2. filter_activity_frequency
  3. filter_resource
  4. filter_resource_frequency
  5. filter_trim
  6. filter_time_period

Case filters

  1. filter_throughput_time
  2. filter_processing_time
  3. filter_trace_length
  4. filter_activity_presence
  5. filter_endpoints
  6. filter_precedence
  7. filter_trace_frequency
  8. filter_time_period
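
A few of these filters could be applied to the patients log from eventdataR as sketched below. The chosen activities and the 80% threshold are arbitrary illustration values.

```r
library(bupaR)
library(edeaR)
library(eventdataR)

# Event filter: keep only the events of the listed activities.
scans <- patients %>%
  filter_activity(c("X-Ray", "MRI SCAN"))

# Case filter: keep the most frequent traces that together cover 80% of the cases.
common_paths <- patients %>%
  filter_trace_frequency(percentage = 0.8)

# Case filter: keep cases in which both listed activities occur.
tested <- patients %>%
  filter_activity_presence(c("Blood test", "MRI SCAN"), method = "all")
```

Each call returns an event log again, so filters can be chained with the pipe.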

Aggregating event data

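bupaR supports several strategies for scaling down activities that are too fine-grained. One option is to remove the distinction between similar activity types by giving them a single unified name. The second option is to collapse activities that belong to a sub-process under one higher-level name. We call the first an "IS-A" aggregation and the second a "PART-OF" aggregation.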

Is-a aggregation

library(processmapR)
patients %>% 
    process_map()

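Suppose we want to scale down this process. We can say that MRI SCAN and X-Ray are both Scans: an MRI SCAN is a scan, and an X-Ray is a scan. We can therefore perform an is-a aggregation. For this we use the act_unite function, since we unite two or more activities. In the resulting map, the 236 MRI SCAN instances and the 261 X-Ray instances are replaced by 497 Scan instances.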

patients %>%
    act_unite(Scan = c("MRI SCAN","X-Ray")) %>%
    process_map()

Part-of aggregation

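Another approach is to combine activities that are not similar but belong together as part of a sub-process. Suppose we say that X-Ray, MRI SCAN and Blood test are all part of a sub-process "Testing". We can then merge the occurrences of these activities into a single activity. This is done with the act_collapse function.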

patients %>% 
    act_collapse(Testing = c("MRI SCAN","X-Ray","Blood test")) %>%
    process_map()

Recoding individual activities

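Sometimes it is useful to recode individual activity levels, for example when there are typos or when you want a more uniform format across labels. Individual recoding is done with act_recode.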

patients %>%
    act_recode("Check-in" = "Registration",
               "MRI Scan" = "MRI SCAN") %>%
    process_map()

Enriching event data

Enriching means generating new variables.

Appending metrics

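The metrics defined here can not only be computed on their own but can also be added directly to the event log as extra information. This is most useful at the case level, but the activity, resource and resource-activity levels are also supported where available.

A metric is appended to the event data by calling it at the appropriate level with the argument append = TRUE. Consider throughput time, for example.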

patients %>%
    throughput_time(level = "case",append = TRUE)
## Log of 5442 events consisting of:
## 7 traces 
## 500 cases 
## 2721 instances of 7 activities 
## 7 resources 
## Events occurred from 2017-01-02 11:41:53 until 2018-05-05 07:16:02 
##  
## Variables were mapped as follows:
## Case identifier:     patient 
## Activity identifier:     handling 
## Resource identifier:     employee 
## Activity instance identifier:    handling_id 
## Timestamp:           time 
## Lifecycle transition:        registration_type 
## 
## # A tibble: 5,442 × 8
##    handling     patient employee handling_id registration_type time               
##    <fct>        <chr>   <fct>    <chr>       <fct>             <dttm>             
##  1 Registration 1       r1       1           start             2017-01-02 11:41:53
##  2 Registration 2       r1       2           start             2017-01-02 11:41:53
##  3 Registration 3       r1       3           start             2017-01-04 01:34:05
##  4 Registration 4       r1       4           start             2017-01-04 01:34:04
##  5 Registration 5       r1       5           start             2017-01-04 16:07:47
##  6 Registration 6       r1       6           start             2017-01-04 16:07:47
##  7 Registration 7       r1       7           start             2017-01-05 04:56:11
##  8 Registration 8       r1       8           start             2017-01-05 04:56:11
##  9 Registration 9       r1       9           start             2017-01-06 05:58:54
## 10 Registration 10      r1       10          start             2017-01-06 05:58:54
## # … with 5,432 more rows, and 2 more variables: .order <int>,
## #   throughput_time_case <dbl>

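A new variable, throughput_time_case, has now been added to the event log as a case attribute. This new attribute can be used directly in later analyses.

For some metrics, several output values are candidates for appending. Consider, for example, the output of the trace coverage metric.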

patients %>% 
    trace_coverage(level = "case")
## # A tibble: 500 × 4
##    patient trace                                               absolute relative
##    <chr>   <chr>                                                  <int>    <dbl>
##  1 2       Registration,Triage and Assessment,X-Ray,Discuss R…      258    0.516
##  2 5       Registration,Triage and Assessment,X-Ray,Discuss R…      258    0.516
##  3 8       Registration,Triage and Assessment,X-Ray,Discuss R…      258    0.516
##  4 9       Registration,Triage and Assessment,X-Ray,Discuss R…      258    0.516
##  5 10      Registration,Triage and Assessment,X-Ray,Discuss R…      258    0.516
##  6 11      Registration,Triage and Assessment,X-Ray,Discuss R…      258    0.516
##  7 14      Registration,Triage and Assessment,X-Ray,Discuss R…      258    0.516
##  8 17      Registration,Triage and Assessment,X-Ray,Discuss R…      258    0.516
##  9 18      Registration,Triage and Assessment,X-Ray,Discuss R…      258    0.516
## 10 19      Registration,Triage and Assessment,X-Ray,Discuss R…      258    0.516
## # … with 490 more rows

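We get both the absolute and the relative number of cases covered by each trace. Only one of these variables is appended, and which one is selected automatically for each metric. The result below shows that the absolute value is appended.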

patients %>%
    trace_coverage(level = "case",append = TRUE)
## Log of 5442 events consisting of:
## 7 traces 
## 500 cases 
## 2721 instances of 7 activities 
## 7 resources 
## Events occurred from 2017-01-02 11:41:53 until 2018-05-05 07:16:02 
##  
## Variables were mapped as follows:
## Case identifier:     patient 
## Activity identifier:     handling 
## Resource identifier:     employee 
## Activity instance identifier:    handling_id 
## Timestamp:           time 
## Lifecycle transition:        registration_type 
## 
## # A tibble: 5,442 × 9
##    handling     patient employee handling_id registration_type time               
##    <fct>        <chr>   <fct>    <chr>       <fct>             <dttm>             
##  1 Registration 1       r1       1           start             2017-01-02 11:41:53
##  2 Registration 2       r1       2           start             2017-01-02 11:41:53
##  3 Registration 3       r1       3           start             2017-01-04 01:34:05
##  4 Registration 4       r1       4           start             2017-01-04 01:34:04
##  5 Registration 5       r1       5           start             2017-01-04 16:07:47
##  6 Registration 6       r1       6           start             2017-01-04 16:07:47
##  7 Registration 7       r1       7           start             2017-01-05 04:56:11
##  8 Registration 8       r1       8           start             2017-01-05 04:56:11
##  9 Registration 9       r1       9           start             2017-01-06 05:58:54
## 10 Registration 10      r1       10          start             2017-01-06 05:58:54
## # … with 5,432 more rows, and 3 more variables: .order <int>, trace <chr>,
## #   absolute_case_trace_coverage <int>

To change this default, set the append_column argument. For example, we can append the relative coverage instead.

patients %>%
    trace_coverage(level = "case",append = TRUE, append_column = "relative") 
## Log of 5442 events consisting of:
## 7 traces 
## 500 cases 
## 2721 instances of 7 activities 
## 7 resources 
## Events occurred from 2017-01-02 11:41:53 until 2018-05-05 07:16:02 
##  
## Variables were mapped as follows:
## Case identifier:     patient 
## Activity identifier:     handling 
## Resource identifier:     employee 
## Activity instance identifier:    handling_id 
## Timestamp:           time 
## Lifecycle transition:        registration_type 
## 
## # A tibble: 5,442 × 9
##    handling     patient employee handling_id registration_type time               
##    <fct>        <chr>   <fct>    <chr>       <fct>             <dttm>             
##  1 Registration 1       r1       1           start             2017-01-02 11:41:53
##  2 Registration 2       r1       2           start             2017-01-02 11:41:53
##  3 Registration 3       r1       3           start             2017-01-04 01:34:05
##  4 Registration 4       r1       4           start             2017-01-04 01:34:04
##  5 Registration 5       r1       5           start             2017-01-04 16:07:47
##  6 Registration 6       r1       6           start             2017-01-04 16:07:47
##  7 Registration 7       r1       7           start             2017-01-05 04:56:11
##  8 Registration 8       r1       8           start             2017-01-05 04:56:11
##  9 Registration 9       r1       9           start             2017-01-06 05:58:54
## 10 Registration 10      r1       10          start             2017-01-06 05:58:54
## # … with 5,432 more rows, and 3 more variables: .order <int>, trace <chr>,
## #   relative_case_trace_coverage <dbl>

Custom enrichment

Besides metrics, more customised enrichment is possible. Suppose we want to mark which patients had an MRI SCAN. With mutate, this can be done as follows.

patients %>%
    group_by_case %>%
    mutate(had_MRI = any(handling == "MRI SCAN")) %>%
    ungroup_eventlog()
## Log of 5442 events consisting of:
## 7 traces 
## 500 cases 
## 2721 instances of 7 activities 
## 7 resources 
## Events occurred from 2017-01-02 11:41:53 until 2018-05-05 07:16:02 
##  
## Variables were mapped as follows:
## Case identifier:     patient 
## Activity identifier:     handling 
## Resource identifier:     employee 
## Activity instance identifier:    handling_id 
## Timestamp:           time 
## Lifecycle transition:        registration_type 
## 
## # A tibble: 5,442 × 8
##    handling     patient employee handling_id registration_type time               
##    <fct>        <chr>   <fct>    <chr>       <fct>             <dttm>             
##  1 Registration 1       r1       1           start             2017-01-02 11:41:53
##  2 Registration 2       r1       2           start             2017-01-02 11:41:53
##  3 Registration 3       r1       3           start             2017-01-04 01:34:05
##  4 Registration 4       r1       4           start             2017-01-04 01:34:04
##  5 Registration 5       r1       5           start             2017-01-04 16:07:47
##  6 Registration 6       r1       6           start             2017-01-04 16:07:47
##  7 Registration 7       r1       7           start             2017-01-05 04:56:11
##  8 Registration 8       r1       8           start             2017-01-05 04:56:11
##  9 Registration 9       r1       9           start             2017-01-06 05:58:54
## 10 Registration 10      r1       10          start             2017-01-06 05:58:54
## # … with 5,432 more rows, and 2 more variables: .order <int>, had_MRI <lgl>

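Note that group_by_case is a convenience function that groups the data by case id. As a result, mutate looks for an MRI SCAN within each case separately. The ungroup_eventlog function removes the grouping so that later analyses are not affected by it.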

Refining enriched data

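With mutate, enriched variables can always be refined further. For example, after appending the relative trace coverage, we can create a variable that indicates whether a case follows a frequent or an infrequent path. The following code adds a variable frequent, which is TRUE when more than 20% of the cases share the same trace.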

patients %>%
    trace_coverage(level = "case",append = TRUE, append_column = "relative") %>%
    mutate(frequent = relative_case_trace_coverage > 0.2)
## Log of 5442 events consisting of:
## 7 traces 
## 500 cases 
## 2721 instances of 7 activities 
## 7 resources 
## Events occurred from 2017-01-02 11:41:53 until 2018-05-05 07:16:02 
##  
## Variables were mapped as follows:
## Case identifier:     patient 
## Activity identifier:     handling 
## Resource identifier:     employee 
## Activity instance identifier:    handling_id 
## Timestamp:           time 
## Lifecycle transition:        registration_type 
## 
## # A tibble: 5,442 × 10
##    handling     patient employee handling_id registration_type time               
##    <fct>        <chr>   <fct>    <chr>       <fct>             <dttm>             
##  1 Registration 1       r1       1           start             2017-01-02 11:41:53
##  2 Registration 2       r1       2           start             2017-01-02 11:41:53
##  3 Registration 3       r1       3           start             2017-01-04 01:34:05
##  4 Registration 4       r1       4           start             2017-01-04 01:34:04
##  5 Registration 5       r1       5           start             2017-01-04 16:07:47
##  6 Registration 6       r1       6           start             2017-01-04 16:07:47
##  7 Registration 7       r1       7           start             2017-01-05 04:56:11
##  8 Registration 8       r1       8           start             2017-01-05 04:56:11
##  9 Registration 9       r1       9           start             2017-01-06 05:58:54
## 10 Registration 10      r1       10          start             2017-01-06 05:58:54
## # … with 5,432 more rows, and 4 more variables: .order <int>, trace <chr>,
## #   relative_case_trace_coverage <dbl>, frequent <lgl>

The new attribute can then be used in further analyses. For example, does the processing time differ between frequent and infrequent traces?

patients %>%
    trace_coverage(level = "case",append = TRUE, append_column = "relative") %>%
    mutate(frequent = relative_case_trace_coverage > 0.2) %>%
    group_by(frequent) %>%
    processing_time() 
## # A tibble: 2 × 9
##   frequent   min    q1 median  mean    q3   max st_dev   iqr
##   <lgl>    <dbl> <dbl>  <dbl> <dbl> <dbl> <dbl>  <dbl> <dbl>
## 1 TRUE     0.723 1.04   1.16  1.16   1.28  1.59  0.169 0.239
## 2 FALSE    0.447 0.734  0.951 0.901  1.08  1.28  0.297 0.344

Wrangling event data

group_by

  1. group_by
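  2. group_by_case: group by case
  3. group_by_activity: group by activity type
  4. group_by_resource: group by resource
  5. group_by_activity_resource: group by activity-resource pair
  6. group_by_activity_instance: group by activity instance

When the grouping is no longer needed, it can be removed with the ungroup_eventlog function.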

mutate

  1. mutate

filter

  1. filter

select

  1. select

arrange

slice

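  1. slice: take a slice of cases
  2. slice_activities: take a slice of activity instances
  3. slice_events: take a slice of events

first_n, last_n

sample

In practice, this part works just like ordinary data manipulation.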
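
As a small sketch of two of these verbs on the patients log, assuming the event log methods behave as listed above (in particular, that slice works on cases):

```r
library(bupaR)
library(dplyr)
library(eventdataR)

# filter: keep only the events handled by employee r1.
r1_events <- patients %>%
  filter(employee == "r1")

# slice: take a slice of cases, here positions 1 to 10.
first_cases <- patients %>%
  slice(1:10)
```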

Adjusting view

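The view on an event log is defined by the variables that are mapped to specific characteristics:

  1. Case identifier (case_id)
  2. Activity information: activity type (activity_id) and activity instance (activity_instance_id)
  3. Transactional status (lifecycle_id)
  4. Timestamp (timestamp)
  5. Resource (resource_id)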

Using eventlog function

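The eventlog function is used not only to instantiate an event log object but also to modify one, by passing an existing event log as input and setting only the identifiers you want to change.

Changing the case id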

traffic_fines %>%
    eventlog(case_id = "vehicleclass")
## Log of 34724 events consisting of:
## 4 traces 
## 4 cases 
## 34724 instances of 11 activities 
## 16 resources 
## Events occurred from 2006-06-17 until 2012-03-26 
##  
## Variables were mapped as follows:
## Case identifier:     vehicleclass 
## Activity identifier:     activity 
## Resource identifier:     resource 
## Activity instance identifier:    activity_instance_id 
## Timestamp:           timestamp 
## Lifecycle transition:        lifecycle 
## 
## # A tibble: 34,724 × 18
##    case_id activity        lifecycle resource timestamp           amount article
##    <chr>   <fct>           <fct>     <fct>    <dttm>               <dbl>   <int>
##  1 A1      Create Fine     complete  561      2006-07-24 00:00:00    350     157
##  2 A1      Send Fine       complete  <NA>     2006-12-05 00:00:00     NA      NA
##  3 A100    Create Fine     complete  561      2006-08-02 00:00:00    350     157
##  4 A100    Send Fine       complete  <NA>     2006-12-12 00:00:00     NA      NA
##  5 A100    Insert Fine No… complete  <NA>     2007-01-15 00:00:00     NA      NA
##  6 A100    Add penalty     complete  <NA>     2007-03-16 00:00:00    715      NA
##  7 A100    Send for Credi… complete  <NA>     2009-03-30 00:00:00     NA      NA
##  8 A10000  Create Fine     complete  561      2007-03-09 00:00:00    360     157
##  9 A10000  Send Fine       complete  <NA>     2007-07-17 00:00:00     NA      NA
## 10 A10000  Insert Fine No… complete  <NA>     2007-08-02 00:00:00     NA      NA
## # … with 34,714 more rows, and 11 more variables: dismissal <chr>,
## #   expense <dbl>, lastsent <chr>, matricola <chr>, notificationtype <chr>,
## #   paymentamount <dbl>, points <int>, totalpaymentamount <chr>,
## #   vehicleclass <chr>, activity_instance_id <chr>, .order <int>

Using set functions

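If we only want to change one of the elements, as in the example above, the set functions offer a very convenient way to do so. The same change as before can be made as follows.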

traffic_fines %>%
    set_case_id("vehicleclass")
## Log of 34724 events consisting of:
## 4 traces 
## 4 cases 
## 34724 instances of 11 activities 
## 16 resources 
## Events occurred from 2006-06-17 until 2012-03-26 
##  
## Variables were mapped as follows:
## Case identifier:     vehicleclass 
## Activity identifier:     activity 
## Resource identifier:     resource 
## Activity instance identifier:    activity_instance_id 
## Timestamp:           timestamp 
## Lifecycle transition:        lifecycle 
## 
## # A tibble: 34,724 × 18
##    case_id activity        lifecycle resource timestamp           amount article
##    <chr>   <fct>           <fct>     <fct>    <dttm>               <dbl>   <int>
##  1 A1      Create Fine     complete  561      2006-07-24 00:00:00    350     157
##  2 A1      Send Fine       complete  <NA>     2006-12-05 00:00:00     NA      NA
##  3 A100    Create Fine     complete  561      2006-08-02 00:00:00    350     157
##  4 A100    Send Fine       complete  <NA>     2006-12-12 00:00:00     NA      NA
##  5 A100    Insert Fine No… complete  <NA>     2007-01-15 00:00:00     NA      NA
##  6 A100    Add penalty     complete  <NA>     2007-03-16 00:00:00    715      NA
##  7 A100    Send for Credi… complete  <NA>     2009-03-30 00:00:00     NA      NA
##  8 A10000  Create Fine     complete  561      2007-03-09 00:00:00    360     157
##  9 A10000  Send Fine       complete  <NA>     2007-07-17 00:00:00     NA      NA
## 10 A10000  Insert Fine No… complete  <NA>     2007-08-02 00:00:00     NA      NA
## # … with 34,714 more rows, and 11 more variables: dismissal <chr>,
## #   expense <dbl>, lastsent <chr>, matricola <chr>, notificationtype <chr>,
## #   paymentamount <dbl>, points <int>, totalpaymentamount <chr>,
## #   vehicleclass <chr>, activity_instance_id <chr>, .order <int>

Using existing mapping

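It is also possible to take a snapshot of the event log mapping at some point in time and reuse it later. The mapping can be extracted with the mapping function.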

mapping_fines <- mapping(traffic_fines)
mapping_fines
## Case identifier:     case_id 
## Activity identifier:     activity 
## Resource identifier:     resource 
## Activity instance identifier:    activity_instance_id 
## Timestamp:           timestamp 
## Lifecycle transition:        lifecycle

We can then adjust the mapping step by step using the methods above.

traffic_fines %>%
    set_case_id("vehicleclass") %>%
    set_activity_id("notificationtype") -> traffic_fines

Later, we can undo these changes and reset the original mapping with the re_map function.

traffic_fines %>%
    re_map(mapping_fines)
## Log of 34724 events consisting of:
## 44 traces 
## 10000 cases 
## 34724 instances of 11 activities 
## 16 resources 
## Events occurred from 2006-06-17 until 2012-03-26 
##  
## Variables were mapped as follows:
## Case identifier:     case_id 
## Activity identifier:     activity 
## Resource identifier:     resource 
## Activity instance identifier:    activity_instance_id 
## Timestamp:           timestamp 
## Lifecycle transition:        lifecycle 
## 
## # A tibble: 34,724 × 18
##    case_id activity        lifecycle resource timestamp           amount article
##    <chr>   <fct>           <fct>     <fct>    <dttm>               <dbl>   <int>
##  1 A1      Create Fine     complete  561      2006-07-24 00:00:00    350     157
##  2 A1      Send Fine       complete  <NA>     2006-12-05 00:00:00     NA      NA
##  3 A100    Create Fine     complete  561      2006-08-02 00:00:00    350     157
##  4 A100    Send Fine       complete  <NA>     2006-12-12 00:00:00     NA      NA
##  5 A100    Insert Fine No… complete  <NA>     2007-01-15 00:00:00     NA      NA
##  6 A100    Add penalty     complete  <NA>     2007-03-16 00:00:00    715      NA
##  7 A100    Send for Credi… complete  <NA>     2009-03-30 00:00:00     NA      NA
##  8 A10000  Create Fine     complete  561      2007-03-09 00:00:00    360     157
##  9 A10000  Send Fine       complete  <NA>     2007-07-17 00:00:00     NA      NA
## 10 A10000  Insert Fine No… complete  <NA>     2007-08-02 00:00:00     NA      NA
## # … with 34,714 more rows, and 11 more variables: dismissal <chr>,
## #   expense <dbl>, lastsent <chr>, matricola <chr>, notificationtype <fct>,
## #   paymentamount <dbl>, points <int>, totalpaymentamount <chr>,
## #   vehicleclass <chr>, activity_instance_id <chr>, .order <int>

EDA

Describing event data

Time perspective

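Three different time measures can be computed:

  1. Throughput time: the time between the first and the last event of a case
  2. Processing time: the sum of the durations of all activity instances
  3. Idle time: the time during which no activity instance is active

The duration of an activity instance is the time between the first and the last event related to that activity instance. If several activity instances within a case overlap, the processing time for the overlap is counted twice. The figure below gives a schematic overview of the different time measures.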
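
To make the three definitions concrete, here is a small worked example in base R for a single made-up case with two non-overlapping activity instances. It only illustrates the arithmetic; it is not how edeaR computes these metrics internally.

```r
# One case with two activity instances, each with a start and a complete event.
case_events <- data.frame(
  activity_instance = c(1, 1, 2, 2),
  lifecycle = c("start", "complete", "start", "complete"),
  timestamp = as.POSIXct(c("2022-03-24 08:00:00", "2022-03-24 09:00:00",
                           "2022-03-24 10:00:00", "2022-03-24 10:30:00"),
                         tz = "UTC")
)

# Throughput time: first event to last event of the case, in hours.
throughput <- as.numeric(difftime(max(case_events$timestamp),
                                  min(case_events$timestamp),
                                  units = "hours"))

# Duration of each activity instance: its first event to its last event, in hours.
durations <- tapply(as.numeric(case_events$timestamp),
                    case_events$activity_instance,
                    function(t) (max(t) - min(t)) / 3600)

# Processing time: sum of the instance durations.
processing <- sum(durations)

# Idle time: the simple subtraction is valid here only because
# the two instances do not overlap.
idle <- throughput - processing

c(throughput = throughput, processing = processing, idle = idle)
# throughput = 2.5, processing = 1.5, idle = 1.0 (hours)
```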

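  1. Idle time: idle_time. Idle time is the time during which a case or resource has no activity. It can only be computed when both start and end timestamps are available for activity instances. It can be computed at the trace, resource, case and log levels, using different time units.

  2. Processing time: processing_time. Processing time can be computed at the log, trace, case, activity and resource-activity levels. It can only be computed when both start and end timestamps are available for activity instances.

  3. Throughput time: throughput_time. Throughput time is the time from the first to the last event of a case. It can be computed at the log, trace or case level.

Organizational Perspective

  1. Resource frequency: resource_frequency. The resource frequency metric computes the number/frequency of resources at the log, case, activity, resource and resource-activity levels.

  2. Resource involvement: resource_involvement. Resource involvement is the number of cases in which a resource is involved. It can be computed at the case, resource and resource-activity levels.

  3. Resource specialisation: resource_specialisation. The resource specialisation metric shows whether resources are specialised in certain activities. It can be computed at the log, case, resource and activity levels.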
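
The three organizational metrics could be called on the patients log roughly as follows. The resource level is chosen here only as an example; the other levels listed above are passed the same way.

```r
library(bupaR)
library(edeaR)
library(eventdataR)

# How often does each resource (employee) appear in the log?
res_freq <- patients %>%
  resource_frequency(level = "resource")

# In how many cases is each resource involved?
res_inv <- patients %>%
  resource_involvement(level = "resource")

# How many different activities does each resource perform?
res_spec <- patients %>%
  resource_specialisation(level = "resource")
```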

Structuredness

edeaR package

  1. activity_presence
  2. activity_frequency
  3. start_activities
  4. end_activities
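  5. trace_coverage: the trace coverage metric shows the relationship between the number of distinct activity sequences (i.e. traces) and the number of cases they cover.
  6. trace_length: the trace length metric describes the length of traces, i.e. the number of activity instances per case. It can be computed at the case, trace and log levels.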
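
A brief sketch of how these metrics might be called on the patients log; the levels are picked for illustration only.

```r
library(bupaR)
library(edeaR)
library(eventdataR)

# In what share of cases does each activity occur?
presence <- patients %>%
  activity_presence()

# How often does each activity occur?
act_freq <- patients %>%
  activity_frequency(level = "activity")

# Which activities start and end the cases?
starts <- patients %>%
  start_activities(level = "activity")
ends <- patients %>%
  end_activities(level = "activity")

# How many cases does each trace cover, and how long are the traces overall?
coverage <- patients %>%
  trace_coverage(level = "trace")
lengths_log <- patients %>%
  trace_length(level = "log")
```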

Creating Process Maps

The process_map function makes it easy to create a process map of an event log.

Some customisation is also possible; see https://bupar.net/processmaps.html
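
As an example of such customisation, the type argument of process_map can switch between frequency and performance profiles. The sketch below assumes the frequency() and performance() helpers from processmapR; the choice of median and days is arbitrary.

```r
library(bupaR)
library(processmapR)
library(eventdataR)

# Relative instead of absolute frequencies on nodes and edges.
rel_map <- patients %>%
  process_map(type = frequency("relative"))

# Performance map: median duration, expressed in days.
perf_map <- patients %>%
  process_map(type = performance(median, "days"))
```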

Animate Process Maps

library(processanimateR)
library(eventdataR)
animate_process(patients)

Precedence Matrices

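A precedence matrix is a two-dimensional matrix showing the flows between activities. Depending on the type argument, it can contain different kinds of values:

  1. Absolute frequency of flows
  2. Relative frequency of flows
  3. Relative frequency of flows per antecedent, i.e. given antecedent A, it is followed x% of the time by consequent B
  4. Relative frequency of flows per consequent, i.e. given consequent B, it is preceded x% of the time by antecedent A

A precedence matrix can be visualised with the generic plot function.

precedence_matrix

type: absolute, relative, relative-antecedent, relative-consequent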
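
The four types can be illustrated with a hand-computed example in base R on a made-up two-case log. This is only a sketch of the idea: it counts direct successions between activities and does not reproduce the exact output format of precedence_matrix.

```r
# Toy log: case c1 follows A -> B -> C, case c2 follows A -> C -> B.
events <- data.frame(
  case     = c("c1", "c1", "c1", "c2", "c2", "c2"),
  activity = c("A", "B", "C", "A", "C", "B"),
  position = c(1, 2, 3, 1, 2, 3)
)
events <- events[order(events$case, events$position), ]

# Pair every activity with its direct successor within the same case.
pairs <- do.call(rbind, lapply(split(events$activity, events$case), function(a) {
  data.frame(antecedent = head(a, -1), consequent = tail(a, -1))
}))

absolute            <- table(pairs$antecedent, pairs$consequent)  # flow counts
relative            <- prop.table(absolute)                       # share of all flows
relative_antecedent <- prop.table(absolute, margin = 1)           # rows sum to 1
relative_consequent <- prop.table(absolute, margin = 2)           # columns sum to 1

relative_antecedent["A", "B"]  # 0.5: A is followed by B in half of its flows
```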

Dotted charts

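Dotted charts can be made with the dotted_chart function. A dotted chart is a graph in which each activity instance is shown as a dot. The x-axis refers to the time aspect, while the y-axis refers to cases. The dotted chart function has 3 arguments:

  1. x: absolute (absolute time on the x-axis) or relative (time difference since the start of the case on the x-axis)
  2. y: the ordering of the cases along the y-axis: by start, end or duration
  3. color: the attribute used to colour the activity instances. Defaults to the activity type.

Two functions are available:

  1. dotted_chart: plotting function
  2. idotted_chart: Shiny app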
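
A minimal call might look like the sketch below. Only the x argument is set; the argument that controls the case ordering may be named differently depending on the installed processmapR version, so it is left at its default here.

```r
library(bupaR)
library(processmapR)
library(eventdataR)

# Relative dotted chart: time since the start of each case on the x-axis.
relative_chart <- patients %>%
  dotted_chart(x = "relative")

relative_chart
```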

Trace explorer

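The trace explorer can be used to explore frequent as well as infrequent traces. The coverage argument specifies how much of the log you want to explore. By default it is set to 0.2, which means it shows the most (in)frequent traces covering 20% of the event log.

  1. trace_explorer()

Social network analysis

A handover-of-work network can be created with the resource_map function. It takes the same arguments as the process_map function.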
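
Both functions could be tried on the patients log as sketched here; the coverage value of 0.5 is an arbitrary illustration.

```r
library(bupaR)
library(processmapR)
library(eventdataR)

# Show the most frequent traces that together cover 50% of the log.
traces_plot <- patients %>%
  trace_explorer(coverage = 0.5)

# Handover-of-work network between resources.
handover <- patients %>%
  resource_map()
```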

Dashboards

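The processmonitR package provides several predefined dashboards for monitoring processes based on an event log. They can be started with the following functions:

  1. performance_dashboard: throughput time, processing time and idle time
  2. activity_dashboard: activity frequency and presence
  3. rework_dashboard: rework (self-loops, repetitions)
  4. resource_dashboard: resource frequency, involvement and specialisation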